Prevent Google from indexing PDF files


(Hosein Naseri) #1

Is there any way to prevent search engines like Google from indexing PDF files? I know that we can do something like

<Files ~ "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

in .htaccess for normal websites. But how is this possible for Discourse?


(Jay Pfaffman) #2

It would take a plugin.


(Evgeny) #3

Customize the nginx configuration so that it serves your own robots.txt from a location you choose.

Example:

server {
	listen 80;
	server_name sites.com;
	location = /robots.txt {
		root /var/discourse;
		access_log off;
		expires max;
		break;
	}
	location / {
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
		proxy_set_header Host $http_host;
		proxy_redirect off;
		proxy_pass http://sites_com;
	}
}

Then, in that robots.txt file, add the following line under a User-agent group (for example User-agent: *):

Disallow: /*.pdf
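
For comparison, the Apache X-Robots-Tag rule from the first post could probably be mirrored in nginx as well. This is only a rough sketch, not a tested Discourse setup: it assumes the PDFs are served through the same upstream (sites_com) as in the server block above, and that the location goes inside that same server block. The proxy headers are repeated because a new location does not inherit them from location /.

```nginx
# Hypothetical extra location for the server { } block shown above.
# Assumption: PDF requests go through the same upstream (sites_com).
location ~* \.pdf$ {
	# Ask crawlers not to index PDF responses or follow links in them
	add_header X-Robots-Tag "noindex, nofollow";

	# Same proxy settings as location /, repeated here on purpose
	proxy_set_header X-Real-IP $remote_addr;
	proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
	proxy_set_header Host $http_host;
	proxy_redirect off;
	proxy_pass http://sites_com;
}
```

Keep in mind that a crawler only sees this header if it is allowed to fetch the PDF, so it would normally be used instead of the Disallow rule rather than together with it.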