I received a request to add a robots.txt file in order to ensure that an individual (who was mentioned in a court case) is not picked up by search engines. Even though this is a public record, he has nothing to do with the reasons for the case, and I do not want to get involved in the extraterritorial application of EU law, so I have agreed to exclude the page from search engines. It will still be on the court’s web site.
My question concerns the syntax necessary to exclude a page.
The URL facing the public is along the lines of:
http://mywebsite.com/documents/courtcase.pdf
But on the server, the structure is:
/html/mywebsite/documents/courtcase.pdf [note absence of “.com”]
In a robots.txt file, the path in each rule is supposed to begin with a “/”, so which of these is correct?
(a) /mywebsite.com/documents/courtcase.pdf or
(b) /html/mywebsite/documents/courtcase.pdf
The server is a MediaTemple gridserver, in case that makes a difference.
A similar question was asked on the forum before, but the links in it are all dead.
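For anyone who wants to test candidate rules like (a) and (b) locally, here is a rough sketch using Python's standard urllib.robotparser. It relies on the fact that crawlers compare robots.txt rules against the path portion of the public URL, not against the filesystem layout on the server. The third candidate ("url_path") and the helper name is_blocked are my own additions for illustration and are not part of the question above.

```python
from urllib.robotparser import RobotFileParser

# Public URL from the question (example domain as given above).
PUBLIC_URL = "http://mywebsite.com/documents/courtcase.pdf"

# Candidate Disallow paths. "a" and "b" come from the question;
# "url_path" is an extra candidate added for comparison (assumption).
CANDIDATES = {
    "a": "/mywebsite.com/documents/courtcase.pdf",
    "b": "/html/mywebsite/documents/courtcase.pdf",
    "url_path": "/documents/courtcase.pdf",
}


def is_blocked(disallow_path, url=PUBLIC_URL):
    """Return True if a robots.txt with this single Disallow rule blocks url."""
    parser = RobotFileParser()
    # Feed the parser an in-memory robots.txt instead of fetching one.
    parser.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return not parser.can_fetch("*", url)


results = {name: is_blocked(path) for name, path in CANDIDATES.items()}

for name, blocked in results.items():
    print(name, "blocks the page" if blocked else "does not block the page")
```

With this parser, only the rule that matches the URL path after the domain blocks the page; neither (a) nor (b) does. The robots.txt file itself would need to be reachable at http://mywebsite.com/robots.txt for crawlers to find it.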