@Turtle, I intentionally put this out there as a general tip for users rather than a specific answer to the podcast question, mostly because finding the best possible solution for @bpequine involves a further conversation about her needs: her specific question, her current skill set, the stacks/plugins she already owns, and her budget.
However, since you brought it up…
-
Page Safe comes with 3 stacks:
- Page Safe, protects the entire page
- Stack Safe, controls whether a stack is displayed based on whether a Page Safe stack has been unlocked
- Logout, allows any button to be used as a logout button for Page Safe
- With both of the “Safe” stacks, content is only ever pulled from the server if/when a password has been successfully entered. That means crawlers would be unable to access the information regardless of whether they adhere to the “nofollow”/robots exclusion standards.
-
Browsers do cache information; however, I think it highly unlikely that this information makes its way onto the web, since the cache is intended for local use on a per-device basis. We would have to look up the terms and conditions of the browser to know for certain. In any case, you can prevent browser caching.
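If your host runs Apache (most shared hosting does), a minimal sketch of what that looks like in an .htaccess file for the protected page's directory. Treat this as an illustration, not Page Safe's own mechanism:

```apache
# .htaccess: ask browsers not to cache anything served from this directory
<IfModule mod_headers.c>
  Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
  Header set Pragma "no-cache"
  Header set Expires "0"
</IfModule>
```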
-
You are right that you may not want your entire site blocked from indexing. SEO Helper can be used on a page-by-page basis, or in a partial to be uniform across pages, so the choice is yours.
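Under the hood, per-page de-indexing normally comes down to a robots meta tag in the page's head. I'm assuming SEO Helper emits the standard tag, which looks like this:

```html
<!-- Ask compliant crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">
```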
-
Not all robots may abide by the 4.0.1+ standards, but I’m certain that Google and the other major search engines do. From the sound of it, her primary concern isn’t hackers looking to steal information, but rather Google simply doing what it does best: indexing.
-
Blocking robots from seeing the resources folder should definitely be an option from within RW. I’m actually surprised this hasn’t come up before. I know that @joeworkman’s Total CMS protects the cms-content folder from robots, so if you store the file using Total CMS you should be good to go without any technical setup. @nikf, is this something that would be easy enough to add in a small update?
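In the meantime, anyone comfortable editing files by hand can do this with a two-line robots.txt at the site root. A minimal sketch, assuming the default /resources/ path (adjust it to match your site):

```
# robots.txt: ask compliant crawlers to skip the resources folder
User-agent: *
Disallow: /resources/
```

Keep in mind robots.txt is advisory only; Google and other well-behaved crawlers honour it, but it isn't access control.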
-
Good point about the search engine having already indexed it. I know that Google does have resources available for you to petition to have content removed, but they don’t make it all that easy.
-
The password protection of the file is a clever solution.
-
RapidBot 2 is a great product and was my first intro to robots.txt files.
-
I think the most you could do to protect your resources is to store them in a secure location off the server, such as Amazon S3. Then you could use a file delivery system such as Rapid Cart Pro or Cartloom to deliver a download link to users who have access to a page protected by Page Safe. This would send users an email with a unique link to download the file, preventing anyone from knowing the true location of the file. It would also alert you any time someone downloads that file.
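For the curious, the unique-link trick those delivery systems rely on generally boils down to a time-limited pre-signed URL. A minimal sketch in Python with boto3 (the bucket and key names here are placeholders, and Rapid Cart Pro/Cartloom handle all of this for you):

```python
import boto3

# S3 client; credentials come from your AWS config or environment
s3 = boto3.client("s3")

# Generate a pre-signed URL for a private object. It expires after an hour,
# so the file's true location is never exposed and old links go stale.
url = s3.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "my-private-bucket", "Key": "downloads/members-list.pdf"},
    ExpiresIn=3600,
)
print(url)  # email this link to the user who unlocked the page
```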
-
Yup, you can’t stop someone from republishing the list if they have it, but that isn’t the scenario she is experiencing. She is concerned about search engines, i.e. Google.
-
I agree that a PDF may not actually be the best solution, even if I can see some reasons why it would be a better medium. In most cases, though, I think something like @joeworkman’s [Power Grid Stacks + Page Safe/Stack Safe + an optional use of Easy CMS/Total CMS for online editing of the table] would be a better solution.
-
Great thoughts and I’m sure that some or all of this will be useful to someone. Hopefully @bpequine!
Cheers!!
Brandon