Who can tell me how to set up RapidBot so that it allows access only to the main index page (HOME page) and disallows access to everything else on the site? The site in question has a lot of folders and pages, so it would be impractical to disallow every single one of them. Surely there should be a way to lump them all together somehow?
If RapidBot can't do it, how would I write a robots.txt file that does the same thing?
Don’t know a thing about RapidBot.
Might try this, assuming index.php is your home page.
Haven’t tried it, so make sure you test:
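A minimal rule set along those lines, using the Allow/Disallow lines discussed in this thread (the Allow comes before the blanket Disallow):

```
User-agent: *
Allow: /index.php
Disallow: /
```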
Wouldn’t the Disallow: / line override the Allow: /index.php line?
No, it shouldn’t, the way I understand it. Robots.txt files are processed top down; if the crawler hits an explicit Allow first, the URL should pass.
I have used similar robots.txt files with subdirectories (never tried it in the root directory) like this, and it works:
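A sketch of what such a subdirectory rule set can look like (the /private/ folder and the allowed page are hypothetical, just to show the pattern of one Allow above a folder-wide Disallow):

```
User-agent: *
Allow: /private/public-page.html
Disallow: /private/
```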
As I said, you should test it:
- update your robots.txt file
- go to the URL above (log into Google if needed)
- select “test robots.txt file”
- you should see your robots.txt file
- if not, or if you have updated it, select the Submit button (lower right)
- then the Submit button on the pop-up
- refresh the page until the changes appear
- enter a test URL at the bottom.
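If you want to sanity-check the rules locally before using Google’s tester, Python’s standard-library robotparser evaluates rules top down, first match wins, which matches the behavior described above. (Google itself uses most-specific-match precedence, but for these rules the result is the same, since Allow: /index.php is more specific than Disallow: /.) The domain here is a placeholder:

```python
from urllib import robotparser

# The rule set under discussion: allow only the home page,
# block everything else.
rules = """\
User-agent: *
Allow: /index.php
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# The Allow line matches first, so the home page passes...
print(rp.can_fetch("Googlebot", "http://example.com/index.php"))   # True
# ...while any other path falls through to Disallow: /
print(rp.can_fetch("Googlebot", "http://example.com/about/team.html"))  # False
```

The same check works for any user agent, since the rules use the wildcard User-agent: *.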
Thanks again, Doug. This is very helpful.