robots.txt: which folders should I disallow for SEO?

I am currently writing my robots.txt file and have some trouble deciding whether I should allow or disallow some folders for SEO purposes.

Here are the folders I have:

  • /css/ (CSS)
  • /js/ (JavaScript)
  • /img/ (images I use for the website)
  • /php/ (PHP scripts that return a blank page, for example checkemail.php, which checks an email address, or register.php, which writes data to a SQL database and sends an email)
  • /error/ (my HTML error pages for 401, 403, 404, 406 and 500)
  • /include/ (the header.html and footer.html files I include)

I was thinking of disallowing only the PHP pages and allowing the rest.

What do you think?

Thanks a lot

Laurent

Comments


  • Faith

    /css and /js -- CSS and JavaScript files will probably be crawled by Googlebot whether or not you list them in robots.txt. Google uses them to render your pages for site previews, and it has asked that you not block them in robots.txt.

    /img -- Googlebot may crawl this even when it is in robots.txt, the same way as CSS and JavaScript. Disallowing your images in robots.txt generally prevents them from being indexed in Google image search. Image search may be a source of visitors to your site, so you may wish to be indexed there.

    /php -- It sounds like you don't want spiders hitting the URLs that perform actions. Good call to disallow these in robots.txt.

    /error -- If your site is set up correctly, spiders will probably never learn which directory your error pages are served from. Error pages are generally served at the URL that produced the error, so the spider never sees their actual URL. This isn't the case if you redirect to them, which isn't recommended practice anyway. As such, I would say there is no need to put them in robots.txt.
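
    To make the advice above concrete, here is a minimal sketch of what the resulting robots.txt could look like, checked with Python's standard urllib.robotparser module. The file content, the example.com domain, the file names under /css/, /js/ and /img/, and the helper name build_parser are all assumptions for illustration; only /php/ is disallowed and everything else is left crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt following the advice above:
# block only the action scripts under /php/, leave the rest crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /php/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser (illustrative helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# example.com and most file names below are placeholders, not taken from the question.
SAMPLE_PATHS = (
    "/css/style.css",
    "/js/app.js",
    "/img/logo.png",
    "/php/register.php",
)

for path in SAMPLE_PATHS:
    allowed = parser.can_fetch("Googlebot", "http://example.com" + path)
    print(path, "allowed" if allowed else "disallowed")
```

    Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not protect register.php or checkemail.php from being called directly.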
