Blogging, Robots.txt and the Supplemental Index
By jason on May 18, 2007 in SEO
A minor detail that most bloggers overlook, myself included, is setting up a proper robots.txt file for the blog.
Here’s the situation: suddenly a lot of your blog posts and indexed pages show up in the supplemental index for no apparent reason. It’s all original content, so why is it in the supplemental index?
It turns out Google also indexes your blog’s feeds, and those feed copies cause the real pages to go supplemental. Do a site: search on your domain and narrow it with the keyword rss or feed, and you should see what I mean pretty quickly, just like I did.
The easy solution is to take advantage of Googlebot’s ability to handle wildcard characters by adding the following to your robots.txt file:
User-agent: Googlebot
Disallow: /*/feed/$
Disallow: /*/feed/rss/$
Disallow: /*/trackback/$
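Before relying on these rules, it can help to check which URLs they would actually catch. The Python sketch below is only an approximation of Googlebot’s wildcard matching, assuming * matches any run of characters and a trailing $ anchors the end of the path. The function names and the sample paths are made up for illustration.

```python
import re

# The three Disallow rules from the robots.txt snippet above.
DISALLOW_RULES = ["/*/feed/$", "/*/feed/rss/$", "/*/trackback/$"]


def rule_to_regex(rule):
    """Translate one robots.txt path rule into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' means the URL path must end there; everything else is literal.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' for each wildcard.
    pattern = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    """Return True if any rule matches the given URL path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    # Hypothetical paths, not taken from any real site.
    for path in ["/2007/05/my-post/feed/", "/2007/05/my-post/", "/feed/"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

Under these assumptions the per-post feed and trackback URLs are blocked and the posts themselves stay crawlable, but a top-level /feed/ is not matched, because each rule expects at least one path segment before /feed/.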
This got me thinking: what else am I missing from my robots.txt file, and what are the bigger, more successful blogs using? Here’s the trick: go to any blog you follow or like, add /robots.txt to the end of its domain in your browser’s address bar, and the file will appear on your screen for inspection. Once you understand how the file works, you’ll see how other sites are using their robots.txt files and can hopefully incorporate some of the tricks of the trade into your own site.
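If you want to look at several blogs in one go, the same trick can be scripted. This is a rough sketch using only Python’s standard library; the helper names are my own, example.com is a placeholder, and it assumes the site serves its robots.txt at the root of the domain.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(site):
    """Build the robots.txt address for a blog URL or bare domain."""
    # Assumption: default to http:// when no scheme is given.
    if "://" not in site:
        site = "http://" + site
    parts = urlsplit(site)
    # Keep only the scheme and host; robots.txt lives at the site root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(site, timeout=10):
    """Download a site's robots.txt and return it as text."""
    with urlopen(robots_url(site), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Placeholder domain; swap in a blog you follow.
    print(fetch_robots("example.com"))
```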
Technorati Tags: blogger, google, seo, southwest seo











