A few months ago, I got to report that Google’s sitemap protocol had been adopted by Yahoo and MSN. Well, it looks like the sitemap protocol has taken another giant step forward with the announcements of robots.txt autodiscovery and support on Ask.com.
Why is this so important? One word: visibility. Search engines have been plagued with visibility issues throughout their existence. Webmasters would build web pages and wonder why the search engines didn’t include those pages in their indices. At the same time, search engines would encounter hurdles to indexing based on how webmasters put their sites together. Thus was the practice of search engine optimization born.
The sitemap protocol allows a webmaster to provide a well-defined list of URLs in a format that search engines can easily read. It’s the communication link that’s been missing from web standards for ages. Webmasters can rest easy knowing that search engines can see and index (although not necessarily rank) every page on their site. Meanwhile, search engines can improve their indices with pages that they may not have known about previously. Everybody wins.
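For reference, a bare-bones sitemap under the Sitemaps 0.9 schema looks something like this (the URL and values below are placeholders; only the <loc> element is required for each entry):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- one <url> entry per page you want crawled -->
      <url>
        <loc>http://www.example.com/</loc>
        <lastmod>2007-01-01</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
      </url>
    </urlset>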
The new autodiscovery feature makes it even simpler to communicate with search engines. It uses a preexisting web standard, the robots.txt file, to tell any search engine spider that wanders by where your sitemap is. Once upon a time, you had to manually submit your sitemap to each search engine. And while it’s still a good idea to do so (some of them give you some great metrics if you do), it’s no longer necessary to ensure that your pages are crawled.
The syntax is simple enough. Just add an extra line to your robots.txt file that looks like this:
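    Sitemap: http://www.example.com/sitemap.xml

That URL is just a placeholder; swap in the full, absolute URL of your own sitemap. The line is independent of any user-agent block, so it can go anywhere in the file.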
It’s that simple. All things considered, I don’t imagine it will be long before any naysayers (if there even are any) are sold on the sitemap protocol. If you haven’t already implemented it, get it up now. Seriously, what are you waiting for? I already did.