Sitemap Protocol Continues to Grow
A few months ago, I got to report that Google’s sitemap protocol had been adopted by Yahoo and MSN. Well, it looks like the sitemap protocol has taken another giant step forward with two announcements: autodiscovery through robots.txt and support from Ask.com.
Why is this so important? One word: Visibility. Search engines have been plagued with visibility issues throughout their existence. Webmasters would make web pages and wonder why the search engines didn’t include those pages in their indices. At the same time, search engines would encounter hurdles to indexing based on how webmasters put their sites together. Thus was the practice of search engine optimization born.
The sitemap protocol allows a webmaster to provide a well-defined list of URLs in a format that search engines can easily read. It’s the communication link that’s been missing from web standards for ages. Webmasters can rest easy knowing that search engines can see and index (although not necessarily rank) every page on their site. Meanwhile, search engines can improve their indices with pages that they may not have known about previously. Everybody wins.
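To make the "well-defined list of URLs" concrete, here is a minimal sketch of building such a file in Python. The function name build_sitemap and the example URLs are my own placeholders, and the sketch writes only the required loc element for each page; treat it as an illustration of the format rather than a full generator.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemap files that follow the sitemaps.org schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs.

    build_sitemap is a hypothetical helper written for this example.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # <loc> holds the full address of one page; ElementTree escapes
        # characters such as & in the URL when serializing.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder pages on a placeholder domain.
    print(build_sitemap([
        "http://www.yourdomain.com/",
        "http://www.yourdomain.com/about",
    ]))
```

Saving that output as a file on your site gives search engines a single place to read every URL you want them to know about.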
The new autodiscovery feature makes it even simpler to communicate with search engines. It uses a preexisting web standard, the robots.txt file, to define where your sitemap is for any search engine spider that wanders by. Once upon a time, you had to manually submit your sitemap to each search engine. And while it’s still a good idea to do so (some of them give you some great metrics if you do), it’s no longer necessary for ensuring that your pages are crawled.
The syntax is simple enough. Just add an extra line to your robots.txt file that gives the full URL of your sitemap file (sitemap.xml here is only a placeholder name), like this:
Sitemap: http://www.yourdomain.com/sitemap.xml
It’s that simple. All things considered, I don’t imagine it will be long before any naysayers (if there even are any) are sold on the sitemap protocol. Get it up now if you haven’t already implemented it. Seriously, what are you waiting for? I already did.
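If you want to confirm that your own robots.txt carries the autodiscovery line, a rough sketch like the following pulls out every Sitemap entry. The function name find_sitemaps and the sample file contents are assumptions made for illustration, and the parsing is deliberately simple: it matches the field name case-insensitively and ignores everything else.

```python
def find_sitemaps(robots_txt):
    """Return the URLs named on Sitemap: lines in a robots.txt body.

    find_sitemaps is a hypothetical helper written for this example.
    """
    sitemaps = []
    for line in robots_txt.splitlines():
        # Split "Field: value" on the first colon only, so the colon
        # inside "http://" stays part of the value.
        field, sep, value = line.strip().partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


if __name__ == "__main__":
    # Placeholder robots.txt contents on a placeholder domain.
    sample = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: http://www.yourdomain.com/sitemap.xml\n"
    )
    print(find_sitemaps(sample))
```

A robots.txt file can hold more than one Sitemap line, which is why the sketch returns a list rather than a single URL.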
October 9th, 2007 at 6:52 am
[...] Obviously, you won’t see much here until after you’ve added a sitemap. If you’ve got a WordPress blog, the sitemap generator plugin is a quick way to automate the process. Once you’ve added your sitemap, you can see when it was submitted, when it was downloaded, and whether or not there were any errors reading it. Don’t forget to add a sitemaps autodiscovery line to your robots.txt to make sure other search engines can find it also. [...]
June 5th, 2008 at 9:20 am
[...] Submit the site to search engines. This used to mean manual submissions. Nowadays, though, it means compiling an XML sitemap and submitting it to Google Webmaster Tools, Yahoo Site Explorer, and MSN Live Search Webmaster Center, as well as adding it to your robots.txt file for smaller search engines. This should get the spiders indexing all of your content as quickly as possible. [...]