Archive for May 5th, 2011

05
May
11

Technorati

EQ8UTB6JZMSY

That’s the secret code Technorati wants me to publish to prove I am IN CONTROL of this blog. Well, I am. Sort of.

05
May
11

10 Step SEO # 10: Sitemaps

Okay, then. We’re closing out our series on 10 Step SEO with something that a lot of folks neglect, disparage, or misunderstand: the lowly sitemap.

Just like the name says, a sitemap is a map to a web site. Simple. Really, it’s just a list of links that point to all the pages. There are two kinds of sitemap we’ll discuss here: HTML and XML. Today, we’ll focus on HTML sitemaps; next Thursday, we’ll get into the more exotic XML variety.

HTML sitemaps are placed on a regular web page and typically linked from the home page. Conceivably, this sort of reference could prove useful for people who are looking for specific information on a large or complex site. And in fact, up to 25% of internet users used to rely on sitemaps at least some of the time to find content. I say “used to” because that number hit its height in about 2002. Since then, sitemap use has declined steadily, to around 7% in 2008 and somewhat less than that today. So if nobody’s really using your HTML sitemap, why do you need it?

The answer is, of course, search marketing. Search spiders aren’t very smart (as we’ve noted here before). They have trouble following certain kinds of links and reading some sorts of link text. Sometimes they get trapped in loops they can’t get out of. Sometimes they index vast numbers of dynamically generated pages that don’t really exist. Sometimes they skip entire sections of a site. An HTML sitemap—properly designed—provides an easy set of pathways into the site for spiders to follow.

And by properly designed, we mean that HTML sitemaps should:

  • be made of nothing but text links.
  • contain no links other than the map links (no need for normal page navigation here).
  • be built on a logical structure (for example, categories first, then sub-categories, then the product pages under each).

  • contain no more than 50 links per page*. If you have more links than that, you should separate your sitemap into multiple levels. For instance, you can make sitemap_1 with links to category and sub-category pages only, plus additional links to further sitemaps that cover the sub-category/product pages. Or, you can break them into multiple pages based on a simple alphabetical sort. Or whatever. Just be sure to link multiple sitemaps to each other.
  • be prominently linked from the home page. Sitemap links in the footer are okay as long as there isn’t much content above them on the page. We prefer linking to the sitemap from above the main header whenever possible.
  • be kept up to date. Larger sites should consider investing in scripts or other technology to automate their sitemaps. Generating them dynamically will ensure that the links are always current.
  • be linked from every indexable page on the site. If a spider comes into your site for the first time from somewhere in the deep pages, this will help it crawl back up the structure to find the rest of them.
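
To make the 50-link rule and the alphabetical split a little more concrete, here is a minimal Python sketch of a sitemap generator. It is one possible approach, not the only one, and it rests on a few assumptions of ours: the function name build_sitemap_pages, the sitemap_N.html file names, and the decision to reserve two of the 50 links on each page for "previous" and "next" sitemap links are all hypothetical choices made for illustration.

```python
from html import escape

MAX_LINKS = 50  # per-page limit suggested in the list above


def build_sitemap_pages(links, max_links=MAX_LINKS):
    """Split (title, url) pairs into text-only HTML sitemap pages.

    Returns a dict mapping a (hypothetical) file name to its HTML body.
    Each page holds at most `max_links` links, counting the links that
    connect the sitemap pages to each other.
    """
    if max_links < 3:
        raise ValueError("max_links must leave room for previous/next links")

    # Simple alphabetical sort by link text.
    ordered = sorted(links, key=lambda pair: pair[0].lower())

    # Reserve two slots per page for the previous/next sitemap links.
    per_page = max_links - 2
    chunks = [ordered[i:i + per_page] for i in range(0, len(ordered), per_page)]
    if not chunks:
        chunks = [[]]

    names = [f"sitemap_{n}.html" for n in range(1, len(chunks) + 1)]
    pages = {}
    for index, chunk in enumerate(chunks):
        # Nothing but text links: one list item per page on the site.
        items = [
            f'<li><a href="{escape(url)}">{escape(title)}</a></li>'
            for title, url in chunk
        ]
        # Link the sitemap pages to each other.
        if index > 0:
            items.append(
                f'<li><a href="{names[index - 1]}">Previous sitemap page</a></li>'
            )
        if index < len(chunks) - 1:
            items.append(
                f'<li><a href="{names[index + 1]}">Next sitemap page</a></li>'
            )
        pages[names[index]] = "<ul>\n" + "\n".join(items) + "\n</ul>\n"
    return pages
```

Run from a scheduled script against a list of pages pulled from your database, something like this would also take care of the "kept up to date" point, since the pages are rebuilt from current data each time.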

* Reason: some spiders will only follow and index a set number of links per page, always starting from the first they encounter. This number is different for different search engines, but 50 seems pretty safe. This is also the reason to place your sitemap link at the top of the page. If your homepage has 50+ links on it before you get to the sitemap, some engines may never see it.
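
If you want to check where your sitemap link actually falls on a page, a rough sketch like the following can count the links that come before it. It uses Python's standard html.parser module; the names LinkCollector, sitemap_link_position and sitemap_link_is_early are our own, and the limit of 50 is just the rule of thumb from this post, not a number published by any search engine.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def sitemap_link_position(page_html, sitemap_href):
    """Return the 1-based position of the sitemap link among the page's
    links, or None if the page does not link to the sitemap at all."""
    collector = LinkCollector()
    collector.feed(page_html)
    collector.close()
    for position, href in enumerate(collector.hrefs, start=1):
        if href == sitemap_href:
            return position
    return None


def sitemap_link_is_early(page_html, sitemap_href, limit=50):
    """True if the sitemap link is among the first `limit` links."""
    position = sitemap_link_position(page_html, sitemap_href)
    return position is not None and position <= limit
```

Note that this only does an exact match on the href string, so a page that links to the sitemap with a different form of the URL (relative versus absolute, say) would need that form passed in instead.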

Next week we’ll end this mess once and for all with a discussion of the mysterious and elusive XML Sitemap Protocol.

Sources

Jakob Nielsen’s Alertbox (Jan. 6, 2002)
Jakob Nielsen’s Alertbox (Sept. 2, 2008)
Sitemap Usability (2008, PDF)
The Right Way to Think about Sitemaps (Aug. 9, 2007)