SEO Site Architecture, continued

Yesterday’s intro to site architecture and its association with SEO began with a discussion of organization. And that’s a great place to start. You’ve got all your content planned with a lovely little flow chart. You’ve got boxes and circles and lots of connector lines. Cool! Now all you have to do is figure out how you’re going to get it onto web pages.

Web publication strategies are many and various. They can, however, be broken into two main groups: static and dynamic.

In static web publishing, every page is created individually, typically by hand. First, a design is created, either for the page itself (often for home pages) or as a template (for inside pages). Then a developer starts putting in the content. Every image is set in place. Every page element is laid down according to the design; every link, every word of text. And that’s the way the page stays, until somebody deliberately steps in to change it. This strategy can be very time-consuming to manage in large websites, and sites built like this tend to stay, well, static. The plus side will be discussed a little further down.

Dynamic pages are all sorts of different. Dynamic pages are designed as templates, with some, most, or all of the content broken into discrete “areas.” Each area then has code placed in it that fills it with whatever content is indicated. All of that content lives on the server in a database. The content data can be updated manually or automatically. Some content areas might have different content rotated in and out on a schedule (daily, hourly, whatever) while other areas might update every time the browser is refreshed. Or not update at all. Today, most large sites are dynamic, and virtually every retail site is done that way. That’s because if you have 10,000 products that are always in a state of change, you simply cannot keep up with a static site. Dynamic sites can be managed by a database administrator (DBA), or more often, by lower-paid staff using a content management system (CMS). Whether you use a CMS or pay a DBA, dynamic content can be wonderful. It can take most of the suffering out of content maintenance. But there is a dark side.
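To make the template idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the template, the two content areas, and the product rows are made up, and a plain dictionary stands in for the real database.

```python
from string import Template

# Hypothetical page template with two named content "areas".
PAGE_TEMPLATE = Template("<h1>$title</h1>\n<p>$description</p>")

# Stand-in for the server-side database (assumed data): product ID -> content row.
PRODUCTS = {
    129854: {"title": "Unflavored Spruce Oil, 8 oz.", "description": "Plain spruce oil."},
    832938: {"title": "Minty Oak Oil", "description": "Oak oil with mint flavoring."},
}


def render_product_page(product_id: int) -> str:
    """Fill each content area of the template with the row for one product."""
    row = PRODUCTS[product_id]
    return PAGE_TEMPLATE.substitute(title=row["title"], description=row["description"])


if __name__ == "__main__":
    print(render_product_page(129854))
```

Change a row in the data and every page built from the template changes with it. Nobody touches the page itself, which is the whole appeal.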

The problem with dynamic website content—and the advantage of using a static page strategy—lies with an artifact of the resulting site architecture. Cue the scary music (duh duh duuuuuuuh): the Gobblety-gook URL.

When a browser renders a page for a person (or spider) to view, the file and folder structure leaves its imprint in the URL. Suppose you’re looking for a page selling Hyrum’s 8 oz. Unflavored Spruce Oil. You go to HyrumsOils.com and in the URL bar you may see http://www.hyrumsoils.com/index.html.

In a static website, any file that lives right in the root folder will show up right after the .com and a slash. Index.html is a file, of course. Next you click on the link to Spruce Oils. Now, the URL looks like http://www.hyrumsoils.com/spruceoil/spruce.html. You still have your root folder, and now you also have another folder, “spruceoil,” and another file, “spruce.html.” Click on “Unflavored Oils” and of course what you’ll see is

(example 1)
http://www.hyrumsoils.com/spruceoil/unflavored/unflavspruce.html

http://www.hyrumsoils.com/oakoil/flavored/mintyoak.html

And so on.
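The folder-to-URL relationship in example 1 can be sketched in a few lines of Python. The root folder "/var/www/hyrumsoils" is an assumption for illustration, and so is the rule that an empty path falls back to index.html (a common default, but it depends on the server).

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed location of the site's root folder on the server (hypothetical).
SITE_ROOT = "/var/www/hyrumsoils"


def static_file_for(url: str, root: str = SITE_ROOT) -> PurePosixPath:
    """Map a static URL to the file it would be served from."""
    # The URL path mirrors the folder structure under the root folder.
    path = urlsplit(url).path.lstrip("/") or "index.html"
    return PurePosixPath(root) / path


if __name__ == "__main__":
    print(static_file_for("http://www.hyrumsoils.com/spruceoil/unflavored/unflavspruce.html"))
    print(static_file_for("http://www.hyrumsoils.com/oakoil/flavored/mintyoak.html"))
```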

And that’s how every website ever built would look if simplicity ruled the world. It doesn’t. Instead, you have complicated dynamic database-driven sites (that you also don’t need to understand). These sites speak their own language as they communicate back and forth with the database, the CMS, and the browser. This language is most decidedly NOT English.

They instead throw out URLs that might make the address of the above page look more like:
(example 2)
http://www.hyrumsoils.com?sort=asdfdk&stuffID=129854&crap=oilfarmlks093898&none

http://www.hyrumsoils.com?sort=asdfdk&stuffID=832938&crap=oilfarmlks093899&some

That IS something you should care about. Because in example 1, a spider can look at the URL and see that your page is about spruce oil that is unflavored. That gives the destination page a unique presence, a clear place in the universe, and a boost for search terms like “unflavored spruce oil.” In example 2, nobody can tell wtf is going on.
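Here is a rough way to see the difference for yourself. This is only a sketch of the idea, not how any real search engine works: the hypothetical function path_keywords pulls the readable words out of a URL’s path and ignores everything else.

```python
from urllib.parse import urlsplit


def path_keywords(url: str) -> list[str]:
    """Return the folder and file names found in a URL's path."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if segments:
        # Drop the file extension (".html") from the last segment.
        segments[-1] = segments[-1].rsplit(".", 1)[0]
    return segments


if __name__ == "__main__":
    example_1 = "http://www.hyrumsoils.com/spruceoil/unflavored/unflavspruce.html"
    example_2 = "http://www.hyrumsoils.com?sort=asdfdk&stuffID=129854&crap=oilfarmlks093898&none"
    print(path_keywords(example_1))
    print(path_keywords(example_2))
```

Example 1 hands over “spruceoil,” “unflavored,” and “unflavspruce.” Example 2 hands over an empty list, because it has no path at all.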

But it’s worse than that. Because spiders can ignore or discount the stuff that comes after a “?” In both examples, the URLs point to the same two pages. So if your URLs look like example 2, and a spider drops what comes after the “?,” to that spider those two page addresses look like:

http://www.hyrumsoils.com
http://www.hyrumsoils.com

And that’s going to present some problems when it comes to indexing them.
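You can reproduce the collapse with Python’s standard URL tools. The function name without_query is made up for this sketch, and dropping the query string is the assumption being illustrated, not a claim about what every crawler does.

```python
from urllib.parse import urlsplit, urlunsplit


def without_query(url: str) -> str:
    """Rebuild a URL with everything after the "?" thrown away."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    page_a = "http://www.hyrumsoils.com?sort=asdfdk&stuffID=129854&crap=oilfarmlks093898&none"
    page_b = "http://www.hyrumsoils.com?sort=asdfdk&stuffID=832938&crap=oilfarmlks093899&some"
    print(without_query(page_a))
    print(without_query(page_b))
    print(without_query(page_a) == without_query(page_b))
```

Two different products, one address.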

Now, you don’t really need to reduce every website to super-simple root/folder/file constructions. That would be boring, time consuming, expensive, and probably impossible for any site over 100 or so pages.

What you should do is insist that your website, however it works, resolves URLs to something that makes sense and uses useful keywords. The most useful CMSs have features built in that make it easy to convert the gobblety-gook of example 2 to the simple joy of example 1. If you’re not using a CMS, the server software you use probably still has a method to fix the problem (Apache uses the dreaded “mod_rewrite” module). Your web development team will either know what you mean and make it happen, or they won’t. If they don’t, you might reconsider hiring them.
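The idea behind URL rewriting is simpler than its reputation. The sketch below is plain Python, not mod_rewrite syntax or any CMS’s actual feature. The lookup table is hypothetical, and pairing these paths with these product IDs is an assumption: the visitor and the spider see the clean path, and the site quietly translates it into the query the database wants.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical lookup table: clean, keyword-rich path -> internal product ID.
CLEAN_PATHS = {
    "/spruceoil/unflavored/unflavspruce.html": 129854,
    "/oakoil/flavored/mintyoak.html": 832938,
}


def rewrite(path: str) -> Optional[str]:
    """Translate a clean path into the internal query-string address."""
    product_id = CLEAN_PATHS.get(path)
    if product_id is None:
        return None  # Unknown path: let the server answer with a 404.
    return "/?" + urlencode({"stuffID": product_id})


if __name__ == "__main__":
    print(rewrite("/spruceoil/unflavored/unflavspruce.html"))
```

A real setup would generate the table from the database instead of typing it in by hand, but the translation step is the same.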

Tomorrow, we’ll look at the final SEO-relevant architectural element, navigation.
