Overview

Enhancing your site with an SSP sitemap involves two steps:

  1. Authoring — Write a sitemap file in XML.
  2. Deployment — Ensure that your visitors’ browsers can find it.

If you have problems, check the troubleshooting advice.

Keep in mind that different visitors might see your sitemap differently, depending on the software they use.

These software packages are referred to as implementations of the Standard-Sitemap Protocol, and behaviour that varies across them is said to be implementation-defined.

Sitemap-generating software

Links for potential developers

WordPress looks like the best bet to start with. It’s widely used, open-source, and seems to have a native navigation-hierarchy system. Therefore, that system is probably already widely used, and available to any plug-in we write for it.

It will likely be more difficult to extend a CMS that provides a hierarchy only through a particular extension. In that case the hierarchy is specified by the user as configuration of that extension, and rival extensions each have their own independent configuration. We’d want our extension to read the configuration that the user has already written, and we wouldn’t want to have to write one sitemap extension per existing navigation extension.

Webmaster FAQ

Why not look for a sitemap file at a fixed location on each site, like /sitemap.xml?

This presumes that every virtual host provides a sitemap at /sitemap.xml. If the user agent were to visit a host where it is not provided, it would leave a 404 entry in that site’s access log. Now scale that up to thousands or millions of user agents doing that routinely, for a URI that the host never claimed to serve.

Also consider the precedent set when one application stakes a claim on /sitemap.xml. Developers of other applications would feel they had the right to stake claims on other names, and would have to co-ordinate to avoid clashes. Furthermore, we’d all be intruding on the namespace of every webserver. Webserver administrators have not agreed to any such name reservation, so it would be presumptuous of us to adopt a mechanism that foisted it upon them.

What names are allowed for the sitemap file?

Any – you state the URL in the <link> tag or HTTP header anyway. :-)

How do I write URLs containing an ampersand ‘&’?

The sitemap is written in XML, so you need to escape & as &amp;. For example:

http://www.example.com/page.php?content=products&page=2

…becomes:

http://www.example.com/page.php?content=products&amp;page=2
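If you generate your sitemap from a script rather than by hand, the escaping can be left to a library. The following is a minimal sketch using Python's standard xml.sax.saxutils module; the helper name escape_for_sitemap is made up for this illustration and is not part of any sitemap tooling.

```python
from xml.sax.saxutils import escape


def escape_for_sitemap(url: str) -> str:
    """Escape &, < and > so the URL can be placed in XML text.

    Hypothetical helper for illustration; it simply wraps the
    standard-library escape() function.
    """
    return escape(url)


raw = "http://www.example.com/page.php?content=products&page=2"
escaped = escape_for_sitemap(raw)
print(escaped)
```

Apply the escaping exactly once: running an already escaped URL through the function again would turn &amp; into &amp;amp;. If the URL goes into an attribute value, quoteattr from the same module also takes care of the surrounding quotes.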

Should I enter error pages (404.html etc.) into the sitemap?

No. The sitemap should only contain pages that the user wants to see intentionally.

May I enter the same page/URL more than once?

Yes, there’s nothing stopping you if the very same page belongs to different categories. However, rather than having two separate <item>s, you can define one, and then reference it from another location:

<item name="Businesses">
  <item xml:id="shops" name="Shops" />
</item>

<item name="Places to visit">
  <external url="#shops" />
</item>

This is important if the item in question has subitems of its own. They will be available at both places in the sitemap.

Note that such ‘trees with overlapping branches’ can make certain forms of navigation unclear. For example, there are now two paths from the ‘Shops’ page up to the top, so which one should be taken when displaying breadcrumbs?
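To see the ambiguity concretely, here is a rough sketch in Python that lists every path from the top level down to a referenced item. It assumes the example above is wrapped in a root <sitemap> element with no XML namespace, and it only follows <external> elements that point directly at the target; the function name paths_to is invented for this illustration. A real implementation may resolve references differently.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SITEMAP = """
<sitemap>
  <item name="Businesses">
    <item xml:id="shops" name="Shops" />
  </item>
  <item name="Places to visit">
    <external url="#shops" />
  </item>
</sitemap>
"""


def paths_to(root, target_id):
    """Return every chain of item names from the top level to the target."""
    by_id = {el.get(XML_ID): el for el in root.iter("item") if el.get(XML_ID)}
    target = by_id[target_id]
    paths = []

    def walk(element, trail):
        for child in element:
            if child.tag == "item":
                here = trail + [child.get("name")]
                if child is target:
                    paths.append(here)
                walk(child, here)
            elif child.tag == "external" and child.get("url") == "#" + target_id:
                # The reference makes the target appear at this position too.
                paths.append(trail + [target.get("name")])

    walk(root, [])
    return paths


root = ET.fromstring(SITEMAP)
shop_paths = paths_to(root, "shops")
for path in shop_paths:
    print(" > ".join(path))
```

With the example sitemap this yields two breadcrumb candidates, one through ‘Businesses’ and one through ‘Places to visit’. Which of them an implementation displays is up to that implementation.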

Must the top level be a single <item>?

No. If you do not see your website as having a single superordinate page to which all others ‘belong’, you can put several entries in the sitemap XML file as direct children of the root <sitemap> element. See the Google sitemap example.

What’s the difference between <item> and <group>?

<group> is used to logically separate several <item>s from the others. It can have a name, but not a URI. Like an <item>, it can have a relation, but without introducing a new hierarchy level.

Different implementations can choose to reflect this in different ways. Our Firefox extension does this in its sidebar:

  • An <item> is a clickable entry in the Standard-Sitemap Navigator sidebar. If it has no url, it just opens and closes, to reveal or hide the contained entries.
  • If a <group> features a "name", it will be shown as a header above its contained <item>s.
  • A <group> is always expanded; the user cannot collapse it.

In the menu, groups are simply separated from each other in the same submenu.

A rule of thumb, if you’re converting in-band navigation into a sitemap, is to use <item> for each link in your navigation bar(s), and use <group> for each navigation bar (if you have multiple).
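If your navigation bars already exist as data, this rule of thumb can be applied mechanically. The sketch below is only an illustration in Python: the navbars dictionary and the function name navbars_to_sitemap are invented, the output carries no XML namespace, and it assumes that <group> elements may sit directly under the root <sitemap> element. Check the generated file against the implementation you are targeting.

```python
import xml.etree.ElementTree as ET


def navbars_to_sitemap(navbars):
    """Build one <group> per navigation bar and one <item> per link."""
    root = ET.Element("sitemap")
    for bar_name, links in navbars.items():
        group = ET.SubElement(root, "group", name=bar_name)
        for label, url in links:
            ET.SubElement(group, "item", name=label, url=url)
    return root


# Hypothetical navigation data: bar name -> list of (label, URL) pairs.
navbars = {
    "Main": [("Home", "/"), ("Products", "/page.php?content=products&page=2")],
    "Footer": [("Contact", "/contact")],
}

xml_text = ET.tostring(navbars_to_sitemap(navbars), encoding="unicode")
print(xml_text)
```

ElementTree escapes the ampersand in the URL when it serialises the attribute, so the links can be passed in unescaped.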

My website features multiple language versions, but the structure is not identical for all languages. What can I do?

For example, you can use:

<item> <!-- home page -->
  <variant lang="en"  />
  <variant lang="de"  />
  <item lang="en" name="Software"  />
  <item lang="de" name="Hardware"  />
</item>

In this case, the home page is available in English and German; the Software page is available in English only; the Hardware page in German only.
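The reading described above can be checked with a few lines of Python. This is a sketch under assumptions: the example is wrapped in a root <sitemap> element without a namespace, a name="Home" attribute has been added to the home page item purely so that it can be printed, and the function name languages is invented here. It collects the lang values from an item's <variant> children and from the item's own lang attribute.

```python
import xml.etree.ElementTree as ET

SITEMAP = """
<sitemap>
  <item name="Home">
    <variant lang="en" />
    <variant lang="de" />
    <item lang="en" name="Software" />
    <item lang="de" name="Hardware" />
  </item>
</sitemap>
"""


def languages(item):
    """Languages an item is offered in, per its variants and lang attribute."""
    langs = {variant.get("lang") for variant in item.findall("variant")}
    if item.get("lang"):
        langs.add(item.get("lang"))
    return langs


root = ET.fromstring(SITEMAP)
available = {
    item.get("name"): sorted(languages(item)) for item in root.iter("item")
}
print(available)
```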

If your website is huge and you don’t want to find out which page corresponds to which (in the other languages), you may also treat the different languages as different "roots". We strongly advise against it, because the user will not be able to switch directly to the same page in another language. It is possible though:

<item lang="en"> <!-- English home page -->
   <!-- thousands of <item>s -->
</item>
<item lang="de"> <!-- German home page -->
   <!-- thousands of <item>s -->
</item>

However, the recommended method is:

<item> <!-- home page -->
  <variant lang="en"  />
  <variant lang="de"  />
  <item> <!-- first sub-page -->
    <variant lang="en"  />
    <variant lang="de"  />
  </item>
   <!-- thousands of further <item>s with their <variant>s -->
</item>

Does it matter where I place the role items in the hierarchy?

Yes, to some degree, as the item must be reached at some stage to be recognised. All items which are children of the root element will be reached, as will any child of a reached item, including those referenced by <external>.
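The reachability rule can be pictured as a simple traversal. The Python sketch below is one possible reading, not how any particular implementation works: it starts from the children of the root, descends into every reached <item>, and follows same-document <external> references by their xml:id. The sample sitemap, the ‘Bakery’ entry and the function name reached_item_names are invented for illustration, and the elements are assumed to carry no XML namespace.

```python
from collections import deque
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SITEMAP = """
<sitemap>
  <item name="Businesses">
    <item xml:id="shops" name="Shops">
      <item name="Bakery" />
    </item>
  </item>
  <item name="Places to visit">
    <external url="#shops" />
  </item>
</sitemap>
"""


def reached_item_names(root):
    """Names of all items reached from the root, each listed once."""
    by_id = {el.get(XML_ID): el for el in root.iter("item") if el.get(XML_ID)}
    queue = deque(root)  # start with the direct children of the root
    seen = set()
    names = []
    while queue:
        element = queue.popleft()
        if element.tag == "external":
            # Follow a same-document reference such as url="#shops".
            element = by_id.get(element.get("url", "").lstrip("#"))
        if element is None or element.tag != "item" or id(element) in seen:
            continue
        seen.add(id(element))
        names.append(element.get("name"))
        queue.extend(element)  # children of a reached item are reached too
    return names


reached = reached_item_names(ET.fromstring(SITEMAP))
print(reached)
```

In this sketch ‘Shops’ is reached twice, once directly and once through the reference, but it is recorded only once.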

Before I added an XML sitemap, all my pages already contained an extensive in-band hierarchy of navigation links. If I get rid of them, visitors without sitemap-aware browsers will get lost. How do I support visitors both with and without sitemap capability?

Use stylesheets to hide the in-band navigation if, and only if, the sitemap is displayed. Add this standard idiom for class-based styling to the top level of your sitemap file:

<class-change
   xmlns:html="http://www.w3.org/1999/xhtml"
   elem="/html:html/html:body"
   attr="class" prefix="sitemap" />

If your in-band navigation is wrapped up like this (for example):

<div class="nav">
  Your existing navigation system here
</div>

…you can add styles to disable the in-band navigation when the sitemap is displayed:

body.sitemap-over-29 .nav {
  display: none;
}

Don’t add sitemap-over-29 to your HTML yourself. If the visitor’s browser includes sitemap software, it will add classes like that for you. If the browser doesn’t understand sitemaps, no such classes will be added, so your in-band navigation will not be hidden.

How do I represent my Web application?

That you have to figure out on your own!