Enhancing your site with an SSP sitemap involves two steps:
If you have problems, check the troubleshooting advice.
You should keep in mind that different visitors might see your sitemap differently:
The different software packages that visitors use are referred to as implementations of the Standard-Sitemap Protocol, and the behaviour varying across them is said to be implementation-defined.
WordPress looks like the best bet to start with. It’s widely used, open-source, and seems to have a native navigation-hierarchy system. Therefore, that system is probably already widely used, and available to any plug-in we write for it.
It will likely be more difficult to extend a CMS that provides a hierarchy only through a particular extension. In that case the hierarchy is specified by the user as configuration of that extension, and rival extensions have their own independent configurations. We’d want our extension to read the configuration the user has already written, and we wouldn’t want to have to write one sitemap extension per existing navigation extension.
How do I specify the URL of my SSP sitemap?
Choose one of these two methods:
<link> inside the HTML file
Declare this in the <head> of each HTML file:
<link rel="schema.stdmap" href="http://standard-sitemap.org/2007/ns">
<link rel="stdmap.location" href="/sitemap.xml">
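To illustrate how a client might read this declaration, here is a minimal Python sketch. It is not taken from any actual SSP implementation: the class and function names are hypothetical, and it assumes the client first looks up the prefix bound to the namespace via rel="schema.<prefix>" and then reads rel="<prefix>.location". It compares rel values case-sensitively, which is a simplification.

```python
from html.parser import HTMLParser

SSP_NAMESPACE = "http://standard-sitemap.org/2007/ns"


class SitemapLinkParser(HTMLParser):
    """Collects (rel, href) pairs from every <link> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []  # (rel, href) tuples in document order

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            attributes = dict(attrs)
            rel = attributes.get("rel")
            href = attributes.get("href")
            if rel and href:
                self.links.append((rel, href))


def find_sitemap_in_html(html_text):
    """Return the declared sitemap URL, or None if nothing is declared."""
    parser = SitemapLinkParser()
    parser.feed(html_text)
    # Step 1: collect every prefix bound to the SSP namespace.
    prefixes = [
        rel[len("schema."):]
        for rel, href in parser.links
        if rel.startswith("schema.") and href == SSP_NAMESPACE
    ]
    # Step 2: look for rel="<prefix>.location" using one of those prefixes.
    for rel, href in parser.links:
        for prefix in prefixes:
            if rel == prefix + ".location":
                return href
    return None


page = """<html><head>
<link rel="schema.stdmap" href="http://standard-sitemap.org/2007/ns">
<link rel="stdmap.location" href="/sitemap.xml">
</head><body></body></html>"""

print(find_sitemap_in_html(page))
```

Because the prefix is looked up rather than hard-coded, the sketch keeps working if a site picks a prefix other than stdmap.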
Send out these two HTTP header fields with each file:
Opt: "http://standard-sitemap.org/2007/ns"; ns=15
15-Location: /sitemap.xml
In Apache’s .htaccess, use:
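# LoadModule is only valid in the main server configuration (e.g. httpd.conf), not in .htaccess:
LoadModule headers_module modules/mod_headers.so
# These two directives can go in .htaccess:
Header set Opt "\"http://standard-sitemap.org/2007/ns\"; ns=15"
Header set 15-Location /sitemap.xml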
In PHP, use:
<?php
header('Opt: "http://standard-sitemap.org/2007/ns"; ns=15');
header('15-Location: /sitemap.xml');
?>
You can, of course, adjust various strings as necessary. You don’t have to use stdmap or 15 as the namespace prefixes in each case; just choose names that don’t clash with anything you’re already using.
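For the HTTP header method, the following Python sketch shows how a client could find the sitemap URL without assuming the prefix 15. It is an illustration only: the function name is hypothetical, the headers are assumed to arrive as a plain dict, and the regular expression handles only the simple "namespace"; ns=<digits> form shown above.

```python
import re

SSP_NAMESPACE = "http://standard-sitemap.org/2007/ns"


def find_sitemap_in_headers(headers):
    """Return the sitemap URL declared through Opt, or None.

    `headers` is a dict mapping header field names to their values.
    """
    # HTTP field names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    opt = lowered.get("opt")
    if opt is None:
        return None
    # Each declaration is a quoted namespace URI followed by ns=<prefix>.
    for match in re.finditer(r'"([^"]*)"\s*;\s*ns\s*=\s*(\d+)', opt):
        if match.group(1) == SSP_NAMESPACE:
            prefix = match.group(2)
            return lowered.get(prefix + "-location")
    return None


response_headers = {
    "Opt": '"http://standard-sitemap.org/2007/ns"; ns=15',
    "15-Location": "/sitemap.xml",
}
print(find_sitemap_in_headers(response_headers))
```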
Why not look for a sitemap file at a fixed location on each site, like /sitemap.xml?
This presumes that every virtual host provides a sitemap at /sitemap.xml. If the user agent were to visit a host where it is not provided, it would leave a 404 entry in that site’s access log. Now scale that up to thousands or millions of user agents doing that routinely, for a URI that the host never claimed to serve.
Also consider the trend set by a decision of one application to stake a claim on /sitemap.xml. Developers on other applications would feel they had the right to stake claims on other names, and would have to co-ordinate to avoid clashes. Furthermore, we’d all be intruding on the namespaces of every webserver. Webserver administrators have not agreed to any such name reservation, so it would be presumptuous of us to adopt a mechanism that foisted it upon them.
What names are allowed for the sitemap file?
Any – you state the URL in the <link> tag or HTTP header anyway. :-)
How do I write URLs containing an ampersand ‘&’?
The sitemap is written in XML, so you need to escape & as &amp;. For example:
http://www.example.com/page.php?content=products&page=2
…becomes:
http://www.example.com/page.php?content=products&amp;page=2
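If you generate the sitemap with a script, you can leave the escaping to a library instead of doing it by hand. As a small sketch, Python's standard xml.sax.saxutils module does this. Using url as the attribute name follows the attribute mentioned elsewhere in this FAQ; the rest of the markup is only an illustration.

```python
from xml.sax.saxutils import escape, quoteattr

url = "http://www.example.com/page.php?content=products&page=2"

# escape() replaces &, < and > with their XML entities.
escaped = escape(url)
print(escaped)

# quoteattr() escapes the value and wraps it in quotes, ready for an attribute.
attribute = quoteattr(url)
print("<item url=" + attribute + " />")
```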
Should I enter error pages (404.html etc.) into the sitemap?
No. The sitemap should only contain pages that the user wants to see intentionally.
May I enter the same page/URL more than once?
Yes, there’s nothing stopping you if the very same page belongs to different categories. However, rather than having two separate <item>s, you can define one, and then reference it from another location:
<item name="Businesses">
  <item xml:id="shops" name="Shops" />
</item>
<item name="Places to visit">
  <external url="#shops" />
</item>
This is important if the item in question has subitems of its own. They will be available at both places in the sitemap.
Note that such ‘trees with overlapping branches’ can make certain forms of navigation unclear. For example, there are now two paths from the ‘Shops’ page up to the top, so which one should be taken when displaying breadcrumbs?
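The ambiguity can be made concrete with a short Python sketch that lists every path from the top of the sitemap to a shared item. It is a simplified illustration: the function name is hypothetical, the root <sitemap> element is written without a namespace, and only same-document references of the form #id are followed.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to this full attribute name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SITEMAP = """<sitemap>
  <item name="Businesses">
    <item xml:id="shops" name="Shops" />
  </item>
  <item name="Places to visit">
    <external url="#shops" />
  </item>
</sitemap>"""


def breadcrumb_paths(root, target_id):
    """Return every list of item names leading from the top to target_id."""
    target = next(
        (e for e in root.iter("item") if e.get(XML_ID) == target_id), None
    )
    if target is None:
        return []
    target_name = target.get("name")
    paths = []

    def walk(element, trail):
        for child in element:
            if child.tag == "item":
                here = trail + [child.get("name")]
                if child.get(XML_ID) == target_id:
                    paths.append(here)
                walk(child, here)
            elif child.tag == "external" and child.get("url") == "#" + target_id:
                # The reference stands in for the shared item at this position.
                paths.append(trail + [target_name])

    walk(root, [])
    return paths


root = ET.fromstring(SITEMAP)
for path in breadcrumb_paths(root, "shops"):
    print(" > ".join(path))
```

An implementation that shows breadcrumbs has to pick one of the returned paths, for example the one the visitor actually navigated through.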
Must the top level be a single <item>?
No. If you do not see your website as having a single superordinate page to which all others ‘belong’, you can put several entries in the sitemap XML file as direct children of the root <sitemap> element. See the Google sitemap example.
<group> is used to logically separate several <item>s from the others. It can have a name, but not a URI. Like an <item>, it can have a relation, but without introducing a new hierarchy level.
Different implementations can choose to reflect this in different ways. Our Firefox extension does this in its sidebar:
An <item> is a clickable entry in the Standard-Sitemap Navigator sidebar. If it has no url, it just opens and closes, to reveal or hide the contained entries.
If a <group> features a "name", it will be shown as a header above its contained <item>s.
A <group> is always expanded; the user cannot collapse it.
In the menu, groups are simply separated from each other in the same submenu.
A rule of thumb, if you’re converting in-band navigation into a sitemap, is to use <item> for each link in your navigation bar(s), and use <group> for each navigation bar (if you have multiple).
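As a rough sketch of that rule of thumb, the following Python builds a sitemap from a description of navigation bars: one <group> per bar and one <item> per link. The function name, the input structure and the sample bars are made up for illustration, and the root element is written without a namespace declaration, which a real sitemap may need.

```python
import xml.etree.ElementTree as ET


def navbars_to_sitemap(navbars):
    """Build a <sitemap> with one <group> per bar and one <item> per link.

    `navbars` maps a bar name to a list of (link text, URL) pairs.
    """
    root = ET.Element("sitemap")
    for bar_name, links in navbars.items():
        group = ET.SubElement(root, "group", {"name": bar_name})
        for text, url in links:
            ET.SubElement(group, "item", {"name": text, "url": url})
    return root


# Hypothetical navigation bars of a small site.
navbars = {
    "Main": [("Home", "/"), ("Products", "/page.php?content=products&page=2")],
    "Footer": [("Contact", "/contact.html")],
}
root = navbars_to_sitemap(navbars)
ET.indent(root)  # pretty-printing only; needs Python 3.9 or later
print(ET.tostring(root, encoding="unicode"))
```

ElementTree escapes the ampersand in the URL when it serialises the attribute, so no manual escaping is needed here.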
My website features multiple language versions, but the structure is not identical for all languages. What can I do?
For example, you can use:
<item> <!-- home page -->
  <variant lang="en" … />
  <variant lang="de" … />
  <item lang="en" name="Software" … />
  <item lang="de" name="Hardware" … />
</item>
In this case, the home page is available in English and German; the Software page is available in English only; the Hardware page in German only.
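To show how an implementation might interpret such a sitemap, here is a Python sketch that lists the pages visible for one language. It rests on several assumptions: the name and url attributes on <variant> are hypothetical stand-ins for the attributes elided above, the function name is made up, an item without any language information is treated as available in every language, and sub-items of an unavailable item are skipped.

```python
import xml.etree.ElementTree as ET

# The name/url attributes on <variant> are assumptions for this sketch.
SITEMAP = """<sitemap>
  <item>
    <variant lang="en" name="Home" url="/en/" />
    <variant lang="de" name="Startseite" url="/de/" />
    <item lang="en" name="Software" url="/en/software" />
    <item lang="de" name="Hardware" url="/de/hardware" />
  </item>
</sitemap>"""


def names_for_language(element, lang):
    """Return the names of all items under `element` available in `lang`."""
    names = []
    for item in element.findall("item"):
        variants = item.findall("variant")
        if variants:
            # Multi-language item: pick the variant for the requested language.
            match = next((v for v in variants if v.get("lang") == lang), None)
            name = match.get("name") if match is not None else None
        elif item.get("lang") in (None, lang):
            # Single-language item, or one with no language restriction.
            name = item.get("name")
        else:
            name = None
        if name is not None:
            names.append(name)
            names.extend(names_for_language(item, lang))
    return names


root = ET.fromstring(SITEMAP)
print(names_for_language(root, "en"))
print(names_for_language(root, "de"))
```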
If your website is huge and you don’t want to find out which page corresponds to which (in the other languages), you may also treat the different languages as different "roots". We strongly advise against it, because the user will not be able to switch directly to the same page in another language. It is possible though:
<item lang="en"> <!-- english home page -->
  … <!-- thousands of <item>s -->
</item>
<item lang="de"> <!-- german home page -->
  … <!-- thousands of <item>s -->
</item>
However, the recommended method is:
<item> <!-- home page -->
  <variant lang="en" … />
  <variant lang="de" … />
  <item> <!-- first sub-page -->
    <variant lang="en" … />
    <variant lang="de" … />
  </item>
  … <!-- thousands of further <item>s with their <variant>s -->
</item>
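The benefit of the recommended method is that the variants of one page sit next to each other, so a client can switch languages without leaving the page. The Python sketch below illustrates the lookup. It assumes each <variant> carries a url attribute (the real attributes are elided above), and the function name and sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

# The url attribute on <variant> is an assumption for this sketch.
SITEMAP = """<sitemap>
  <item>
    <variant lang="en" url="/en/" />
    <variant lang="de" url="/de/" />
    <item>
      <variant lang="en" url="/en/products" />
      <variant lang="de" url="/de/produkte" />
    </item>
  </item>
</sitemap>"""


def switch_language(root, current_url, target_lang):
    """Return the URL of the current page in target_lang, or None."""
    for item in root.iter("item"):
        variants = item.findall("variant")
        if any(v.get("url") == current_url for v in variants):
            # Found the item for the current page; look for its sibling variant.
            for variant in variants:
                if variant.get("lang") == target_lang:
                    return variant.get("url")
            return None
    return None


root = ET.fromstring(SITEMAP)
print(switch_language(root, "/en/products", "de"))
```

With separate per-language roots, no such lookup is possible, because nothing in the sitemap connects a page to its translation.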
Does it matter where I place the role items in the hierarchy?
How do I represent my Web application?
That you have to figure out on your own!