For the sitemap to be usable by your visitors, you have to do two things:

  1. Publish the sitemap in your webspace.
  2. Link to the sitemap from your pages.

Publishing

You must place your sitemap in the site’s webspace, and configure the server to send it correctly.

MIME type

Serve your sitemap as application/xml or text/xml. In the Apache HTTPd webserver, if your sitemap has the suffix .xml, you need:

AddType application/xml .xml

Compression

Sitemaps can contain a fair amount of metadata about every page in the site. But as XML they are also quite repetitive (element and attribute names, whitespace), and should compress well. Feel free to use HTTP compression if your sitemap turns out to be large. Compression is optional, though: a sitemap can be served perfectly well without it.

Most modern browsers can receive compressed files over HTTP, and automatically decompress them when they arrive. HTTP provides two mechanisms for this:

Transfer encoding
Transfer encoding is applied hop-by-hop (for example, on each leg of a route that passes through proxies). The first hop could be compressed, the next uncompressed, and the one after that compressed again with a different scheme. This mechanism is negotiated by the software at each hop, and is not something a site author normally configures.
Content encoding
Content encoding is normally applied end-to-end: compressed at the server and decompressed at the client (there’s nothing to stop proxies interfering, except that they’re “not supposed to”). This is the mechanism you should configure as a user of a server, as it is known to work. HTTP recognises the encodings gzip, deflate and compress; gzip is the most widely supported.

The possibilities for using content encoding are:

automatic compression
The server detects whether the client can handle compression, and compresses the file on the fly. The server might also cache the compressed version for later.
precompression
The files are published to the server in compressed form, and the server is told to indicate that they are already compressed.
automatic decompression
The same as precompression, but the server also decompresses automatically for clients which can’t handle it.

Automatic compression

It’s possible to get a server to compress files automatically. Apache does this through the third-party module mod_gzip (written for Apache 1.3), or through the standard module mod_deflate (Apache 2), which, despite its name, sends responses with the gzip content encoding. IIS has a built-in feature.
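
As a rough sketch of the automatic approach on Apache 2.x, the fragment below asks mod_deflate to compress XML responses for clients that advertise support. It assumes that mod_deflate is loaded (and, on Apache 2.4, mod_filter, which provides AddOutputFilterByType there), and that .htaccess files are allowed to override FileInfo settings. Treat it as a starting point rather than a tested configuration.

```apache
# Apache HTTPd 2.x, httpd.conf or .htaccess (assumed setup, see above)
<IfModule mod_deflate.c>
    # Compress responses whose media type is one of the two
    # recommended for sitemaps.
    AddOutputFilterByType DEFLATE application/xml text/xml
</IfModule>
```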

Precompression

You may prefer to compress the files yourself and store them on the server in compressed form. This saves the server the overhead of compressing on the fly, and the extra storage needed to cache the compressed version. You just need to tell the server to send them with an HTTP header field (Content-Encoding) indicating that they are encoded for the purpose of end-to-end compression. In Apache:

AddEncoding gzip .gz

(This assumes that your pre-compressed files are identified by a conventional .gz suffix, but any appropriate suffix will do.)
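
Putting the MIME type and the encoding together, a pre-compressed sitemap might be configured as sketched below. The file name sitemap.xml.gz is only an assumption for illustration. Apache reads both suffixes, so the file should go out as application/xml with a gzip content encoding. One thing to watch for: if the server configuration also maps .gz to a media type of its own, that mapping wins over .xml because it is the rightmost suffix, and you may need to remove it.

```apache
# Apache HTTPd .htaccess, assuming a file named sitemap.xml.gz
AddType application/xml .xml
AddEncoding gzip .gz

# Only needed if an inherited AddType directive (for example
# "AddType application/x-gzip .gz") gives .gz its own media type,
# which would otherwise replace the type derived from .xml.
RemoveType .gz
```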

Mozilla (Firefox), Opera and Safari will handle this. IE will too, provided it is using HTTP/1.1; check its settings for both direct and proxy connections. GoogleBot will also cope with it, as will carefully written Java applications.

Automatic decompression

If you’re doing precompression but want greater compatibility with older browsers, and you can have the server upgraded, the third-party mod_gunzip module claims to decompress files on the fly when it detects that the client cannot cope with the compressed form. This ought to be better than automatic compression, as decompression should have a smaller overhead than compression.

You need to hook the module into the processing of your pre-compressed files:

AddHandler send-gunzipped .gz

(Note that this is not particularly relevant for sitemaps at the moment, as this is meant to support IE, but we don’t have anything for IE to process sitemaps anyway.)

For an Apache2 solution, have a look at “Trying to emulate mod_gunzip with Apache 2 Filters”, especially the comments.

Multiple languages

While the sitemap format already has multilingual features, an alternative feature provided by HTTP, namely “Content Negotiation”, can be exploited instead.

In this scheme, you write several versions of each page in your site in different languages, e.g. index.en.html for English, index.de.html for German. You then tell your server what languages these pages are in:

# Apache HTTPd .htaccess
AddLanguage en .en
AddLanguage de .de

…and that they should be automatically selected according to the visitor's preference:

Options +MultiViews

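Now, if the URL index is accessed, the browser, the server and any intervening proxies will co-operate to select the version (index.en.html or index.de.html) most suitable for the visitor, according to the language preferences configured in the browser. Note that with MultiViews the URL must omit the suffixes being negotiated: a request for index.html would only match files named index.html.*, such as index.html.en.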

This mechanism also works fine for sitemaps, so you can write sitemap.en.xml for the English sitemap, and sitemap.de.xml for the German one.
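
For sitemaps, the pieces above might be combined as in the following sketch. It assumes the two files are named sitemap.en.xml and sitemap.de.xml, that mod_negotiation is loaded, and that .htaccess files may override Options and FileInfo. The last two directives are an optional addition, not part of the scheme described above: they make English the fallback when the visitor’s preferences match neither variant.

```apache
# Apache HTTPd .htaccess, assuming sitemap.en.xml and sitemap.de.xml
AddType application/xml .xml
AddLanguage en .en
AddLanguage de .de
Options +MultiViews

# Optional (an assumption for illustration): serve English rather than
# an error when no variant matches the visitor's language preferences.
LanguagePriority en de
ForceLanguagePriority Fallback
```

With this naming, the negotiable URL is sitemap (without a suffix), since MultiViews only looks for files that begin with the requested name followed by a dot.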

Linking the sitemap in

When someone visits your site, the browser normally just fetches a page from it and displays it. For the visitor to see your sitemap, each page must also tell the browser where to fetch the sitemap from. There are two ways to do this:

  • Via HTML — Add some lines to the <head> section of each page. This can be tedious if you edit your pages manually, but should be relatively easy for sites built from some sort of template.
  • Via HTTP — Add an HTTP header field to every page you serve. This is very convenient if your web server supports it, not only because you don’t have to edit all your pages, but because the sitemap will be available on all pages in your site, not just HTML ones.