For the sitemap to be usable by your visitors, you have to do two things:
You must place your sitemap in the site’s webspace, and configure the server to send it correctly.
Serve your sitemap as application/xml or text/xml. In the Apache HTTPd webserver, if your sitemap has the suffix .xml, you need:
AddType application/xml .xml
Sitemaps can contain a fair amount of metadata about every page in the site. But as XML they are also quite repetitive (element and attribute names, whitespace), and should compress well. Feel free to use HTTP compression if your sitemap turns out to be big. Note, however, that compression is not required to serve a sitemap file.
Most modern browsers can receive compressed files over HTTP, and automatically decompress them when they arrive. HTTP provides two mechanisms for this:
The possibilities for using content encoding are:
It’s possible to get a server to compress files automatically. Apache does this through the third-party module mod_gzip, or through the standard module mod_deflate (which, despite its name, sends its output with the gzip content encoding). IIS has a built-in feature.
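As a rough illustration of automatic compression, the following sketch asks mod_deflate to compress XML responses on the fly. It assumes an Apache 2.x server with mod_deflate loaded (and, on Apache 2.4, mod_filter, which provides AddOutputFilterByType there); the media types listed are simply the two recommended above for sitemaps.

```apache
# Assumption: mod_deflate is loaded (plus mod_filter on Apache 2.4)
# Compress XML responses for clients that announce gzip support
AddOutputFilterByType DEFLATE application/xml text/xml
```

Clients that do not send a suitable Accept-Encoding header should continue to receive the uncompressed file.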
You may prefer to compress the files yourself, and store them on the server in compressed form, saving the server from the overhead of compression and the extra storage to cache the compressed version. You just need to tell the server to serve them with an HTTP header field indicating that they are encoded for the purpose of end-to-end compression. In Apache:
AddEncoding gzip .gz
(This assumes that your pre-compressed files are identified by a conventional .gz suffix, but any appropriate suffix will do.)
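Putting the pieces together, a pre-compressed sitemap might be configured as in the sketch below. The file name sitemap.xml.gz is hypothetical, and the RemoveType line is only an assumption about your setup: some default configurations map .gz to a media type of its own, which would otherwise take precedence over the type given by .xml.

```apache
# Hypothetical layout: the pre-compressed sitemap is stored as sitemap.xml.gz
# The .xml part of the name supplies the Content-Type
AddType application/xml .xml
# Assumption: drop any inherited media type for .gz so that .xml decides the type
RemoveType .gz
# The .gz part of the name supplies Content-Encoding: gzip
AddEncoding gzip .gz
```

With this in place, a request for sitemap.xml.gz should be answered with an XML media type and a gzip content encoding, which a capable client decompresses transparently.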
Mozilla (Firefox), Opera and Safari will handle this. IE will also if it’s using HTTP/1.1 — check its settings both for direct and proxy connections. GoogleBot will also cope with it, as can careful Java applications.
If you’re doing precompression, but want greater compatibility with older browsers, and if you can have the server upgraded, the 3rd-party mod_gunzip module claims to be able to decompress files on-the-fly, if it detects that the client won’t be able to cope with it. This ought to be better than automatic compression, as decompression should have a smaller overhead than compression.
You need to hook the module into the processing of your pre-compressed files:
AddHandler send-gunzipped .gz
(Note that this is not particularly relevant for sitemaps at the moment, as this is meant to support IE, but we don’t have anything for IE to process sitemaps anyway.)
For an Apache2 solution, have a look at “Trying to emulate mod_gunzip with Apache 2 Filters”, especially the comments.
While the sitemap format already has multilingual features, an alternative feature - provided by HTTP - can be exploited instead, namely “Content Negotiation”.
In this scheme, you write several versions of each page in your site in different languages, e.g. index.en.html for English, index.de.html for German. You then tell your server what languages these pages are in:
# Apache HTTPd .htaccess
AddLanguage en .en
AddLanguage de .de
…and that they should be automatically selected according to the visitor’s preference.
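One common way to enable this selection in Apache is the MultiViews option of mod_negotiation. The sketch below assumes that mod_negotiation is loaded and that the server permits Options to be overridden in .htaccess; it is not necessarily the exact configuration this article had in mind.

```apache
# Apache HTTPd .htaccess
# Assumption: mod_negotiation is loaded and "AllowOverride Options" is in effect
Options +MultiViews
```

MultiViews applies when the requested file does not exist, and then considers files whose names begin with the requested name followed by further extensions. A request for sitemap, for example, would consider sitemap.en.xml and sitemap.de.xml as candidates.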
Now, if the URL index.html is accessed, the browser, the server, and the intervening proxies will co-operate to select the version (index.en.html or index.de.html) most suitable for the visitor, according to the language preferences configured in the browser.
This mechanism also works fine for sitemaps, so you can write sitemap.en.xml for the English sitemap, and sitemap.de.xml for the German one.