Insertion

The extension inserts itself into the browser chrome with the conventional chrome.manifest file. This merges invariant content from the chrome/content folder, locale-specific content from the chrome/locale folder, and skin content from the chrome/skin folder, all under the URI hierarchy chrome://standardsitemap/. The modules directory is loaded under resource://modules.standard-sitemap.org/, and the other components will likely be renamed to match this later.

The extension then overlays the browser XUL with two of its own files. overlay.xul adds some libraries to every window, plus menu items, shortcuts, icons and buttons. monitor.xul adds a sitemap monitor to every window. (Hmm, maybe this should just be merged into overlay.xul now.)

Shared components

The structure of the extension is designed to permit multiple independent renditions of a sitemap for the same tab. One part of the current implementation injects a sidebar into page content, and another provides a pop-up menu. Functionality that is common to both renditions is abstracted into a library of utility functions, a library for parsing sitemaps, and a monitor for tracking which sitemap should currently be in effect.

Library sslib.jsm

sslib.jsm is a collection of functions useful in various places within the sitemap, e.g. building URI objects, storing preferences.

There is only one instance of the library, rather than one per window. You can get hold of it with:

Components.utils.import("resource://modules.standard-sitemap.org/sslib.jsm");
var lib = Application.storage.get("org.standard-sitemap.library", null);

Do the import once at the start of your script. Because it is a code module, importing it more than once (e.g. once from each script) is safe.

Sitemap service

The file service.jsm defines some classes for obtaining sitemap XML files, and parsing them into a node hierarchy.

The StandardSitemapService class is instantiated once per application, and can be obtained through:

monitor.getService()  // See below to get monitor.

This resource is also available directly as a code module:

Components.utils.import("resource://modules.standard-sitemap.org/service.jsm");
var service = Application.storage.get("org.standard-sitemap.service", null);

…but you may as well get it from the monitor for the window you’re dealing with.

A rendering component should only need to read instances of the classes StandardSitemapService.Sitemap, StandardSitemapService.Sitemap.Node and StandardSitemapService.Sitemap.VariantData. It should never need to create them or cache them, so the module does not expose such things.

From a sitemap object, you can get an array of language codes supported by the sitemap:

var array = sitemap.GetLanguages();

This gives you the sitemap’s location as an nsIURI:

var location = sitemap.oSitemapURI;

This gives you the anonymous root sitemap node:

var oNode = sitemap.oSitemapNode;

You can get an array of nodes matching a URI with this:

var array = sitemap.GetNodesForURI(someString);

Role nodes

You can get various role nodes like this:

var search = sitemap.oSearch;
var home = sitemap.oHome;
var contact = sitemap.oContact;
var info = sitemap.oContentInfo;

If you have a search-data template pattern containing (say) q=%s, you can expand it with a search term, term (and any other macros supported by the data attribute), with:

var queryString = sitemap.ReplaceFields(pattern, term);

Nodes

A sitemap node contains variants and other attributes, and holds a number of child nodes. You can detect whether the node comes from a group or an item:

if (node.bIsGroup) {
  // The node is a group, not an item.
}

… and get other node qualities:

var role = node.sRole;
var relation = node.sRelation;

You can get all the child nodes of a node like this:

for each (var child in node.aChildNodes) {
  ...
}

For convenience, these three properties are also provided:

// node is a StandardSitemapService.Sitemap.Node obtained as above.

if (node.HasChildren) {
  // node.aChildNodes is not empty.
}

// Get the first child or null.
var first = node.FirstChild;

// Get the last child or null.
var last = node.LastChild;

You can iterate over a node’s parents with:

node.mParent.foreach(function(parent) { ... });

You can also iterate over its previous/next nodes with:

node.mPrev.foreach(function(parent, previous) { ... });
node.mNext.foreach(function(parent, next) { ... });

Variants

On a node, you can get its qualities for a given language preference with:

var langs = [ 'de', 'en-GB', 'en' ];
var content = node.GetContent(langs);

content['name'] // the node's ‘name’ quality
content['desc'] // the node's ‘description’ quality
content['addr'] // the node's ‘location’ quality (nsIURI)
content['searchdata'] // the node's ‘search template’ quality
content['searchmethod'] // the node's ‘search method’ quality
content['lang'] // the node's ‘language’ quality
content['format'] // the node's ‘content type’ quality

You can also obtain these more directly:

node.GetName(langs)
node.GetDescription(langs)
node.GetURI(langs) // an nsIURI
node.GetURISpec(langs) // a string
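
Putting the node and variant accessors together, a rendition might walk a subtree roughly as follows. This is only a sketch: outline and fakeNode are made-up names, aChildNodes is assumed to be an ordinary array, and the node is assumed to have a name (the anonymous root may not).

```javascript
// Sketch: collect an indented outline of a sitemap subtree.
// Uses only bIsGroup, aChildNodes and GetName(langs) from the node interface.
function outline(node, langs, depth, lines) {
  depth = depth || 0;
  lines = lines || [];
  var indent = new Array(depth + 1).join("  ");   // two spaces per level
  var marker = node.bIsGroup ? "+ " : "- ";       // groups and items look different
  lines.push(indent + marker + node.GetName(langs));
  for (var i = 0; i < node.aChildNodes.length; i++) {
    outline(node.aChildNodes[i], langs, depth + 1, lines);
  }
  return lines;
}

// Hypothetical stand-in for a real node, so the sketch can be tried
// outside the extension. Real nodes come from the sitemap service.
function fakeNode(name, isGroup, children) {
  return {
    bIsGroup: isGroup,
    aChildNodes: children || [],
    GetName: function (langs) { return name; }
  };
}

var tree = fakeNode("Products", true, [
  fakeNode("Widgets", false),
  fakeNode("Gadgets", false)
]);
var lines = outline(tree, ["en-GB", "en"]);
```

With a real sitemap, you would pass a node found under sitemap.oSitemapNode instead of the fake tree.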

Monitor

The sitemap monitor’s job is to determine which sitemap serves the current page. It keeps sitemaps cached, and merges display-level settings from different rendering components.

There is one monitor per window, accessed with:

var monitor = window["org.standard-sitemap.monitor"];

Each tab has an agent, which is accessible from the monitor:

var agent = monitor.getAgentForTab(someTab);
// or
var agent = monitor.getAgentForBrowser(gBrowser.getBrowserForTab(someTab));

Agents

An agent is associated with each tab. It keeps track of which sitemap is to be used for the tab’s current page.

agent.getLocation() gives the URI (an nsIURI) of the current page, as far as the agent is aware.

agent.getSitemapURI() gives the URI (an nsIURI) of the current sitemap.

You can get a StandardSitemapService.Sitemap object when it is available:

var f = function(sitemap) { ... };
agent.getSitemap(f);

You can be notified when a change occurs:

var f = function(bReload, sReason) { ... };
agent.addListener(f);
// ... later, when notifications are no longer wanted:
agent.removeListener(f);

bReload indicates that the user is forcing a reload. sReason gives the kind of notification:

location-change

The user has navigated to another URI. This might simply be a change within the current page, so this gives the listener the chance to update a display to reflect this, even if no further events happen.

in-site-change

The tab is now displaying a new page, but it is still in the same site, i.e. the sitemap URI has not changed.

sitemap-change

The sitemap has changed as a result of some navigation. It should be obtained again with agent.getSitemap(callback).
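
A rendering component might react to these three reasons along the following lines. This is a sketch, not the extension's own code: makeListener, and the view object with its highlight and rebuild methods, are hypothetical. Only getLocation and getSitemap are taken from the agent interface described here.

```javascript
// Sketch: build a listener that refreshes a rendition according to sReason.
// `view` is a hypothetical rendering object with highlight(uri) and
// rebuild(sitemap) methods. bReload is ignored in this sketch.
function makeListener(agent, view) {
  return function (bReload, sReason) {
    switch (sReason) {
    case "location-change":
    case "in-site-change":
      // Same sitemap, different position: just move the highlight.
      view.highlight(agent.getLocation());
      break;
    case "sitemap-change":
      // A different sitemap applies: fetch it again and redraw.
      agent.getSitemap(function (sitemap) {
        view.rebuild(sitemap);
      });
      break;
    }
  };
}
```

The returned function is what you would pass to agent.addListener, and later to agent.removeListener.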

Rendering components should report to the agent of the tab whose sitemap they are rendering, so that the agent can update the styles used on the current page according to how persistently the sitemap is displayed (see Conditional styling):

agent.setDisplayLevel("sidebar", visible ? 50 : sitemapAvailable ? 25 : 0);

An identifier (sidebar in this example) must be given to each rendering component, so that the agent can distinguish callers. The highest setting provided by all callers is applied to the page.

A future version of this function will allow the rendering component to specify the display level for separate portions of the sitemap, e.g. tree, roles, both.
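
The rule that the highest setting from all callers wins can be pictured with a small sketch. DisplayLevels is a made-up name, and the agent's real bookkeeping may differ. The sketch only shows the merge rule.

```javascript
// Sketch: remember the latest level reported by each caller and
// report the maximum over all of them.
function DisplayLevels() {
  this.levels = {};   // caller identifier -> latest level
}

// Record a caller's level and return the level now in effect.
DisplayLevels.prototype.set = function (id, level) {
  this.levels[id] = level;
  return this.effective();
};

// The highest level over all callers, or 0 if nobody has reported.
DisplayLevels.prototype.effective = function () {
  var max = 0;
  for (var id in this.levels) {
    if (this.levels.hasOwnProperty(id) && this.levels[id] > max) {
      max = this.levels[id];
    }
  }
  return max;
};

var merged = new DisplayLevels();
merged.set("sidebar", 50);   // sidebar visible
merged.set("menu", 25);      // menu only knows a sitemap is available
```

Because each caller overwrites only its own entry, hiding the sidebar drops the effective level to whatever the menu last reported.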

Rendering components

Sidebar

sidebar.xul and sidebar.js define the content and behaviour of the sidebar.

standardsitemapsidebar.xml injects sidebar.xul into the body of the page. When overlay.js is told about a new tab, it sets the URL of the sidebar’s content to sidebar.xul, but appends a fragment identifier. This identifier is actually the XML id of the notificationbox that represents the tab and encloses both the sidebar and the page contents. When the sidebar is loaded, sidebar.js extracts the fragment identifier, and uses it to find the notificationbox, and ultimately the browser object for the tab.
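
Extracting the id could look something like this. It is only a sketch: fragmentOf is a made-up name, the example URL path and panel id are assumptions, and the real sidebar.js may well read the fragment some other way.

```javascript
// Sketch: return the fragment identifier of a URL string,
// or null if the URL has none.
function fragmentOf(url) {
  var hash = url.indexOf("#");
  return hash < 0 ? null : url.substring(hash + 1);
}

// Hypothetical URL of the kind overlay.js might set on the sidebar.
var panelId = fragmentOf("chrome://standardsitemap/content/sidebar.xul#panel42");
// In the extension, the id would then be looked up in the browser window's
// document to reach the notificationbox, and from there the browser.
```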

Alternative code exists for Firefox 3.5.3 and later, but is disabled. In this scheme, overlay.js knows the panel id and has a reference to the browser for the tab it is being notified of, and sets them as user data of the <sidebar> element, which sidebar.js simply reads back. (It’s about time we used this and dropped the other mechanism.)

In either case, the sidebar object created by the script adds itself as user data of the browser, under the name org.standard-sitemap.sidebar.

Menu

menu.js implements a pop-up menu on the customizable-toolbar icon, which itself is specified by overlay.xul.

Auto-subgrouping algorithm

Auto-subgrouping is performed on certain lists of items that have already been declared to be in lexical order. For example, if you have about 900 items in alphabetical order, they can be split into 30 groups (a screenful, say), each containing 30 items. The names of the first-level groups can be derived from the names of their first and last subitems. How are the subgroup sizes determined?

First, we need the total number of items, N, and the challenge, C, the maximum menu size. If we’re asked to fit in no more items than the challenge allows, it’s no challenge at all, and we just throw every one in. We might have to use more than two levels to meet the challenge, so a recursive algorithm is in order, and we can use N ≤ C as the terminating condition.

Now we get an upper bound on the depth required. We repeatedly divide N by C until we get within C. This is like taking the base-C logarithm of N, but we’re careful to round up at each stage. We start with a depth d of 1, and increment for each division.

If we used C as the number of groups at the top level, and so on with each sublevel but the last, the last level could end up with an incongruent number of items per submenu, something much less than C. Instead of splitting the items into C top-level groups, we compute G, the dth root of N, and round up. (This should never be more than C.)
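
The depth bound and the top-level group count can be sketched as follows. The function names are made up, and the integer root search is just one way to take the dth root and round up without relying on floating-point results. The sketch assumes N ≥ 1 and C ≥ 2.

```javascript
// Upper bound on the number of menu levels: divide n by c, rounding up,
// until the result fits within c. Assumes n >= 1 and c >= 2.
function depthFor(n, c) {
  var d = 1;
  while (n > c) {
    n = Math.ceil(n / c);
    d++;
  }
  return d;
}

// Smallest integer g with g^d >= n, i.e. the d-th root of n rounded up.
// Integer arithmetic avoids rounding surprises from Math.pow(n, 1 / d).
function ceilRoot(n, d) {
  var g = 1;
  for (;;) {
    var p = 1;
    for (var i = 0; i < d; i++) {
      p *= g;
    }
    if (p >= n) {
      return g;
    }
    g++;
  }
}

var d = depthFor(900, 30);   // the 900-item example with a 30-item screenful
var g = ceilRoot(900, d);
```

For the 900-item example this gives a depth of 2 and 30 top-level groups. With 1000 items and the same challenge, the depth becomes 3 and the top level has 10 groups rather than 30.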

With G groups, we need N′=N÷G items per group on average. We’ll split this into N0=⌊N′⌋ and N1=⌈N′⌉. If we put just N0 into each subgroup, we will probably not get all items in, so we give N1 items to N−G×N0 of the subgroups, choosing them from the middle.
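
One way to lay out the sizes is sketched below. groupSizes is a made-up name, and centring the larger groups exactly like this is an assumption. All that is fixed is that the larger groups sit somewhere in the middle.

```javascript
// Sketch: sizes of g consecutive groups covering n items.
// Every group gets floor(n / g) items. The leftover items are handed out
// one each to a run of groups in the middle, which therefore hold
// ceil(n / g) items.
function groupSizes(n, g) {
  var n0 = Math.floor(n / g);
  var extra = n - n0 * g;                    // groups needing one more item
  var start = Math.floor((g - extra) / 2);   // first of the larger groups
  var sizes = [];
  for (var i = 0; i < g; i++) {
    var larger = i >= start && i < start + extra;
    sizes.push(larger ? n0 + 1 : n0);
  }
  return sizes;
}

var sizes = groupSizes(10, 4);
```

Here 10 items in 4 groups come out as 2, 3, 3, 2, which adds up to 10 with the larger groups in the middle.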

The name of each group is then derived from the names of the first and last items assigned to it. For example, if the first and last items are called Fred and Jim, the group can be called Fred…Jim. However, to keep these names short, we compare (say) Fred to the name of the last item of the previous group (say, Fran). These differ at the third character, so we can afford to shorten Fred to Fre. Likewise, if the first item of the next group is John, Jim can be shortened to Ji. The group is then named Fre…Ji.
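
The shortening rule can be sketched like this. The function names are made up, and the treatment of the first and last groups, which have no neighbour on one side, is an assumption: here they are cut to a single character.

```javascript
// Sketch: shortest prefix of `name` that differs from `neighbour`.
// With no neighbour (null), one character is assumed to be enough.
function distinctPrefix(name, neighbour) {
  if (neighbour === null) {
    return name.substring(0, 1);
  }
  var i = 0;
  while (i < name.length && i < neighbour.length &&
         name.charAt(i) === neighbour.charAt(i)) {
    i++;
  }
  // Keep everything up to and including the first differing character.
  return name.substring(0, i + 1);
}

// Name a group from its first and last items, the last item of the
// previous group and the first item of the next group.
function groupName(first, last, prevLast, nextFirst) {
  return distinctPrefix(first, prevLast) + "\u2026" +
         distinctPrefix(last, nextFirst);
}

var label = groupName("Fred", "Jim", "Fran", "John");
```

With the names from the example above, label comes out as Fre…Ji.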

Finally, we reapply subgrouping to each group recursively, which will stop if we’ve already met the challenge.
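
The whole procedure, without the naming step, might be put together as below. This is a self-contained sketch rather than the extension's own code: subgroup is a made-up name, groups are represented as nested arrays, the larger groups are centred by assumption, and C is assumed to be at least 2.

```javascript
// Sketch: split a sorted list into nested arrays so that no level
// holds more than c entries. Assumes c >= 2.
function subgroup(items, c) {
  var n = items.length;
  if (n <= c) {
    return items.slice();          // challenge already met
  }

  // Upper bound on depth: divide by c, rounding up, until within c.
  var d = 1;
  for (var m = n; m > c; m = Math.ceil(m / c)) {
    d++;
  }

  // g = d-th root of n, rounded up, found with integer arithmetic.
  var g = 1;
  for (;;) {
    var p = 1;
    for (var k = 0; k < d; k++) {
      p *= g;
    }
    if (p >= n) {
      break;
    }
    g++;
  }

  // Group sizes: floor(n / g) each, plus one extra for a middle run.
  var n0 = Math.floor(n / g);
  var extra = n - n0 * g;
  var start = Math.floor((g - extra) / 2);

  var groups = [];
  var pos = 0;
  for (var i = 0; i < g; i++) {
    var size = (i >= start && i < start + extra) ? n0 + 1 : n0;
    // Recurse: the subgroup may itself still exceed the challenge.
    groups.push(subgroup(items.slice(pos, pos + size), c));
    pos += size;
  }
  return groups;
}

// Try it with 1000 placeholder item names and a challenge of 30.
var names = [];
for (var j = 0; j < 1000; j++) {
  names.push("item" + j);
}
var menu = subgroup(names, 30);
```

With 1000 items and a challenge of 30, this produces three levels of 10 entries each, instead of a top level of 30 with uneven lower levels.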