On a recent project, we had to create a sitemap for each of the domains that we have set up on the site. We came across some problems that we had to resolve because of the nature of our pURL setup.
Goals
- We want the front page of each subdomain to be added to the sitemap, and we want to be able to set the rules for them on the XMLSitemap settings page.
- We want to make sure that the URLs we add to the subdomain sitemaps no longer show up in the main domain’s sitemap.
Problems
1) Only On The Primary Domain
The XML sitemap module only creates one sitemap based on the primary domain.
2) Prefixes not Distinguished
Our URLs for nodes are set up so that nodes can be prefixed with our subdomain (pURL modifier), and XMLSitemap doesn’t see our prefixes as being different sites. At this point, all nodes are added to every single domain’s sitemap.
3) URL Formats
Our URLs are not in the correct format when being added to the sitemap. Our URLs should look like http://subdomain.domain.org/*. However, because we are prefixing them, they show up as http://domain.org/subdomain/*. We want our URLs to look like they are from the right subdomain and not all coming from the base domain.
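To make the two formats concrete, here is a minimal standalone sketch of the rewrite we are after. The function name example_prefix_to_subdomain_url() and the list of subdomains are made up for illustration and are not part of XMLSitemap or pURL. It also ignores query strings and ports, so treat it as a sketch rather than production code.

```php
<?php
/**
 * Hypothetical helper, for illustration only.
 *
 * Turns a prefixed URL such as http://domain.org/subdomain/about into
 * http://subdomain.domain.org/about when the first path segment is one
 * of the known subdomains. Any other URL is returned unchanged.
 */
function example_prefix_to_subdomain_url($url, array $subdomains) {
  $parts = parse_url($url);
  if ($parts === FALSE || !isset($parts['scheme'], $parts['host'])) {
    return $url;
  }
  // Split the path into segments; the first one is the candidate prefix.
  $path = isset($parts['path']) ? ltrim($parts['path'], '/') : '';
  $segments = explode('/', $path);
  $prefix = array_shift($segments);
  if (!in_array($prefix, $subdomains, TRUE)) {
    return $url;
  }
  // Move the prefix in front of the host and keep the rest of the path.
  return $parts['scheme'] . '://' . $prefix . '.' . $parts['host'] . '/' . implode('/', $segments);
}

// Assumed subdomain name, matching the placeholder used in the text above.
echo example_prefix_to_subdomain_url('http://domain.org/subdomain/about', array('subdomain'));
// Prints http://subdomain.domain.org/about
```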
Solution
We were able to add the ability to create sitemaps for each of the 15 domains by adding the XMLSitemap domain module. The XMLSitemap domain module allows us to define a domain for each sitemap, generate a sitemap, and serve it on the correct domain.
We applied xmlsitemap-dont-write-empty-element-in-xml-sitemap-file-2545050-3.patch to prevent empty elements from being written to the sitemap.
Then we implemented hook_xmlsitemap_element_alter() inside our own custom module. It looks something like this:
<?php
/**
 * Implements hook_xmlsitemap_element_alter().
 *
 * Replace "mymodule" with the machine name of your custom module.
 */
function mymodule_xmlsitemap_element_alter(array &$element, array $link, $sitemap) {
  // Base URL of the sitemap being generated, e.g. http://subdomain.domain.org.
  $domain = $sitemap->uri['options']['base_url'];
  $url_parts = explode('//', $domain);
  $parts = explode('.', $url_parts[1]);
  $subdomain = array_shift($parts);
  // First path segment of the link; this is the pURL prefix, if there is one.
  $current_parts = explode('/', $link['loc']);
  $current_prefix = array_shift($current_parts);
  $modifiers = _get_core_modifiers();
  // Checks to see if we are on a valid subdomain from our pURL modifiers.
  if (array_key_exists($subdomain, $modifiers)) {
    // Checks to see if we are not on the correct subdomain
    // and that we do have a prefix (fixes front page).
    if ($current_prefix != $subdomain && $current_prefix != '') {
      // Empty out the element.
      $element = array();
      return;
    }
    // Our subdomain matches our prefix, build our correct URL. Only the
    // leading prefix is dropped, and exactly one slash separates the
    // domain from the path.
    $path = implode('/', $current_parts);
    $element['loc'] = rtrim($domain, '/') . '/' . $path;
  }
  elseif (array_key_exists($current_prefix, $modifiers)) {
    // We are on our main domain, remove elements from
    // prefixes that are subdomains.
    $element = array();
  }
}

/**
 * Helper function for getting the subdomains from the database cache.
 */
function _get_core_modifiers() {
  if ($cache = cache_get('subdomains')) {
    return $cache->data;
  }
  $result = db_query("SELECT id, value FROM {purl} WHERE provider = 'og_purl_provider'")->fetchAllAssoc('value');
  // Cache the list of subdomains for one day.
  cache_set('subdomains', $result, 'cache', time() + 86400);
  return $result;
}
?>
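If you want to check the keep-or-drop rules without bootstrapping Drupal, the branching in the hook can be restated as a small pure function. This is only a sketch: example_keep_link() and the subdomain names 'alpha' and 'beta' are invented for illustration, and the real hook reads its list of subdomains from the {purl} table instead of a hard-coded array.

```php
<?php
/**
 * Hypothetical, Drupal-free restatement of the hook's branching.
 *
 * Returns TRUE when a link with the given path should stay in the sitemap
 * that is being generated for $sitemap_subdomain.
 */
function example_keep_link($sitemap_subdomain, $path, array $known_subdomains) {
  $segments = explode('/', $path);
  $prefix = $segments[0];
  if (in_array($sitemap_subdomain, $known_subdomains, TRUE)) {
    // Subdomain sitemap: keep the front page (empty prefix) and any path
    // that carries this subdomain's own prefix.
    return $prefix === '' || $prefix === $sitemap_subdomain;
  }
  // Main-domain sitemap: drop anything that belongs to a subdomain.
  return !in_array($prefix, $known_subdomains, TRUE);
}

// Assumed subdomain names, for illustration only.
$known = array('alpha', 'beta');
var_dump(example_keep_link('alpha', 'alpha/node/1', $known)); // bool(true)
var_dump(example_keep_link('alpha', 'beta/node/2', $known));  // bool(false)
var_dump(example_keep_link('alpha', '', $known));             // bool(true)
var_dump(example_keep_link('www', 'alpha/node/1', $known));   // bool(false)
var_dump(example_keep_link('www', 'node/3', $known));         // bool(true)
```

Note that, as in the hook, an unprefixed path such as node/3 is dropped from a subdomain's sitemap and only shows up on the main domain.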
If you have any questions or suggestions, feel free to drop a comment below!