Better WordPress Google XML Sitemaps

The first WordPress XML Sitemap plugin that comes with comprehensive support for Sitemapindex and Multi-site. Extend functionality via flexible modules, not just hooks!

Welcome to the first WordPress sitemap plugin that has support for both sitemapindex and Multi-site websites! You will no longer have to worry about the 50,000 URL limit or the time it takes for a sitemap to be generated. This plugin is fast, consumes far fewer resources and can be extended via your very own modules (yes, no hooks needed!).

Before moving on to the documentation, which can be rather long and boring, I think it’s better to show you the actual sitemapindex this plugin can generate ;).

Plugin Features

New in 1.2.0!

The long-awaited BWP GXS 1.2.0 has finally been released with a new and very powerful feature: Google News Sitemap creation! All you have to do is click on the newly added tab (News sitemap), enable the module, choose some news categories as well as their genres and you’re ready to go. Your news sitemap can also be used to ping Search Engines individually if you want. And of course, whenever you publish a new post in a news category, all selected Search Engines will be pinged!

Sitemapindex Support

What’s so great about a sitemapindex, you might ask? A Sitemap Index, as its name suggests, is a kind of sitemap that lets you group multiple sitemap files inside it. A Sitemap Index therefore gives you many benefits, such as the possibility to bypass the 50,000 URL limit of a single sitemap file (you can have 10 custom sitemaps with 10,000 URLs each), or the possibility to make generation much faster (because each sitemap is requested separately and is built by its own module).
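To make the grouping idea concrete, here is a minimal sketch in plain PHP that splits a long URL list into sitemap-sized parts. The function name and the `_partN.xml` naming are assumptions made up for this illustration; they are not how the plugin names or builds its files.

```php
<?php
// Hypothetical sketch (not plugin code): split a long URL list into parts
// small enough for one sitemap each, keyed by an invented file name.
function gxs_example_split_sitemap(array $urls, $limit, $base_name)
{
    $parts = array();
    foreach (array_chunk($urls, $limit) as $i => $chunk) {
        $parts[$base_name . '_part' . ($i + 1) . '.xml'] = $chunk;
    }
    return $parts;
}

// Build 100,000 example URLs and split them into parts of 10,000.
$urls = array();
for ($i = 1; $i <= 100000; $i++) {
    $urls[] = 'http://example.com/?p=' . $i;
}
$parts = gxs_example_split_sitemap($urls, 10000, 'post');
echo count($parts), "\n";                   // 10
echo count($parts['post_part1.xml']), "\n"; // 10000
```

A sitemap index would then list the ten part files instead of the 100,000 URLs themselves.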

Multi-site Support

Each website within your network will have its own sitemapindex and sitemaps. For sub-domain installation, your sitemapindex will appear at http://sub-domain.example.com/sitemapindex.xml. For sub-folder installation, your sitemapindex will appear at http://example.com/sub-folder/sitemapindex.xml. And of course, there’s always a sitemapindex for your main site, available at http://example.com/sitemapindex.xml. If you choose the sub-domain approach, each sub-domain can also have its own robots.txt. More on that in the Robots.txt section.

Custom sitemaps using modules

The unrivaled flexibility this plugin offers comes from the ability to define your own custom sitemaps using modules. Each module is actually a .php file that tells BWP Google XML Sitemaps how to build a sitemap file. You can extend the default modules or create completely new ones. This plugin also comes with a convenient base class for developing modules, with an easy-to-use and thoroughly documented API. Since modules can be defined by you, there’s no limit to what a sitemap can contain (for example you can bypass the 50,000 URL limit, as stated above). There’s one limitation, though: your imagination ;). Oh, did I mention that you can even use a module to create another sitemapindex?

Detailed Log and Debugging mode

Developing modules requires debugging, and this plugin makes that easy for any developer. There are two kinds of logs: the sitemap log and the build log. The sitemap log tells you what sitemaps you have and when they were last requested/updated, while the build log tells you what’s going on when a particular sitemap is built. The debug mode helps you trace errors in modules conveniently!

Now for a more complete feature list

  • New in 1.1.0!
    • This plugin can automatically split large post sitemaps into smaller ones. You can set a limit for each small sitemap.
    • You now have an External Pages’ sitemap, using which you can easily add links to pages that do not belong to WordPress to the Sitemap Index.
    • Exclude certain post types, taxonomies without having to use filters.
    • Hooks to default post-based and taxonomy-based modules to allow easier SQL query customization (you don’t have to develop custom modules anymore just to change minor things).
  • By default, this plugin allows you to create a sitemapindex that contains the following sitemaps: posts (including custom post types), static pages, taxonomy archives (including custom taxonomies) and date archives. You can of course enable or disable any of them.
  • Provides all basic options for creating sitemaps, such as:
    • Maximum number of items per sitemap
    • Default change frequency
    • Default priority
    • Minimum priority
  • Allows you to add the sitemap to WordPress’s virtual robots.txt. If you have a sub-domain Multi-site installation, each blog will have its own robots.txt. Of course this only works if you don’t have a physical robots.txt in the main site’s root.
  • Has full support for the WPMU Domain Mapping plugin.
  • Allows you to style your sitemaps using a built-in XSLT style sheet, or custom-made ones.
  • Allows you to compress your sitemaps, thus making them approximately 70% smaller.
  • Allows you to ping search engines (Google, Bing, Yahoo, and Ask) when you:
    • Publish a new post
    • Publish a draft
    • Publish a pending post
    • Publish a scheduled (future) post
  • Other advanced features:
    • Allows you to cache the sitemap for a certain period of time. You can choose to automatically or manually generate new cache sitemaps.
    • SQL cycling: if you have a lot of URLs, e.g. 30,000, in a single sitemap, it is recommended that you do not fetch all 30,000 items in one query, as that query would be very heavy. We use SQL cycling to split such a query into smaller ones, i.e. 30 queries of 1,000 items each.
    • Modules: a module is actually a generator that tells this plugin how to build a sitemap. Modules give you the ultimate flexibility when you want to make custom sitemaps or sitemapindexes. This will be covered in great detail in the Module API section.
    • To support modules, this plugin provides detailed logging system with a debug mode that helps you trace errors.

Plugin Usage

Basic Usage

Using this plugin is super easy, which basically requires two steps:

Step 1: Build your sitemap

After a successful activation (remember that you need WordPress 3.0 or higher for this plugin to work), you should see this in the configuration page:

BWP GXS Usage Step 1

Configuration Page

Now for the first time simply click on the only link you see in the ‘Your sitemaps’ section. You should now see the default sitemapindex with a simple yet nice XSLT style sheet attached to it. Your sitemapindex should look similar to this:

BWP GXS Sitemap Index

You can of course click on each individual sitemap to view its items, but maybe we should just leave that task to search engine bots, shouldn’t we ;)?

A custom post type Sitemap

Step 2: Submit your sitemap

For all your sitemaps to be crawled by search engine bots, you only have to copy the URL (e.g. http://example.com/sitemapindex.xml) and paste it into your webmaster tool of choice. Follow the links in the ‘Submit your sitemaps’ section on the configuration page for the correct links to some major search engines’ submission tools.

Please note that your sitemapindex/sitemaps will be updated only when ‘something’ or ‘someone’ requests them. In other words, when you publish new/draft/future/pending posts you will not notice any slowdown at all, as this plugin will only notify (or ping) the search engines you specify so that they know you have just updated your website/blog. When those search engines actually download the sitemaps, they will be updated.

So now you have submitted your sitemaps, what next? Just sit back and relax; Google will crawl your sitemapindex and all other sitemaps inside it within hours. 90% of URLs on this website were indexed by Google by the third day it was online, just FYI ;).

Advanced Usage

Google News Sitemap

What is a Google News Sitemap? It is yet another sitemap file that allows you to control which content you submit to Google News. By creating and submitting a Google News Sitemap, you’re able to help Google News discover and crawl your site’s articles.

With this module, you have an option to either include or exclude posts in certain categories. Let’s say you have 4 categories (A, B, C, and D) in which A and D are the ones you want to use as news categories. You can select category A and D, and choose to include, or select category B and C, and choose to exclude, simple as that!
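The equivalence of the two selection modes can be illustrated with a few lines of plain PHP. The function name and the 'include'/'exclude' mode strings are assumptions for this sketch only; they do not reflect the plugin's internal option names.

```php
<?php
// Hypothetical illustration (not plugin code): both selection modes
// produce the same set of news categories.
function gxs_example_news_categories(array $all, array $selected, $mode)
{
    return ('include' === $mode)
        ? array_values(array_intersect($all, $selected)) // keep only the selected ones
        : array_values(array_diff($all, $selected));     // keep everything except the selected ones
}

$all_categories = array('A', 'B', 'C', 'D');
$by_include = gxs_example_news_categories($all_categories, array('A', 'D'), 'include');
$by_exclude = gxs_example_news_categories($all_categories, array('B', 'C'), 'exclude');

echo implode(', ', $by_include), "\n"; // A, D
echo implode(', ', $by_exclude), "\n"; // A, D
```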

Each category can be assigned any of five pre-defined genres, as per Google News’s rules (http://support.google.com/news/publisher/bin/answer.py? ... swer=93992). You can select none or all of them; it’s totally up to you.

Last but not least, it is possible to map some categories in your language to Google News’s suggested keywords in English. To do that, use the following filter:

add_filter('bwp_gxs_news_keyword_map', 'bwp_gxs_news_keyword_map');

function bwp_gxs_news_keyword_map($map_array)
{
	$map_array = array(
		// Use this structure: 'category in your language' => 'Google News suggested keyword in English'
		'電視台' => 'television',
		'名人'=> 'celebrities'
	);
	return $map_array;
}

Robots.txt

(In case you don’t know what a robots.txt is, please have a look at this page.)

WordPress by default comes with a virtual robots.txt whose contents can be filtered by plugins or themes. BWP Google XML Sitemaps allows you to dynamically add a Sitemap: http://example.com/sitemapindex.xml entry to such file, thus allowing search engine crawlers to detect your sitemapindex automatically. If you, however, have a real robots.txt file in your website’s root, the sitemap entry won’t be added, so please keep that in mind.

If you’re on a Sub-domain Multi-site installation you will notice that each blog in your network can have its own robots.txt. So with the robots option enabled, if you browse to http://example.com/robots.txt, you will see something similar to this:

User-agent: *
Disallow:

Sitemap: http://example.com/sitemapindex.xml

and if you browse to http://sub-domain.example.com/robots.txt, you will see something like this:

User-agent: *
Disallow:

Sitemap: http://sub-domain.example.com/sitemapindex.xml

Cool, eh? Please note that there’s no http://example.com/sub-folder/robots.txt, so you won’t be able to add the sitemap entry dynamically in a Sub-folder Multi-site installation.
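For readers curious how a Sitemap: line can end up in the virtual robots.txt, here is a minimal sketch built on WordPress's robots_txt filter. It is not the plugin's own code: the callback name is hypothetical, and the fallback URL exists only so the sketch can run outside WordPress.

```php
<?php
// Illustrative callback (NOT the plugin's implementation; the name is hypothetical):
// append a Sitemap line to the virtual robots.txt output.
function gxs_example_add_sitemap_to_robots($output, $public)
{
    // $public is '0' when the site asks search engines not to index it.
    if ('0' == $public) {
        return $output;
    }
    $sitemap_url = function_exists('home_url')
        ? home_url('sitemapindex.xml')
        : 'http://example.com/sitemapindex.xml'; // fallback outside WordPress
    return rtrim($output) . "\n\nSitemap: " . $sitemap_url . "\n";
}

// Register the callback only when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('robots_txt', 'gxs_example_add_sitemap_to_robots', 10, 2);
}
```

Because the filter only runs for the virtual file, a physical robots.txt in the site root bypasses it entirely, which matches the caveat above.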

SQL Cycling

As mentioned above, it is better to query for items using several light queries rather than one heavy query. This plugin comes with a feature called SQL cycling, which means we run ‘small’ queries several times instead of running one heavy query. The only option you can use to control this is the number of items each cycle queries for. By default the number is 1000, but it can be increased to serve a higher number of items. Since it’s not good to go over 30 queries either, if you have 50,000 items in one sitemap, you should change that number to at least 1667 (50,000 / 30 ≈ 1,667).
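The arithmetic behind choosing a cycle size can be checked with a short sketch. Both functions are hypothetical helpers written for this illustration, not part of the plugin's API; the 30-query ceiling is simply the rule of thumb stated in this section.

```php
<?php
// Hypothetical helpers (not part of BWP GXS) that illustrate the arithmetic.

// Number of queries needed to fetch $total_items when each cycle fetches $per_cycle.
function gxs_example_cycle_count($total_items, $per_cycle)
{
    return (int) ceil($total_items / $per_cycle);
}

// Smallest per-cycle size that keeps the number of queries at or below $max_queries.
function gxs_example_min_per_cycle($total_items, $max_queries)
{
    return (int) ceil($total_items / $max_queries);
}

echo gxs_example_cycle_count(30000, 1000), "\n"; // 30
echo gxs_example_cycle_count(50000, 1000), "\n"; // 50
echo gxs_example_cycle_count(50000, 1600), "\n"; // 32
echo gxs_example_min_per_cycle(50000, 30), "\n"; // 1667
```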

Cache control

All sitemaps are cached for a default period of one hour. When a sitemap is requested, this plugin checks whether the cache is still valid and serves the cached copy if it is. Otherwise the sitemap is re-generated, assuming that you tell this plugin to do so by enabling ‘Enable auto cache re-generation’ (which should be enabled by default). If you disable that option, make sure you flush the cache once in a while.

The cache folder must be writable, i.e. you will have to CHMOD it to either 755 or 777. Don’t worry too much about security, though: a .htaccess file is provided along with this plugin to make sure your cache folder stays safe.
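A cache validity check of this kind can be sketched as follows. This assumes freshness is judged by the cache file's modification time, which is a guess for illustration; the function name is hypothetical and the plugin may track cache age differently.

```php
<?php
// Hypothetical helper (not the plugin's actual code): decide whether a cached
// sitemap file is still fresh enough to serve, based on its modification time.
function gxs_example_cache_is_valid($cache_file, $max_age_seconds, $now = null)
{
    $now = (null === $now) ? time() : $now;
    if (!is_file($cache_file)) {
        return false; // nothing cached yet, so the sitemap must be generated
    }
    $age = $now - filemtime($cache_file);
    return $age < $max_age_seconds;
}

// Usage: a file written just now is valid for the default one-hour period,
// but counts as expired when "now" is two hours later.
$file = tempnam(sys_get_temp_dir(), 'gxs');
var_dump(gxs_example_cache_is_valid($file, 3600));                // bool(true)
var_dump(gxs_example_cache_is_valid($file, 3600, time() + 7200)); // bool(false)
unlink($file);
```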

Logging and Debugging mode

It is recommended that you keep logging enabled all the time, because the logs tell you in detail how each sitemap was generated. All potential errors are also logged so that you can act immediately (such as problems with a specific module, 404 errors, etc.).

There’s a more advanced option called Enable Debugging, which is only recommended for experienced users. If debug is enabled, when fatal errors are encountered, this plugin will log the error messages and then print them on screen. In this mode no cache file is used, which makes debugging modules much easier. When you develop new modules, it is very important that you enable this option. Also note that there’s a good chance you will encounter an error called ‘Content Encoding Error’ with debug on. This is caused by PHP errors (such as ‘undefined index’) being printed on screen, which corrupts the XML file’s header (all XML files are dynamically built, you know). So, make sure the modules you develop are error-free ;).

Customization

Custom XSLT style sheet

The default XSLT style sheet should look okay in most cases, but if you’re in the (hopefully) rare case that requires a custom one, simply place an absolute URL in the ‘Custom XSLT stylesheet URL’ input, as shown below:

BWP Custom XSLT Style Sheet

Please make sure that you also have an XSLT style sheet for the sitemapindex in the same place where the above custom style sheet is found. For example, if your custom XSLT URL looks like this: http://example.com/my-xslt.xsl then you must have my-xsltindex.xsl in http://example.com/, or in other words, http://example.com/my-xsltindex.xsl must be available, too. Otherwise, your sitemapindex will not be able to load because of the missing style sheet.

Exclude certain posts, terms, etc.

In version 1.1.0 more hooks have been added to default modules to allow easier customization of SQL queries used to build your sitemaps. For example, to exclude certain posts using ID, you can do this:

add_filter('bwp_gxs_post_where', 'bwp_gxs_exclude_posts', 10, 2);

function bwp_gxs_exclude_posts($query_where_part, $post_type)
{
	// $post_type lets you easily exclude posts from specific post types
	switch ($post_type)
	{
		case 'post': return ' AND wposts.ID NOT IN (1,2,3,4) '; // the default post type
		case 'movie': return ' AND wposts.ID NOT IN (5,6,7,8) '; // the 'movie' post type
	}
	return '';
}

Remember to use wposts as the table alias and have some spaces before and after what you return, just to make sure it won’t corrupt the module’s query ;).

Similarly, to exclude terms from a specific taxonomy, you can basically do the same thing:

add_filter('bwp_gxs_term_exclude', 'bwp_gxs_exclude_terms', 10, 2);

// The function name must match the callback registered with add_filter() above
function bwp_gxs_exclude_terms($excluded, $taxonomy)
{
	// $taxonomy lets you easily exclude terms from specific taxonomies
	switch ($taxonomy)
	{
		case 'category': return array('cat-slug1', 'cat-slug2');
		case 'post_tag': return array('tag-slug1', 'tag-slug2');
	}
	return array('');
}

The only difference, as you can see above, is that you must return an array consisting of any term slug you would like to get rid of, instead of a part of query string in the post type example.

External Pages’ Sitemap

As of version 1.1.0, it will be easier for you to add external pages (links to pages from the same domain but not from WordPress) to the Sitemap Index. To do this, all you have to do is enable the External Pages’ sitemap in Sitemap Generator tab and then add this to your theme’s functions.php:

add_filter('bwp_gxs_external_pages', 'my_external_sitemap');

function my_external_sitemap()
{
	$external_pages = array(
		array('location' => home_url('link-to-page.html'), 'lastmod' => '06/02/2011', 'priority' => '1.0'),
		array('location' => home_url('another-page.html'), 'lastmod' => '05/02/2011', 'priority' => '0.8')
		// repeat this for any other pages you would like to add
	);
	return $external_pages;
}

You can set each page’s location, last modified date and priority. Change frequency will be calculated automatically using the last modified date you provide. By default, you will get something similar to this:

BWP GXS External Pages' Sitemap

Make custom sitemaps using Module API

Before we start building a custom sitemap, let’s talk about how the module system works. A module is simply a .php file that contains a module class with a unique name, i.e. just one module for one sitemap. In the module class you can use all API functions provided by the base module (BWP_GXS_MODULE), assuming that you choose to extend it.

Using API functions you will be able to do SQL cycling, calculate priority, calculate change frequency, etc., but the body of the module class needs to be your own code, i.e. you fetch the needed data in your own way and pass it back to this plugin to handle. It is therefore very easy to create a new module, because most of the time all you have to do is change the SQL query that is used to get content from the database.

Now if you open the module folder in bwp-google-xml-sitemaps/includes/modules you will notice that each module’s filename is similar to its corresponding sitemap. For example post.php is used to build all post type based sitemaps and taxonomy.php is used to build all taxonomy based sitemaps. It’s important to understand that a module can be either a parent or a child module. A child module will be used first, and if it is not found, the parent module will be used instead. So if you have a custom sitemap named post_most_popular.xml, this plugin will request post_most_popular.php first, and if that fails, it will request post.php.

You might ask: “If I add more modules, what will happen when this plugin gets updated?” Don’t worry, you have the option to choose a custom module folder that takes precedence over the module folder that comes with this plugin. Simply speaking, this plugin will look for modules in the custom folder first, and when it can’t find the requested module file there, it will look in the default folder, in the exact same way described above.
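The lookup order can be sketched in plain PHP. This is one reading of the description (child then parent inside the custom folder, then child then parent inside the default folder); the function name and parameters are hypothetical, and the plugin's real implementation may differ.

```php
<?php
// Hypothetical sketch of the lookup order (not plugin code).
// Returns the first module file that exists, or null when none is found.
function gxs_example_locate_module($module, $sub_module, $custom_dir, $default_dir)
{
    // Child module first (e.g. post_most_popular.php), then its parent (post.php).
    $names = array();
    if ('' !== $sub_module) {
        $names[] = $module . '_' . $sub_module . '.php';
    }
    $names[] = $module . '.php';

    // The custom folder takes precedence over the folder shipped with the plugin.
    foreach (array($custom_dir, $default_dir) as $dir) {
        if ('' === $dir) {
            continue; // no custom folder configured
        }
        foreach ($names as $name) {
            $path = rtrim($dir, '/') . '/' . $name;
            if (is_file($path)) {
                return $path;
            }
        }
    }
    return null;
}

// Usage with a throwaway "default" folder that only contains post.php.
$default = sys_get_temp_dir() . '/gxs_default_' . uniqid();
mkdir($default);
touch($default . '/post.php');
echo basename(gxs_example_locate_module('post', 'movie', '', $default)), "\n"; // post.php
unlink($default . '/post.php');
rmdir($default);
```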

By this point I believe you have a better understanding of what a module is and how it operates. How about we get to the real thing now ;)?

Basic API functions

If you do not like the default sitemapindex, you can always add or remove sitemaps (or modules) from it. Adding or removing a module is easy, using two basic module API functions, namely add_module() and remove_module(), respectively, like so:

add_action('bwp_gxs_modules_built', 'bwp_gxs_add_modules');
function bwp_gxs_add_modules()
{
	global $bwp_gxs;
	$bwp_gxs->add_module('post', 'most popular');
}

In the above snippet, I’m adding a new sub-module named ‘most popular’ to the built-in module ‘post’. Here’s a list of built-in modules:

post
page
taxonomy
archive

and built-in sub-modules:

archive_monthly
archive_yearly
taxonomy_category
taxonomy_post_tag

Please note that the name you use for any module can only contain alphanumeric characters plus hyphens, underscores, and spaces. Spaces will be converted to underscores anyway, so it’s best to just use underscores (spaces are still allowed because some people might find them easier to work with). So now you have a new module named post_most_popular, what are you supposed to do? Just make a custom module folder and then put a new module file named post_most_popular.php there, with your own code, of course. For the sake of simplicity, post_most_popular is actually included with this plugin as a sample module, so after you add the module, you should see the contents of post_most_popular.xml right away (FYI: it lists posts with at least 2 comments, ordered by comment_count and post_modified).
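The naming rule can be expressed as a small check in plain PHP. The helper below is hypothetical and only mirrors the rule as described here (allowed characters, spaces turned into underscores); it is not the plugin's own validation code.

```php
<?php
// Hypothetical helper (not plugin code): validate a module name and
// convert spaces to underscores.
function gxs_example_normalize_module_name($name)
{
    // Only alphanumeric characters, hyphens, underscores and spaces are allowed.
    if (!preg_match('/^[A-Za-z0-9_\- ]+$/', $name)) {
        return false;
    }
    return str_replace(' ', '_', $name);
}

echo 'post_' . gxs_example_normalize_module_name('most popular'), "\n"; // post_most_popular
var_dump(gxs_example_normalize_module_name('most/popular'));            // bool(false)
```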

If you are adding a parent module, you will also have to add a new rewrite rule. For example if you want a parent module named ‘most_popular’ (instead of a sub-module like ‘post_most_popular’), you will also need this:

add_filter('bwp_gxs_rewrite_rules', 'add_rewrite_rules');
function add_rewrite_rules()
{
	$my_rules = array(
		'popular\.xml' => 'index.php?gxs_module=most_popular'
	);
	return $my_rules;
}

You can have popular.xml, or most_popular.xml, or anything you see fit. After you have added the above snippet, make sure you pay a visit to your Permalink Settings and click Save Changes so that WordPress will recognize your new rewrite rules. Otherwise, you will be greeted with a 404 error when you try to visit http://yourdomain.com/popular.xml.

Now if you want to remove a module, the process is similar:

add_action('bwp_gxs_modules_built', 'bwp_gxs_remove_modules');
function bwp_gxs_remove_modules()
{
	global $bwp_gxs;
	// This will remove all modules that have 'taxonomy' as their parent
	$bwp_gxs->remove_module('taxonomy');
	// This will remove 'taxonomy_post_tag' only
	$bwp_gxs->remove_module('taxonomy', 'post_tag');
}

Keep in mind that your sitemap’s name, your module’s name and your module’s filename must be the same when you add a new module.

Advanced API functions

For this section I will take the module file post.php (which is documented rather thoroughly) as an example so you will be able to learn all the advanced API functions easily.

Basically, developing a module from scratch involves three steps:

Step 1: Initialize required properties for the module class

Before you can initialize anything, you must define your class, and the class’ name must start with BWP_GXS_MODULE_, followed by the module’s name. For example the module post will have BWP_GXS_MODULE_POST as its class name. And of course, to use the module API, you have to extend the base module class, like so:

<?php
/**
 * Some info about you and the module here would be nice
 */

class BWP_GXS_MODULE_POST extends BWP_GXS_MODULE {

	function __construct()
	{
	}

	function generate_data()
	{
	}
}
?>

The body of your class is currently empty, but this is the expected structure for a module file.

Now it’s time to decide what this module will do, and what it will need to build data. The idea is that we will use this module to display all kinds of post type sitemaps, for example post.xml, post_movie.xml, etc. So what this module needs is the sub-module (to know what post type is being requested), and if the sub-module is not found, we display the default post type, which is ‘post’. Easy to understand, eh?

For that reason, you might need a property named $requested, which will hold the currently requested post type. It is recommended that you assign values to all properties inside the __construct() function (a function that gets called when the module class is initialized). Also, in the __construct() function, it is a good idea to call $this->set_current_time();, which simply sets $this->now to the current Unix timestamp. $this->now is used a great deal in calculating priority and change frequency, and will be set anyway if you don’t set it yourself.

To assign your property with the currently requested module, do this:

function __construct()
{
	global $bwp_gxs;
	$this->set_current_time();
	$this->requested = $bwp_gxs->module_data['sub_module'];
}

$bwp_gxs->module_data is explained clearly in the code; just take a look if you’re curious about what it is. Otherwise, we move on to step 2.

Step 2: Build the actual data

After all necessary properties are assigned with expected values, always use $this->build_data() inside the construct function to start building your data. Next you will have to define your builder function, which can be either build_data() or generate_data(). Why the heck are there two similar functions for just one task? If you remember the SQL Cycling feature I talked about earlier, you will understand why we have two options here.

Simply speaking, the build_data() function ignores SQL Cycling while the generate_data() function allows you to make use of SQL Cycling. build_data() is recommended when you’re developing modules for sitemaps that do not have many items, which of course do not require SQL Cycling at all. generate_data() should be used in the opposite situation. Since there might be a lot of posts for a website/blog, for post.php we will use generate_data().

When you use generate_data(), you will have to query for posts using two DB API functions, namely $this->get_results() and $this->query_posts(). As you might have guessed, they’re no different from the two functions provided by WordPress: $wpdb->get_results() and query_posts(). The same parameters and syntax apply. Remember to always escape your query string with either $wpdb->escape() or $wpdb->prepare(), as shown in the actual code:

// A standard custom query to fetch posts from database, sorted by their lastmod
// You can use any type of queries for your modules
$latest_post_query = '
			SELECT * FROM ' . $wpdb->posts . "
				WHERE post_status = 'publish' AND post_type = %s" . '
			ORDER BY post_modified DESC';
// Use $this->get_results instead of $wpdb->get_results, remember to escape your query
// using $wpdb->prepare or $wpdb->escape
$latest_posts = $this->get_results($wpdb->prepare($latest_post_query, $requested));

Now you’ve got the $latest_posts data set that contains all information about your posts. It would be pointless to continue the loop if the query returns nothing, so it is a good idea to have a simple check like below:

// This check helps you stop the cycling sooner
// It basically means if there is nothing to loop through anymore we return false so the cycling can stop.
if (!isset($latest_posts) || 0 == sizeof($latest_posts))
	return false;

This snippet makes sure things are stopped correctly, and you won’t run into an endless loop somehow (should not happen, though).

Building each item is straightforward:

// Always init your $data
$data = array();
for ($i = 0; $i < sizeof($latest_posts); $i++)
{
	$post = $latest_posts[$i];
	// Init your $data with the previous item's data. This makes sure no item is mal-formed.
	$data = $this->init_data($data);
	$data['location'] = $this->get_permalink();
	$data['lastmod'] = $this->format_lastmod(strtotime($post->post_modified));
	$data['freq'] = $this->cal_frequency($post);
	$data['priority'] = $this->cal_priority($post, $data['freq']);
	$this->data[] = $data;
}

$this->init_data() allows you to init the current item with previous item’s data (except for the location of course). This is to make sure we don’t miss any item. This function takes one parameter: $data.

$this->format_lastmod() allows you to format an integer Unix timestamp into a GMT timestamp that is supported by sitemaps. This function takes one parameter: an integer Unix timestamp, e.g. 1302902897.

$this->cal_frequency() allows you to calculate change frequency based on item’s last modified time. This function takes two parameters: $post object, and last modified date (optional, used only when you can’t have a $post object).

$this->cal_priority() allows you to calculate priority based on item’s freshness, comment count, and change frequency. This function takes two parameters: $post object, and the current item’s change frequency (should be $data['freq']).
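To show what formatting a Unix timestamp for a sitemap involves, here is a minimal stand-in written in plain PHP. It is not the plugin's format_lastmod(): the function name is hypothetical, and the W3C Datetime form chosen here is an assumption, since the plugin may emit a different but equally valid form.

```php
<?php
// Hypothetical stand-in for a lastmod formatter (not the plugin's method).
function gxs_example_format_lastmod($unix_timestamp)
{
    // gmdate() formats in GMT/UTC regardless of the server's timezone setting.
    return gmdate('Y-m-d\TH:i:s+00:00', (int) $unix_timestamp);
}

echo gxs_example_format_lastmod(1302902897), "\n"; // 2011-04-15T21:28:17+00:00
```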

Step 3: Pass the built data back

To pass the data you just built using your module back to the plugin, simply use $this->data[] = $data; at the end of each loop, like so:

for ($i = 0; $i < sizeof($latest_posts); $i++)
{
	// ... build data
	$this->data[] = $data;
}

Since we’re still using SQL Cycling, you will have to add this at the end of generate_data():

return true;

This tells the module to continue its cycling process. Otherwise, the module will only loop one time.
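
To make the role of the return value more concrete, here is a small self-contained simulation of the cycling idea. Everything in it (the Example_Cycling_Module class, its properties, and the way build_data() loops) is a hypothetical stand-in written for this illustration, not the plugin's real base class: it simply keeps calling generate_data() for as long as that function returns true.

```php
<?php
// Hypothetical simulation of cycling; not the plugin's actual module class.
class Example_Cycling_Module
{
	public $data = array();
	private $rows;
	private $limit;
	private $offset = 0;

	public function __construct(array $rows, $limit)
	{
		$this->rows  = $rows;
		$this->limit = $limit;
	}

	// Handles one batch of rows. Returning true asks for another cycle.
	public function generate_data()
	{
		$batch = array_slice($this->rows, $this->offset, $this->limit);
		if (0 == count($batch))
			return false; // nothing left, stop cycling

		foreach ($batch as $row)
			$this->data[] = $row;

		$this->offset += $this->limit;
		return true;
	}

	// Keeps cycling until generate_data() stops returning true.
	public function build_data()
	{
		while (true === $this->generate_data())
		{
			// each pass processes the next batch
		}
	}
}

$module = new Example_Cycling_Module(array('a', 'b', 'c', 'd', 'e'), 2);
$module->build_data();
echo count($module->data), "\n"; // 5
```

If generate_data() in this sketch returned nothing after the first batch, the loop would stop after a single pass and only the first two rows would be collected.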

Step 4: (Surprise!) Visit the configuration page and browse to your newly created sitemap. Make sure you enable debug mode so that no errors are hidden. If you encounter something like ‘XML Parsing Error’ or ‘Content Encoding Error’, just go to your module file and place exit; after $this->build_data();, like so:

$this->build_data(); exit;

And then debug the module normally.

That’s it! Congratulations on your first module!

Create another sitemapindex

Creating a custom sitemapindex is similar to creating a custom sitemap. First you will have to add a new module, for example:

add_action('bwp_gxs_modules_built', 'bwp_gxs_add_modules');
function bwp_gxs_add_modules()
{
	global $bwp_gxs;
	$bwp_gxs->add_module('mysitemapindex');
}

Then, similar to adding a parent module, you must add a new rewrite rule, like so:

add_filter('bwp_gxs_rewrite_rules', 'add_rewrite_rules');
function add_rewrite_rules()
{
	$my_rules = array(
		'mysitemapindex\.xml' => 'index.php?gxs_module=mysitemapindex'
	);

	return $my_rules;
}

Make sure you flush all rewrite rules by visiting Settings → Permalinks and pressing Save Changes.
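
If you are wondering why the dot in the rule is escaped, the following standalone sketch may help. It is only an approximation of how a regex-based rule gets matched against a requested path; example_resolve_request() is a hypothetical helper, and WordPress's real rewrite handling is more involved than this.

```php
<?php
// Hypothetical helper: returns the target of the first rule whose pattern
// matches the whole requested path, or null when no rule matches.
function example_resolve_request($path, array $rules)
{
	foreach ($rules as $pattern => $target)
	{
		if (preg_match('#^' . $pattern . '$#', $path))
			return $target;
	}

	return null;
}

$my_rules = array(
	'mysitemapindex\.xml' => 'index.php?gxs_module=mysitemapindex'
);

// The escaped dot only matches a literal ".", so this path resolves...
var_dump(example_resolve_request('mysitemapindex.xml', $my_rules));
// ...while this one does not (an unescaped dot would have matched "-").
var_dump(example_resolve_request('mysitemapindex-xml', $my_rules));
```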

Next, in mysitemapindex.php, you will need to add one more line to the construct function:

function __construct()
{
	$this->type = 'index';
	// place other properties
	// ...
	$this->build_data();
}

Since a sitemapindex’s item does not need priority or change frequency, in your builder function, be it build_data() or generate_data(), make sure you use something like this:

$data = array();
foreach ($items as $item)
{
	$data = $this->init_data($data);
	$data['location'] = $this->get_xml_link($slug);
	$data['lastmod'] = $this->format_lastmod($int_timestamp);
	$this->data[] = $data;
}

$this->get_xml_link() is yet another API function that will get the correct sitemap URL for you. It accepts one parameter: the sitemap slug, which is expected to be the same as your module’s name (i.e. ‘mysitemapindex’ in the above example).
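
To give an idea of the kind of URL such a function returns, here is a minimal sketch. example_xml_link() and its parameters are hypothetical and are not the plugin's implementation; the two URL shapes are the ones this documentation uses for pretty and non-pretty permalinks.

```php
<?php
// Hypothetical illustration of building a sitemap URL from a slug.
// $pretty_permalinks and $query_var are assumptions made for this sketch.
function example_xml_link($home_url, $slug, $pretty_permalinks, $query_var = 'bwpsitemap')
{
	$home_url = rtrim($home_url, '/');

	if ($pretty_permalinks)
		return $home_url . '/' . $slug . '.xml';

	return $home_url . '/?' . $query_var . '=' . urlencode($slug);
}

echo example_xml_link('http://example.com', 'mysitemapindex', true), "\n";
// http://example.com/mysitemapindex.xml
echo example_xml_link('http://example.com', 'mysitemapindex', false), "\n";
// http://example.com/?bwpsitemap=mysitemapindex
```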

Now try browsing to http://example.com/mysitemapindex.xml and you should see your new sitemapindex ready to be crawled!

Other Notes

URLs to all generated sitemaps are affected by the current permalink settings on your website/blog. If you don’t use pretty permalinks, your sitemaps’ URLs will look similar to http://example.com/?bwpsitemap=module. For example, your sitemapindex will become http://example.com/?bwpsitemap=sitemapindex. Search engines can crawl such URLs just fine, and this plugin should be able to change all sitemap URLs for you every time you change your permalink settings. Please note, however, that cached sitemaps won’t be updated when you change permalink settings; you will have to manually flush the cache, or simply wait for them to be refreshed.

In addition, if you don’t like the word bwpsitemap, you can change that using a filter. Refer to the Hook References section for more details.
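
A minimal sketch of such a filter is shown below. It assumes that bwp_gxs_query_var_non_perma passes the current query var to the callback and uses whatever string the callback returns; the callback name example_gxs_query_var() and the value 'mysitemap' are made up for this example.

```php
<?php
// Callback for the bwp_gxs_query_var_non_perma filter.
// Assumption: the filter passes the current query var and expects the new one back.
function example_gxs_query_var($query_var)
{
	return 'mysitemap';
}

// Register only when WordPress is loaded, so this file can also run on its own.
if (function_exists('add_filter'))
	add_filter('bwp_gxs_query_var_non_perma', 'example_gxs_query_var');
```

With this in place, and under the assumption above, a non-pretty sitemap URL should look similar to http://example.com/?mysitemap=sitemapindex.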

Known Issues

All sitemaps are dynamically generated so there are no actual sitemaps created in your website’s root (except for those in the cache folder). This could lead to a very common error called ‘Content Encoding Error’, i.e. the content of an XML sitemap is corrupted. This might be caused by minor bugs in this plugin itself or bugs in other plugins or themes. If you encounter such an error, please visit the FAQ section for some possible solutions.

To-do List

  • Add Image sitemap (1.2.3)
  • Add Video sitemap (1.2.3)
  • Review X-Robots tags (1.2.2)
  • Support custom taxonomies for news sitemap (1.3.x)

Hook References

  • bwp_gxs_query_var_non_perma – Used to change the default bwpsitemap query var when pretty permalinks are not enabled. (filter)
  • bwp_gxs_xslt – Used to define the custom XSLT style sheet’s URL (filter)
  • bwp_gxs_module_dir – Used to define the custom module folder (filter)
  • bwp_gxs_module_mapping – Used to map a module to another module, for example you can map post_format to post_tag. This will be explained in more detail later. (filter)
  • bwp_gxs_rewrite_rules – Used to define your own rewrite rules. This should be used when you add a custom sitemapindex. Example above. (filter)
  • bwp_gxs_post_where – Allows you to filter the ‘where’ part in post modules’ queries. (filter, additional variable: $post_type – the currently requested post_type)
  • bwp_gxs_term_exclude – Allows you to exclude certain terms for a specific taxonomy. (filter, additional variable: $taxonomy – the currently requested taxonomy)
  • bwp_gxs_freq – Allows you to use your own algorithm to calculate change frequency. (filter, additional variable: $post object)
  • bwp_gxs_priority_score – Allows you to use your own algorithm to calculate priority. (filter, additional variables: $post object, $freq – calculated change frequency)
  • bwp_gxs_news_name – Allows you to set a custom site name for your news sitemap without having to change the site name setting inside WordPress. (filter)
  • bwp_gxs_news_keyword_map – Allows you to map your categories in your language to Google News’s suggested categories in English. (filter)
  • bwp_gxs_modules_built – Fires after all default modules are defined. Use this to add or remove modules from the default sitemapindex. (action)
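
As an example of using one of these hooks, the sketch below overrides the change frequency through bwp_gxs_freq. It is written under the assumption that the filter passes the calculated frequency first and the $post object second; the callback name, the one-week threshold, and the optional $now parameter (there only to make the function easy to try out) are all made up for this illustration.

```php
<?php
// Callback for the bwp_gxs_freq filter.
// Assumption: $freq is the plugin's calculated value and $post is the post object.
function example_gxs_freq($freq, $post = null, $now = null)
{
	// Without a usable post, keep whatever the plugin calculated.
	if (!is_object($post) || empty($post->post_modified))
		return $freq;

	$now = (null === $now) ? time() : $now;
	$age = $now - strtotime($post->post_modified);

	// Anything modified within the last week is reported as 'daily'.
	if ($age <= 7 * 86400)
		return 'daily';

	return $freq;
}

// Register only when WordPress is loaded, so this file can also run on its own.
if (function_exists('add_filter'))
	add_filter('bwp_gxs_freq', 'example_gxs_freq', 10, 2);
```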

Contribute to this Plugin

This plugin is licensed under GPL version 3, and it needs contributions from the community.

Buy me some special coffees!

My plugins and support for them are free. If you like my work and could buy me some (special) coffees, I would much appreciate it! They might help with some overnight sessions debugging my plugins, you know.

Module Submission

You can help the development of this plugin by either:

  • Making a cool module and submitting it!
  • Improving a default module and submitting it (if you know how to use Git, also check the Git Repository below).

Support, Feedback, and Code Improvement

i18n (Translate the plugin)

If you are a translator, please help translate this plugin. Even if you aren't, you can become one; it is very easy and fun! If you want to know how, please read here: Create a .pot or .po File using Poedit.
