So, this blog (for its sins) is running on WordPress 2.0.5. That’s a bit out-of-date. The main reason is because it has all sorts of jiggery-pokery to make it work the way I want – a tagging solution based on Jerome’s Keywords that was modified when I moved to 2.0; all sorts of template hacking to make the beautiful breadcrumb trail at the top you see work.

I’ve resisted upgrading due to the hell that was hacking plugins and templates into future versions of WordPress. Until now, that is. WordPress 2.3 (finally) introduces a proper tagging solution – entirely separate to the “categories” system. Well, not quite, as we’ll see – but it finally means that the architecture of Infovore.org is now entirely possible within WordPress itself.

Of course, now you’ve got to convert your custom tagging solution to the new schema. I’ve written a small script to do this for myself – only took about an hour, and that’s mainly because I was exploring the schema, and my PHP is a little rusty. Of course, now I know a reasonable amount about how tagging is implemented in WordPress 2.3, and felt I should write this up properly, so that anybody else converting custom tagging solutions might save themselves some time.

There are three tables that comprise the WordPress 2.3 tagging solution: wp_terms, wp_term_taxonomy, and wp_term_relationships (assuming your WordPress prefix is wp_). They look like this:


wp_terms
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| term_id    | bigint(20)   | NO   | PRI | NULL    | auto_increment | 
| name       | varchar(55)  | NO   |     |         |                | 
| slug       | varchar(200) | NO   | UNI |         |                | 
| term_group | bigint(10)   | NO   |     | 0       |                | 
+------------+--------------+------+-----+---------+----------------+

wp_term_taxonomy
+------------------+-------------+------+-----+---------+----------------+
| Field            | Type        | Null | Key | Default | Extra          |
+------------------+-------------+------+-----+---------+----------------+
| term_taxonomy_id | bigint(20)  | NO   | PRI | NULL    | auto_increment | 
| term_id          | bigint(20)  | NO   | MUL | 0       |                | 
| taxonomy         | varchar(32) | NO   |     |         |                | 
| description      | longtext    | NO   |     |         |                | 
| parent           | bigint(20)  | NO   |     | 0       |                | 
| count            | bigint(20)  | NO   |     | 0       |                | 
+------------------+-------------+------+-----+---------+----------------+

wp_term_relationships
+------------------+------------+------+-----+---------+-------+
| Field            | Type       | Null | Key | Default | Extra |
+------------------+------------+------+-----+---------+-------+
| object_id        | bigint(20) | NO   | PRI | 0       |       | 
| term_taxonomy_id | bigint(20) | NO   | PRI | 0       |       | 
+------------------+------------+------+-----+---------+-------+

Here’s the first thing you need to understand. These tables don’t just describe tags. They describe every kind of categorisation WordPress uses: tags, categories, link categories. That partly explains the complexity of the solution (although, personally, I think it’s over-engineered. More about that later).

So: wp_terms is the list of “terms” (be they tags, categories, link categories, or anything else that might be added in the future). There’s an auto-incremented id, the name of the term (eg “wordpress” or “blogging software” – term names can have spaces), a url-friendly slug (eg “wordpress”, “blogging-software” and a “term group” (which I haven’t found the necessity for yet).

wp_term_taxonomy describes where a particular term fits within the taxonomy. The taxonomy field defines the taxonomy the term belongs to – “post_tag”, “link_category”, etc. You can see that, despite the added complexity, this makes the system more future-proof. description and parent fields allow for a description of the taxonomy entry (as link categories used to have) and nested categories, respectively. The count column is worth a note. It’s basically a counter-cache – counts how many times this particular classification has been used. Yes, you can get this out of MySQL directly… but the way WordPress 2.3 is built, this also needs to be updated. It’s used, specifically, for the tag cloud views.

Finally, wp_term_relationships links a particular usage of a term within a taxonomy to a “thing”. This is my least favourite table in the schema, mainly because the object that object_id refers to is not defined in this table; it’s determined by looking up what the taxonomy in wp_term_taxonomy is.

Why does that annoy me? Well, it means you could have two entries in this table, both with the same object_id and term_taxonomy_id, except they refer to different objects, because their term_taxonomy.term_taxonomy is different. That seems a little clumsy to me.

Anyhow, that’s how it works. To give an example: a term could be used as both a post_tag for several posts or as a link_category for several blogrolls. In this schema, that would result in one wp_terms entry, two entries in wp_term_taxonomy, and many entries in wp_term_relationships.

Does that make sense? I hope so. Having reverse engineered all this, it wasn’t too hard to write a short PHP script to transform my old database (where tagging info was stored in wp_postmeta) into a 2.3 compatible schema. And now that’s done, my theme is a lot more generic, useful, and less prone to breaking in future versions.

Phew. That was pretty dense! I hope it makes sense. There are other explanations of this new functionality, some from the core team, but I felt it’s always worth making sure there are enough examples in the world. Also, my perspective is one of avoiding the internal APIs and plugin architectures: I’m just trying to write a one-off script to fix some things. I hope this was useful.

3 comments on this entry.

  • James | 10 Sep 2007

    I’ve built a lot (>15) of WP 2.1 and 2.2 sites over the last year… now waiting for the deluge of queries as past clients upgrade, break things, trip over, become confused, weep a little. Ta for pointing the way through this stuff before I had to!

  • kksid | 31 Jan 2008

    Hi,

    Thanks for this very useful post.

    You said that you are annoyed with table wp_term_relationships because

    “..
    it means you could have two entries in this table, both with the same object_id and term_taxonomy_id, except they refer to different objects, because their term_taxonomy_id.taxonomy is different..

    I think you meant “term_taxonomy.taxonomy” here.

    Anyways, did you mean to say that *both* the object_id as well as the term_taxonomy_id are the same as another record in the table?

    If it is so, I don’t think what you said is correct, because the pair of object_id and term_taxonomy_id constitute the compound primary key of wp_term_relationships which rules out their duplication.

    Having said that, yes I agree that it seems to be a pretty convoluted way of doing things.

  • Tom | 31 Jan 2008

    Yes, you’re right, regarding the table names. I’ll correct that now.

    As regarding the compound primary key… hmn. That might explain a few things, but either way, it feels convoluted. Thanks for pointing that out.