WordPress Query Overview: How a Page Request Is Translated To a MySQL Query

This article comes to you courtesy of Ozh, one of our favorites in the WordPress community. Read a little bit about him below and enjoy this excellent tutorial.

Ozh has been using and hacking WordPress since May 2004, starting with version 1.0.1, and released his first WordPress plugin more than 6 years ago. His passion for plugins recently took shape as a dedicated book written with Brad Williams and Justin Tadlock: Professional WordPress Plugin Development will hit the shelves in March 2011. If you like this article, you will love the book!

Displaying a page on a WordPress site is a complicated task. Oh, not from the user’s point of view, unless typing in a URL and pressing Enter (or even following a link) seems complicated, of course. But think about the poor little server getting whipped with that simple order: “Display the page http://mysite.com/2010/12/24/merry-xmas/!” The problem is, the server speaks only PHP and MySQL, so finding that particular post will be a difficult job. Let’s find out how difficult it’ll be!

WordPress “rewrites” URLs

When a client (i.e. a user’s browser) requests a page on your WordPress-powered site, some black magic occurs: the URL http://mysite.com/2010/12/24/merry-xmas/ doesn’t match an actual path on the web server (there’s no /2010 folder with a /12 sub-directory and so on), which means that at some point the requested URL has been rewritten by the server.

As you may already know, WordPress URLs have ugly equivalent forms, like http://mysite.com/?p=1337. While more programmatically obvious (we easily guess that this page will be rendered by fetching the post with ID 1337), these URLs are less search engine and user friendly. (If you happen to be completely new to this subject, read the Codex article about Pretty Permalinks right now.)

Let’s dissect how the rewriting magic occurs in WordPress, from a page request to a proper MySQL query.

One file to serve them all

When you install WordPress, it creates in the root folder a file that is essential to the rewriting process: the .htaccess file. This file consists of a rewrite directive that basically tells the server the following: if the request %{REQUEST_FILENAME} is not a file (line 6, !-f), and if it is also not a directory (line 7, !-d), then hand it over to index.php and let it manage everything (line 8, which will be the last rule followed).
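To make that decision concrete, here is a minimal sketch of the same logic in Python (not PHP, and not how Apache actually implements it). The function name resolve is made up for illustration, and the .htaccess block quoted in the comments is the one a standard install typically writes, reproduced here as an assumption; yours may differ.

```python
import os

# For reference, the block WordPress typically writes (an assumption based
# on a standard install; the line numbers are the ones used in the text):
#    1  # BEGIN WordPress
#    2  <IfModule mod_rewrite.c>
#    3  RewriteEngine On
#    4  RewriteBase /
#    5  RewriteRule ^index\.php$ - [L]
#    6  RewriteCond %{REQUEST_FILENAME} !-f
#    7  RewriteCond %{REQUEST_FILENAME} !-d
#    8  RewriteRule . /index.php [L]
#    9  </IfModule>
#   10  # END WordPress


def resolve(document_root, request_path):
    """Return the path the server ends up serving for request_path."""
    candidate = os.path.join(document_root, request_path.lstrip("/"))
    # Lines 6 and 7: an existing file or directory is served as-is.
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return request_path
    # Line 8: everything else is handed over to index.php.
    return "/index.php"
```

A stylesheet or an uploaded image that really exists on disk is served directly, while a pretty permalink such as /2010/12/24/merry-xmas/ falls through to index.php.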

So, once index.php starts working, what happens?

The WordPress Include Flow

The file index.php actually does very little: it defines a constant stating that we’ll be using a theme, then starts the flow of file includes. Reading the source from that point is interesting: it shows how WordPress instantiates everything that will eventually be needed, and which variables and constants are defined along the way.

The initial include sequence is simple and straightforward: index.php → wp-blog-header.php → wp-load.php → wp-config.php → wp-settings.php

In wp-settings.php serious things can start: it defines a bunch of, you guessed it, settings, includes all files except the pluggable functions from the wp-includes directory, includes all the active plugins and then loads the pluggable functions.

At the end of wp-settings.php, WordPress is almost fully loaded (it still needs the theme parts) but it doesn’t know yet what to display.

Back to wp-blog-header.php: the function wp(), a wrapper for WP::main(), starts the parsing process with WP::parse_request().

Parsing the request

The function parse_request() will first fetch all the registered rewrite rules by calling $wp_rewrite->wp_rewrite_rules(). Enter the mighty and often feared Rewrite API.

The Rewrite API defines rules as a list of pattern => replacement pairs, such as for instance the following one: [([0-9]{4})/([0-9]{1,2})/([^/]+)(/[0-9]+)?/?$] => index.php?year=$matches[1]&monthnum=$matches[2]&name=$matches[3]&page=$matches[4]

This rather cryptic pattern will match the following sequence: 4 digits (part 1), a slash, 1 or 2 digits (part 2), a slash, one or more characters other than a slash (part 3), then maybe another slash followed by more digits (part 4), and why not a slash at the end (part 5). Hey, something like 2010/12/merry-xmas/ will actually match this, lucky us!
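To see the pattern at work, here is a minimal sketch in Python (not PHP, and not WordPress code) that runs the pattern quoted above against a request path and returns the captured parts. The function name split_permalink is made up for illustration, and anchoring the match at the start of the path is an assumption about how the rule gets applied.

```python
import re

# The pattern from the rewrite rule above, copied verbatim.
PATTERN = re.compile(r"([0-9]{4})/([0-9]{1,2})/([^/]+)(/[0-9]+)?/?$")


def split_permalink(request_path):
    """Return the captured groups, or None if the path does not match."""
    match = PATTERN.match(request_path)  # match() anchors at the start
    return match.groups() if match else None


print(split_permalink("2010/12/merry-xmas/"))
# ('2010', '12', 'merry-xmas', None)
print(split_permalink("2010/12/merry-xmas/2/"))
# ('2010', '12', 'merry-xmas', '/2')
print(split_permalink("about/"))
# None
```

The fourth group is optional, so it comes back empty (None in Python) when the URL carries no page number.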

Once WordPress knows all the rewrite rules, it matches each one against the currently requested URL until one matches. When that happens, WordPress is able to fill in the blanks in the replacement part of the rewrite rule. If there is no match, a 404 error is triggered.

In our example, WordPress now understands that the request is translated to index.php?year=2010&monthnum=12&name=merry-xmas&page=. Things are starting to look programmatically more obvious now, aren’t they?
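The matching loop can be sketched in a few lines. This is a simplified Python model, not WordPress’s PHP: the rule list is a hypothetical, much-shortened one (a real site has dozens of rules), and translate is a name invented for this example.

```python
import re

# Hypothetical, much-shortened rule list of (pattern, replacement) pairs.
REWRITE_RULES = [
    (r"category/(.+?)/?$", "index.php?category_name=$matches[1]"),
    (r"([0-9]{4})/([0-9]{1,2})/([^/]+)(/[0-9]+)?/?$",
     "index.php?year=$matches[1]&monthnum=$matches[2]"
     "&name=$matches[3]&page=$matches[4]"),
]


def translate(request_path):
    """Return the rewritten query string, or None when no rule matches (404)."""
    for pattern, replacement in REWRITE_RULES:
        match = re.match(pattern, request_path)
        if match is None:
            continue  # try the next rule

        # Replace each $matches[n] placeholder with the n-th captured group;
        # a group that did not take part in the match becomes an empty string.
        def fill(placeholder):
            return match.group(int(placeholder.group(1))) or ""

        return re.sub(r"\$matches\[(\d+)\]", fill, replacement)
    return None


print(translate("2010/12/merry-xmas/"))
# index.php?year=2010&monthnum=12&name=merry-xmas&page=
```

The first rule that matches wins, which is why the order in which rules are registered matters.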

Populating the query vars

The second task of function parse_request() is to obtain and populate the list of registered query variables. Query variables are passed via permalink parsing (in our case for instance, year is 2010), via GET or via POST submission, and are eventually saved into the $wp->query_vars array.
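As a rough illustration, this step can be modelled as a filter: only registered variables survive, and each one is taken from the first source that provides it. The sketch below is Python, not WordPress code; the list of variables is a hypothetical subset, the function name is invented, and the precedence shown (POST, then GET, then the permalink) is an assumption rather than something stated above.

```python
# Hypothetical subset of the registered (public) query variables.
PUBLIC_QUERY_VARS = ["year", "monthnum", "day", "name", "page", "p", "s"]


def populate_query_vars(permalink_vars, get_vars, post_vars):
    """Keep registered variables only, each from the first source that has it."""
    query_vars = {}
    for name in PUBLIC_QUERY_VARS:
        # Assumed precedence: POST, then GET, then the parsed permalink.
        for source in (post_vars, get_vars, permalink_vars):
            if name in source:
                query_vars[name] = source[name]
                break
    return query_vars


permalink = {"year": "2010", "monthnum": "12", "name": "merry-xmas", "page": ""}
print(populate_query_vars(permalink, {"unregistered": "1"}, {}))
# {'year': '2010', 'monthnum': '12', 'name': 'merry-xmas', 'page': ''}
```

Anything not on the registered list, like the made-up unregistered parameter here, is silently ignored.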

Fetching stuff from the DB

Now that parsing is done, the function WP::main() sends a few HTTP headers (the content-type, a “404 Not Found” if the request is an error, those things) and can finally start fetching data from the database.

After sending headers, WP::main() fires WP::query_posts() which will in turn trigger WP_Query::query().

This function will first parse_query(): WordPress knows the query variables defined during the last step (in our case, year, monthnum, name and page), so it’s now going to sanitize values and fill in the blanks (it’s not a trackback, not an author archive, it is a single post, etc.).
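Here is a heavily simplified sketch of what “sanitize and fill in the blanks” can mean. It is Python rather than PHP and covers only three flags; the flag names mirror WP_Query properties, but the rules used to derive them are an illustrative assumption, far cruder than the real thing.

```python
def parse_query(query_vars):
    """Sanitize a few values and derive conditional flags from them."""
    qv = dict(query_vars)
    # Numeric variables are cast to integers; anything else falls back to 0.
    for key in ("year", "monthnum", "day", "page", "p"):
        raw = str(qv.get(key, "")).strip()
        qv[key] = int(raw) if raw.isdigit() else 0
    qv["name"] = str(qv.get("name", "")).strip()

    # A post slug or a post ID means a single post is being requested.
    flags = {"is_single": bool(qv["name"] or qv["p"]),
             "is_date": False,
             "is_home": False}
    if not flags["is_single"]:
        # No single post: either a date archive or, failing that, the home page.
        flags["is_date"] = bool(qv["year"] or qv["monthnum"] or qv["day"])
        flags["is_home"] = not flags["is_date"]
    return qv, flags


qv, flags = parse_query({"year": "2010", "monthnum": "12",
                         "name": "merry-xmas", "page": ""})
print(flags)
# {'is_single': True, 'is_date': False, 'is_home': False}
```

The empty page value becomes 0, and the presence of a post name is enough to decide that a single post is wanted.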

Then, WordPress will finally get_posts(): now that it knows everything about the page we’re trying to display, it is able to perform the MySQL query that will fetch the post object.

This last function is rather long and complicated because it has a lot to do: figure out user permissions and capabilities, make sure the post is queryable and viewable, deal with the intricacies of tags and various taxonomies and so on, to eventually fine tune the MySQL query.
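To give a feel for that fine-tuning, here is a small Python sketch (not WordPress code) that assembles a query from sanitized query variables. The function name is invented, the column names are those of the standard posts table, and the sketch uses %s placeholders instead of WordPress’s own escaping. It ignores permissions, taxonomies and everything else the real function deals with, so treat the resulting SQL as an approximation of the general shape only.

```python
# Query variable name -> MySQL date function applied to post_date.
DATE_PARTS = (("year", "YEAR"), ("monthnum", "MONTH"), ("day", "DAYOFMONTH"))


def build_post_query(qv, table="wp_posts"):
    """Return (sql, params) selecting published posts by date parts and slug."""
    where = ["1=1"]
    params = []
    for var, sql_function in DATE_PARTS:
        if qv.get(var):  # skip parts that are missing or 0
            where.append(f"{sql_function}({table}.post_date) = %s")
            params.append(int(qv[var]))
    if qv.get("name"):
        where.append(f"{table}.post_name = %s")
        params.append(qv["name"])
    # Only published posts of type "post" are considered in this sketch.
    where.append(f"{table}.post_type = %s")
    params.append("post")
    where.append(f"{table}.post_status = %s")
    params.append("publish")
    sql = (f"SELECT {table}.* FROM {table} WHERE " + " AND ".join(where)
           + f" ORDER BY {table}.post_date DESC")
    return sql, params


sql, params = build_post_query(
    {"year": 2010, "monthnum": 12, "day": 24, "name": "merry-xmas"})
print(sql)
print(params)
```

Each query variable that is present adds one condition, which is how a handful of URL parts turn into a long WHERE clause.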

In our case, requesting the page http://mysite.com/2010/12/24/merry-xmas/ translates into this longish and scary SQL query:

Mission accomplished! Oh, one little task left to do while we’re at it: display information since we haz it.

Loading the theme

The last job of wp-blog-header.php is to load the aptly named template loader. This file will simply get the appropriate template file, if available and according to the template hierarchy. In our case, this will translate into loading and executing single.php from the theme directory, where most likely a loop will display data from the post object in a somewhat pretty and readable fashion :)
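The idea behind the template hierarchy is a simple “first file that exists wins” lookup. The sketch below shows it in Python (not WordPress code); the function name and the two-entry candidate list are assumptions chosen for illustration, and the real hierarchy has more levels.

```python
import os


def find_template(theme_dir, candidates):
    """Return the first candidate template that exists in theme_dir, else None."""
    for filename in candidates:
        path = os.path.join(theme_dir, filename)
        if os.path.isfile(path):
            return path
    return None


# Assumed fallback order for a single post: most specific first, index.php last.
SINGLE_POST_CANDIDATES = ["single.php", "index.php"]
```

With a theme that ships single.php, calling find_template(theme_dir, SINGLE_POST_CANDIDATES) returns that file; a bare-bones theme with only index.php still renders the post, just with the generic template.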

All done

So, that was a pretty long and complicated path from the initial page request to the final SQL query, wasn’t it? The beauty of all this is that, as usual, WordPress allows plugins and themes to interfere with the process along the way: the Rewrite API for instance will allow you to create custom rewrite rules and define the associated query variables. But that’s another story :)
