Archive for the ‘mysql’ Category

InnoDB Primary Key Clustering In Action @ Flickr

Sunday, May 8th, 2011

The InnoDB clustered index is one of the most powerful, yet most poorly understood, optimizations.

The idea is that InnoDB will try to place items in primary key sort order together on disk. Thus, if your primary key is a numeric column called id, rows with id 1, 2, and 3 will be adjacent on disk, as will rows with id 21, 22, 23 etc. If you’re fetching rows that are adjacent when sorted on primary key, they will be adjacent on disk, resulting in sequential disk access. This sort of thinking is important when you make the decision to put on your big-boy pants and not just put your entire data set in memory.

The part that people don’t often think about is that composite primary keys follow the same rule. Tim Denike, Flickr’s DBA, demonstrates just how powerful the effect of clustered indexes can be:

The trick, as explained in the comments on the photo, is that most queries on Flickr end up being queries for multiple photos by the same owner (I suspect narcissism has a lot to do with that..) Thus, if your primary key is (owner_id, photo_id), the fact that owner_id is the prefix causes photos from the same owner to be placed together on disk.
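A minimal sketch of what such a schema might look like (table and column names are my own, not Flickr’s actual schema):

```sql
-- InnoDB stores rows in primary-key order, so with this key
-- all of an owner's photos end up physically adjacent on disk.
CREATE TABLE photos (
  owner_id bigint unsigned NOT NULL,
  photo_id bigint unsigned NOT NULL,
  title    varchar(255)    NOT NULL default '',
  PRIMARY KEY (owner_id, photo_id)
) ENGINE=InnoDB;
```

A query like SELECT * FROM photos WHERE owner_id = 12345 then becomes one short sequential read instead of a scatter of random seeks.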

Obviously, since not all the photos are written to the database at the same time, some fragmentation is inevitable in the real world; nonetheless, it’s clearly a huge performance win. Routine maintenance can effectively “defrag” the table: OPTIMIZE TABLE rebuilds an InnoDB table, rewriting all the rows in sequential primary-key order.
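For an InnoDB table the rebuild looks like this (photos being a stand-in table name):

```sql
-- Rebuilds the table and its indexes, rewriting the rows in
-- primary-key order; for InnoDB this is equivalent to
-- ALTER TABLE photos ENGINE=InnoDB.
OPTIMIZE TABLE photos;
```

Keep in mind that on the MySQL versions of this era the rebuild locks the table for its duration, so it’s a maintenance-window operation on big tables.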

InnoDB clustering is covered in great detail in Ch. 3 of “High Performance MySQL”.

Abusing MySQL: The Federated Engine

Monday, February 8th, 2010

I don’t have quite the experience that Kellan and Richard do with wrangling databases (yet), but I have seen some relatively unorthodox stuff. I’ll write a quick note about something quirky we did back when I worked on internal tools at Yahoo!

The Problem

We had one central database that contained a lot of the information about the company’s infrastructure (say, db_central). Among other things, it contained information about users, user groups, and inventory, for some definition of that word.

There were several other tools built around it that had their own databases, but still relied on some of the same data, particularly the user-related tables. We wanted to be able to do joins across the databases, but you can’t do that easily when the databases are on different physical boxes.

You could set up replication on the box that the auxiliary database is on (say, db_app1), but we couldn’t do that either – we were doing dual-master replication for HA, and a single MySQL instance can only slave from one host at a time.

Federated Engine

One of the advantages of being an internal team is the ability to stay on top of the ‘new hotness’. Since our datasets and userbases were always relatively small, we were able to upgrade frequently; we were on 5.0 and 5.1 fairly soon after they were released.

With 5.0 came the Federated Engine. It allows you to create a sort of shim for a table on a remote machine and access it as if it were a local table; most notably it allows you to join against the remote table.

Obviously, this sounds like a performance nightmare. We never tested it in a straightforward setup (you’ll see what I mean in a second), and it might have turned out OK for this particular use-case; but even at our relatively small size and low traffic, slow joins were a serious problem (at Yahoo!’s size, even the internal apps had multi-million row tables), and adding network latency on top of that was not something we were interested in.

The Prestige

This is where one of the (other) crazy Russian guys on the team thought of something awesome (can’t remember if it was Andrey or Alex.. neither has a blog, unfortunately).

We would set up an additional instance of MySQL on another port on each of the db_app1 masters (call them db_app1_plus). This instance would be a slave for db_central. Then, on the db_app1 instance, we would set up a Federated Engine table (aha!) that would point to the db_central replica on the db_app1_plus port over localhost. Though we never scientifically benchmarked this setup, in our limited testing it worked like a charm and performed beautifully.
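A sketch of what the shim might have looked like (all names, credentials, and ports invented for illustration):

```sql
-- On the main db_app1 instance (port 3306): a FEDERATED table
-- pointing at the db_central replica (db_app1_plus) running on
-- the same box on port 3307.
CREATE TABLE users (
  id   int unsigned NOT NULL,
  name varchar(64)  NOT NULL,
  PRIMARY KEY (id)
) ENGINE=FEDERATED
CONNECTION='mysql://app_user:secret@127.0.0.1:3307/db_central/users';
```

Every query against this local users table gets re-issued against the replica over the loopback interface, so joins against it never leave the machine.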

Of course, as I said before, this would not work for a high-traffic production setup. However, it did allow us to simplify the code in our internal apps (and you always want that code to be simple and readable) and did not cause any noticeable performance degradation or additional operational headaches. As far as I know, that setup is still in place.

I’ve recently come back to this idea for some uses at Flickr (mainly background data-mining jobs) and keep forgetting to talk to Kellan about it. Let’s see what he says :D

Pull random data out of MySQL without making it cry – using and optimizing ORDER BY, LIMIT, OFFSET etc.

Wednesday, September 23rd, 2009


This isn’t a rocket science post. Your world view will not be changed by it. It’s just a trick I came up with while pulling random data out of a database that gives a pretty good mixture of results without straining the server. The requirements were:


  • Data that is random enough that a user will be entertained for at least the first 3 or 4 reloads
  • Low cost to the database

Ways not to do it

If your immediate response to seeing this post is “DUH, use ORDER BY RAND(),” you are fired.

Step 1: random subset of non-random data

If you only need a few items (N), the easiest thing to do is to just grab N*k rows and pick randomly out of that set. In pseudo-PHP, it looks about like this:

    $n = 4; # gimme this many
    $k = 5;
    $limit = $n * $k;
    $q = "SELECT * FROM bicycle LIMIT $limit";
    $rows = mysql_give_me_the_fucking_rows($q); # AWESOME!

    $random_items = array();
    for ($i = 0; $i < $n; ++$i) {
        $key = array_rand($rows);      # pick a random key...
        $random_items[] = $rows[$key]; # ...take that row...
        unset($rows[$key]);            # ...and remove it so we can't pick it twice
    }

The downside here is that your overall result set is essentially always the same. We can do better, though.

Step 2: random ORDER BY criteria

Examine the following simple table from our fictional bike shop:

mysql> show create table bicycle \G
*************************** 1. row ***************************
       Table: bicycle
Create Table: CREATE TABLE `bicycle` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `frame` tinyint(3) unsigned NOT NULL default '0',
  `in_stock` tinyint(3) unsigned NOT NULL default '0',
  `date_available` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`id`),
  KEY `available_and_when` (`in_stock`,`date_available`)
) ... SNIP ...

Assuming you want to show random bicycles that are in stock, the query from above changes to

$q = "SELECT * FROM bicycle WHERE in_stock = 1 LIMIT $limit";

However, notice the available_and_when key: it includes in_stock as the leftmost column, but also date_available. Since we’re already using the leftmost column in our query, we can sort on the second column while still using the index – FO FREE*!

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5;
| id  | frame | in_stock | date_available |
... SNIP ...
5 rows in set (0.00 sec)

mysql> explain select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bicycle
         type: ref
possible_keys: available_and_when
          key: available_and_when
      key_len: 1
          ref: const
         rows: 86900
        Extra: Using where
1 row in set (0.00 sec)
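The DESC candidate should behave the same way – MySQL can scan an index in reverse order, so sorting descending on the second column is just as cheap (worth confirming against your own data, of course):

```sql
-- Same ref access on available_and_when, with the index
-- walked backwards instead of forwards; still no filesort.
EXPLAIN SELECT * FROM bicycle
WHERE in_stock = 1
ORDER BY date_available DESC
LIMIT 5\G
```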

Why does this matter? It means that you can take advantage of this when grabbing your set of non-random data. You can randomly pick which way you’re going to sort the table, then grab the $n*$k rows. Our code snippet becomes

    $n = 4; # gimme this many
    $k = 5;
    $limit = $n * $k;
    $order_candidates = array('date_available', 'date_available DESC');
    $order_by = $order_candidates[array_rand($order_candidates)];
    $q = "SELECT * FROM bicycle WHERE in_stock = 1 ORDER BY $order_by LIMIT $limit";
    $rows = mysql_give_me_the_fucking_rows($q); # AWESOME!

    $random_items = array();
    for ($i = 0; $i < $n; ++$i) {
        $key = array_rand($rows);
        $random_items[] = $rows[$key];
        unset($rows[$key]); # no repeats
    }

Depending on your query and the indexes you have available to you, you could have numerous sorting possibilities. The more different ways you can sort, the more random your data will be. If the randomness of this data is truly important, you can add indexes to accommodate more ways of ordering.
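For instance, to turn frame into another sorting candidate (hypothetical – only worth the write overhead if the extra randomness actually matters to you):

```sql
-- Another composite key with in_stock as the prefix, making
-- ORDER BY frame / ORDER BY frame DESC cheap candidates too.
ALTER TABLE bicycle ADD KEY in_stock_and_frame (in_stock, frame);
```

That takes the ORDER BY candidate list from two entries to four, at the usual cost of maintaining one more index on writes.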

DO NOT BE A COCK! Run every single possibility through EXPLAIN. If you fuck up and order by something stupid, your DBA/Ops will murder you, and your site will be balls slow.

Step 3: Random OFFSET

This could very well have been Step 2, but I like this part less, so it goes later. This part requires a more extensive use of common sense, and thus leaves more room for stupid.

Now that you’ve got some different ways you could order the data, you could also broaden your range by specifying a random offset. There are two caveats:

  1. You don’t always know how many rows you have in total, and finding out could be expensive
  2. You don’t want too high of an offset, because that will be slow

Since #2 means you should never even THINK about #1, I guess there’s really only the one caveat.

What do I mean? Let’s look at some queries against my measly 500k rows table:

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5 OFFSET 0;
... SNIP ...
5 rows in set (0.00 sec)

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5 OFFSET 5000;
... SNIP ...
5 rows in set (0.07 sec)

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5 OFFSET 50000;
... SNIP ...
5 rows in set (1.15 sec)

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5 OFFSET 100000;
... SNIP ...
5 rows in set (1.97 sec)

mysql> select * from bicycle WHERE in_stock = 1 ORDER BY date_available LIMIT 5 OFFSET 250000;
... SNIP ...
5 rows in set (4.89 sec)

The reason for the degrading performance is that MySQL still has to generate every row up to and including the offset, then throw away everything before it. However, we can see that the performance is tolerable up to a point. We can certainly afford to pick from the first few thousand rows (or few hundred, depending on your requirements).

Let’s implement the random offset:

    $n = 4; # gimme this many
    $k = 5;
    $limit = $n * $k;
    $order_candidates = array('date_available', 'date_available DESC');
    $order_rnd = array_rand($order_candidates); # separated for better cacheability
    $order_by = $order_candidates[$order_rnd];
    $offset_rnd = rand(0, 20); # separated for better cacheability
    $offset = $offset_rnd * 100;
    $q = "SELECT * FROM bicycle WHERE in_stock = 1 ORDER BY $order_by LIMIT $limit OFFSET $offset";
    $rows = mysql_give_me_the_fucking_rows($q); # AWESOME!

    $random_items = array();
    for ($i = 0; $i < $n; ++$i) {
        $key = array_rand($rows);
        $random_items[] = $rows[$key];
        unset($rows[$key]); # no repeats
    }

Working around OFFSET performance degradation

If you just can’t sleep well knowing that you are not pulling rows from the entire table, you should really seek professional help – something far beyond whatever I can provide you with. Seriously, who gives a shit? However, I do have something to tide you over. Among other places, I’ve seen this workaround suggested in a MySQL Performance Blog post: if you have a sequential column you can use (i.e., one that is included in an index your query can use), you can use a WHERE col BETWEEN x AND y clause to simulate random offsets. Caveat: the data may not be evenly distributed, so you might hit a range that has no rows in it, or not enough rows for your random sample.

For this scenario, MySQL does do some very handy things. If your table looks like mine, i.e. has an index on the column we’re selecting by AND the column we want to use for our offset, you can get the MIN and MAX bounds FO FREE**:

mysql> select MAX(date_available),MIN(date_available) FROM bicycle WHERE in_stock=1;
| MAX(date_available) | MIN(date_available) |
|          1253385344 |          1249496986 | 
1 row in set (0.00 sec)

mysql> explain select MAX(date_available),MIN(date_available) FROM bicycle WHERE in_stock=1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra: Select tables optimized away <<< fucking sweet

Once you have the minimum and maximum timestamps, you can just pick a random range. As I said above, you could get yourself an empty result set, so be sure you only do this with uniformly distributed data and a wide enough range.
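Putting it together with the MIN/MAX bounds from above, the offset-free version of the query might look like this (the timestamps are just arbitrary values inside the range shown above):

```sql
-- Instead of OFFSET, seek straight into a random slice of the
-- indexed column; no rows are generated and thrown away.
SELECT * FROM bicycle
WHERE in_stock = 1
  AND date_available BETWEEN 1250000000 AND 1250086400
ORDER BY date_available
LIMIT 20;
```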

A final note: don’t forget to fucking cache

Although the entire purpose of this post is to show how to do these things in a cheap way, it’s still important to remember: MySQL time is much more expensive than memcache time. For something like this, you should definitely be caching. The strategy is very simple: generate your query params (limit, order direction, offset), then use those to build your key. Check that you don’t already have this result set in cache; if you don’t, query the database and store it in the cache using that key. The next time you get the same random criteria, you’ll be able to get the rows – FO FREE***. For our last code example, the cache key could look like this:

$key = "BicycleRandom_{$limit}_{$order_rnd}_{$offset_rnd}";

That’ll do it!

* By ‘FO FREE’ I mean free, aside from the cost of maintaining the index
** This time, I really did mean FREE
*** Or almost free, anyway

Drizzle: Time To Get Excited

Tuesday, March 3rd, 2009

Brian Aker was in San Francisco on Monday night to talk to the MySQL and PHP Meetup groups about Drizzle, the fork of MySQL aimed at meeting the needs of large web applications.

I had heard various tidbits here and there about Drizzle, including at MySQL2008, and had a pretty vague impression of how it was actually going to be different. The general theme seemed to be simpler, smaller, and faster.

If everything Brian described is true, it’s those things plus FUCKING AWESOME. I will join my friends (here, for example) in foaming at the mouth about it. Below is a rough list of things Brian mentioned that get me all hot ’n’ bothered about Drizzle, in no particular order:

  • No Query Cache. The best part here is the reasoning for its absence: “If you’re relying on the query cache, you probably should have just used memcached to begin with.”
  • In-Query Sharding Info. I’m not exactly sure of the details of the implementation, but it sounds like Drizzle will make it possible to route queries to the right shard automatically.
  • Pluggable Authentication. Finally. Authentication can also be completely turned off.
  • Serializable Query Plans. This feature will allow the parser to be bypassed entirely on most queries. You simply send the query to the server, get the execution plan back, cache it, and send the execution plan instead the next time you need the same query. That is Fucking Bad Ass (TM).
  • Fewer Locks. Getting rid of a lot of the more advanced and rarely used features introduced in MySQL 5/5.1 (views, stored procedures, etc.), as well as some of the more basic stuff like authentication, has allowed Drizzle to lose 2/3 of the locks present in 5.0 (at least that’s what I think Brian said… sounds surreal…), which obviously opens the door for vast improvements on multicore architectures. It also sounded like Brian decided during his talk to discontinue MyISAM support in Drizzle, to be able to get rid of another huge lock. I’m OK with it…
  • Discontinued support for antique hardware. Lots of code ripped out because it’s no longer needed.
  • Everything is pluggable, most things are optional. The Drizzle kernel is about 115kloc. Amazing.

Brian was very coy about benchmarks, leaving it to independent sources to run them, but it sounds like Drizzle will leave MySQL in the dust for most of the common applications seen in webapps. I can’t wait to try it and to use it on a few things.

where the eff have i been?

Wednesday, April 16th, 2008

As I’ve previously mentioned, I am at MySQL 2008 in Santa Clara. I’ve been busy going to the talks, hanging out, and soaking in the weather.

Coding-wise, I’ve been at a standstill – I had to do a bunch of stuff for work, came up with a mess of ideas that I had to filter down to what I actually wanted to work on, and then realized I didn’t want to start on anything else until I finished ked_duffel, or at least got it usable. That itself became a bottleneck as I struggled to figure out how to handle variables within files, but I’ve come up with what I think is a relatively elegant, cheap, and maintainable solution, so I’ve started development. I’ll post details as I implement the stuff.

The Conference

I have learned quite a bit of stuff. As I work at a place where we don’t have dedicated DBAs, but are getting to the point where at least minimal optimization and scaling are becoming a necessity, it was great to hear some talks about efficient scaling, caching, and proper monitoring/query analysis and optimization. Of particular interest and use were the memcached talk and the various discussions of stored procedures. Though it will only be useful in the future (it’ll be a while before MySQL 5.1 is available to us), partitioning also seems like something that would help us out a great deal.

All in all, I’m getting my (read: Yahoo’s) money’s worth and having a good time. I have generated lots of ideas on how we can drastically ameliorate a lot of our performance issues and make our database much more maintainable.

One thing that kind of sucks is that a lot of the talks about monitoring/alerting ended up being a waste: they covered open-source products that we had already evaluated, and had ended up replacing with our own tools due to scale issues.