multi-db: mysql instances vs. different databases in a single instance

Ok, I'm a little confused about the following concept.

It looks to me like the multi-db class is smart enough to work when the blog databases are on totally different machines... (the whole add_db_server() call in the db-config samples clearly shows hostnames/ip/port addresses for these db servers)...

But, the examples, all have the db's HOST and LAN HOST set the same, same IP, same port.

I think this means that these examples are actually hitting the exact same MySQLd instance... Is that correct?

So, here's my point of confusion...

Is it recommended that you run multi-db with a SINGLE mysql instance?
Does splitting the DBs really help Mysql that much?

I am mostly interested in moving to multi-db to improve performance, since with 6,000 blogs, mysql is dumping tables left and right as people hit across my site. I am assuming to get table caching improvements I really need to get these databases into separate MySQLd instances.

Comments?

  • Andrew
      But, the examples, all have the db's HOST and LAN HOST set the same, same IP, same port.

    That was to make it simple for people. The examples are dumbed down so to speak.

      Is it recommended that you run multi-db with a SINGLE mysql instance?

    That's entirely up to you. I've personally never had the need to run multiple instances on the same machine.

      Does splitting the DBs really help Mysql that much?

    It really depends on the size of your database but yes it can help a good bit.

    Thanks,
    Andrew

  • ZappoMan

    I guess this is where I get beyond my Mysql tuning expertise...

    I think the reason my servers aren't performing as well as I'd like them to... is because even setting the table cache to a huge number, I'm getting nearly 0% table cache hits... This doesn't seem completely crazy to me, since people cruising blogs will load tables, people signing up will load tables, some plugins will scan blogs, etc...

    But it would seem to me that if all these databases are going through the same MysqlD instance, then we'd still have a table cache hit rate problem... I assume the table cache, query cache, etc, are MysqlD instance specific.

    >>> Am I misunderstanding something really simple here? <<<

    Is there some other weird design constraint of mysqld where moving tables out of a database but on the same daemon instance all of a sudden makes the table cache perform better?

    Please excuse my ignorance... I think I might be just smart enough to be dangerous.

  • Andrew

    The mysql table cache is never going to be super efficient when dealing with blog tables unless you have a massive amount of resources allocated to the cache. Even at that it will only be mildly helpful.

    Where you're going to see the table cache kicking in is with your global tables. This is because there are only a handful of tables to cache as opposed to thousands.

    Any form of caching is the most helpful when you have it caching only the most requested data. Otherwise your just sticking a bunch of data in the cache they won't be accessed enough to make it worthwhile.

    I've never had a reason to run multiple instances on a single machine but I would assume you could allocate more resources to one instance. If that's the case then you might want to check into sticking the global tables in a db on it's own instance and bumping up the cache on that.

    Is mysql really performing that badly with just 6,000 blogs though? I've seen db servers with over a million tables on a single instance chugging along just fine.

    Thanks,
    Andrew

  • ZappoMan

    Is mysql really performing that badly with just 6,000 blogs?

    Yes, it is... at least in my opinion...

    I've got about 17 tables per blog, that makes 51 files per blog, with a handful of global tables we're up to about 300,000 files in the data directory.... The OS seems to be handling it, but maybe this is the issue, maybe the issue is just that mysql is being slowed down by the performance of the file system handling 300k files in a single directory.

    So I guess, I could imagine that the real issue isn't the table cache, as much as it is some other issue...

    When I run the sqltuning scripts, they mostly complain about table cache. That's why I assumed that that was the issue.

  • drmike

    I've personally never had the need to run multiple instances on the same machine.

    Matt actually recently mentioned in passing somewhere that this was overkill to begin with when they started wp.com. He stated that they put too much money into the hardware side.

    He didn't give specifics but the multiple database sites and all that was something they were promoting at the start so I'm pretty sure that this was what he was talking about.

  • ZappoMan

    Yep, I heard that as well. The funny thing is that earlier this year they ran into a problem where they didn't have *enough* hardware :smiley:

    It's a never ending battle

    Ok, so... what do you all recommend... at what point (number of blogs, number of tables, your favorite metric) do you:

    1) Move from a standard install of wpmu to a multi-db install?
    2) Move a database to a different MYSQLD instance on the same machine?
    3) Move a database to a different machine?

  • Andrew

    If you're going to have 100,000 blogs within six months I'd go ahead and split to 4096 databases. 256 can handle that load but apparently your growing fast.

      1) Move from a standard install of wpmu to a multi-db install?

    I start every install that I think will have more than 1K blogs with 16 databases. It's just easier to manage.

      2) Move a database to a different MYSQLD instance on the same machine?

    Only you can answer that. It depends on your hardware and how it's handling the load.

      3) Move a database to a different machine?

    Same as two.

    Basically you need to keep an eye on performance and if you're not making it through the peak hours ok then it's time to start expanding or at the very least seeing what you can do to get the load down (remove plugins, etc).

    Thanks,
    Andrew

  • ZappoMan

    We are currently running on a "large instance" AWS EC2 instance which is:

    7.5 GB of memory
    4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
    850 GB of instance storage
    64-bit platform

    I certainly don't expect that instance to work through 100k blogs... but it should be performing much better at 6K.

    The thing is that what I see in top is that mysqld will often spike to 100+% cpu. With about 10% memory usage.

    I am hoping that this is something weird related to the issue of having 300k files in a single directory. Maybe it's doing an opendir() readdir() or something like that, that is spiking the CPU....

    I guess I could play around with writing some test code to see if that is indeed the issue.

  • Luke

    Might try looking over at "them other forums" for a thread called "calling all large hosts", or something close to that. Also, a quick search of that site for my.cnf should bring up that topic, and another one that dives into a conversation with myself, Quentin, and James about working though and optimizing a mysql configuration. It won't be perfect, but should get you in the ballpark.

    I manage a site of comparable size to yours that's on a dedicated box. They also have 2 other instances of MU running on that same box with about 300 blogs each. The other two are more a hosting of wp based sites than a blogging community, but the big one on there is definitely a community.

    The server specs aren't exactly impressive, either.

    Dual Processor: AMD 2600 x 2
    2GB RAM
    2 x 80G SATA
    RHEL 5

    Nothing fancy about it, far from it these days. Anyway, it's humming along with mysql eating about 20% of the memory, and little to no cpu. It's also running Apache 2.2.8, which takes up around 10-15% cpu and about 10% memory. Note that these are average, as it can be a little higher or lower.

    Load average on this box barely cracks .40, and has decent traffic. Nothing earth shattering, but about 2M page views a month. Oh, it's also (against my better recommendations) running cPanel and WHM as well.

    With specs like what you've described, 6k blogs shouldn't be denting it.

  • ZappoMan

    With specs like what you've described, 6k blogs shouldn't be denting it.

    Yeah... that's what I thought... but then again, I'm not YET running multi-db... do you think it could be as simple as this single db=lots of files in a directory (300k in my case)=some strange OS interaction=the performance problem I see.

    By the way, I call it a performance problem, but it's not like users are complaining... I think they're so used to the internet being a big fat blob these days that they don't even complain... but I know something is wrong with mysqld spiking to 100%...

  • Luke

    Yeah, something is definitely wrong with mysql. Either through configuration, or something else.

    The site I referenced that is comparable, get this, it was running that way on dreamhost shared hosting up until mid march. All with a single database. Oh yeah, they were complaining about speed left and right.

    Once I (finally, after 6 months of hounding) convinced them to get on a dedicated server, it was like night and day. Breaking up the database also helped, don't get me wrong, but it wasn't such a severe change that it made it .001 sec page loads. Even in a single database, mysqld wasn't acting even close to what you described. It might have improved it by 20%, maybe a little more.

    Diagnosing mysql when it's acting like that is a pain in the ass. If you haven't specified a slow query log in your my.cnf file, you should.

    What I find interesting is that is isn't using memory, but using an assload of cpu. That shouldn't be the case, as it should be hitting memory and using it. That would lead me to think that your conf file isn't set up too good yet, or there are some issues with your installation of mysql.

    Oh, and it isn't the Lounge we're discussing. That's on a quad box (2 xeons, dual core) with 4G of RAM. Stats on it are a little bit more overall than that. :slight_smile:

  • ZappoMan

    I just found the following link... it seems to suggest that over about 10,000 tables (I have 102,000 tables!) MyISAM starts to go crazy for reads and writes...

    http://www.mysqlperformanceblog.com/2008/03/21/mysql-file-system-fragmentation-benchmarks/

    It sounds as if this issue may very much be related to the ext3 file system and/or disk level fragmentation... (at least that's what the author of this article found to be the case in his tests).

    So... this gives me some optimism that the issue may in fact be related to the number of tables in the database afterall.

    This may also give some good guidance on how many tables (or blogs) you can safely have in a database.

  • ZappoMan

    I'm convinced I'm making the move ASAP. That's currently the best choice for obvious "duh" improvement.

    The fact of the matter is that having 300k files in a directory is asking an aweful lot of any OS/fsys... and since there are many people saying this can be a problem, and logic tells you that you'd be lucky if it wasn't a problem. Occam's razor tells us this is probably the problem.

    I love the idea of implementing a better page caching system... but I see these performance spikes even when the system is not under any load... But, I haven't yet looked at xcache, which does look promising on the surface... again, I think this will be more likely the help if the system is under load, which it is not.

    Thanks for the tips.

  • Luke

    Strange Trent, I hadn't noticed that.

    Maybe they just don't know any better, so haven't complained. :smiley:

    Xcache did certainly help though, when the site is under heavy traffic. I've got one site/blog that gets a type of digg several times a week. Between better hardware (not the site I've been referencing, a different site/box all together), xcache, and super-cache, the box barely breaks a sweat.

  • ZappoMan

    I'm just kidding... actually... I do keep checking in to see if there's something else... but I really am intrigued by the xcache thing.

    As for Southern Colonies, I haven't learned much from this thread.

    But I will tell you a joke I once heard from the CEO of Starwave/Disney Online...

    What's another name for NASCAR?
    Southerners turning left.

    I figured you of all people would like that one. :slight_smile:

    Anyway, I really do appreciate the help you've given me.

  • drmike

    Just to throw this out here, someone kindly suggest my method may be best and should be looked to first on the caching.

    Cases in point:

    - I'm assuming from a previous discussion with Andrew that on installs that are within shared accounts, the file based object cache would probably be best. That's what we're currently doing and are currently planning on continuing to do. We are not using the supercache plugin and have no intention on doing so. (Not being able to do the permalinks is what holds us back on that. Too many non-latin languages out there.)

    - I've got our big install sitting on it's two servers and I'm wonder if that would be better served with a better approach though. From chatting with Andrew, an actual cache install like xserve would be better.

    - What about on a VPS? (We've been weighing moving the mu installs over to VPS if we were to get some more boxes. We now have 2 additional installs at 10k blogs although neither one gets that much real traffic.)

    - Donncha mentioned way back when that wp.com was using a custom memcache setup but was using that more for a ramdisk for css and custom css files.

    As to the Nascar bit, I've always wondered why they don't throw in an occasional race run in the other direction. Just to shake things up. (Maybe the All Star race they just had?)

    Thanks,
    -drmike

Thank NAME, for their help.

Let NAME know exactly why they deserved these points.

Gift a custom amount of points.