Best Practices/Thinking Ahead

I'm about to set up a WPMU-based system that I expect to grow from a few dozen to a couple thousand sites over a period of 12-24 months. In the end I expect to peak at 4-5 thousand sites, maybe 50% actively updated, but that will take longer than 24 months.

Anyway, I'm trying to think ahead and come up with a plan that will be future-proof but not be exceptionally expensive right now. Anyone have thoughts about best practices here? What would you have done "right" if you started over? I'm thinking about using a dedicated server right now as the web and database server, and then add additional database machines as needed—but I've also considered using cloud hosting for easy additional database servers. What do y'all think?

Thanks!

Chris

  • drmike

    Use Elgg. :slight_smile:

    Sorry, couldn't resist, although if Elgg did subdomains instead of the messy URLs that it does, we probably would have made that (or another one whose name escapes me, but it didn't do subdomains either) our suggested platform for our blog farm installs.

    A couple random thoughts off of the top of my head.

    - Make sure all databases and database tables are UTF8 and any new ones created are also UTF8. *cough* I wish more developers would write their database creation code to use the database's default character set instead of just assuming that the server's default was the one used. *cough* *cough*

    - Go ahead and start off with the 256 databases from the start. It's really neat to look at the very least. Some of our installs also use that point as a selling point to talk about expansion down the road. ("Hey, we're all ready to expand when we need to!")

    - Docs, docs, docs. Sure, folks are going to ask the questions anyway but just think of that as a selling point when you point them to your docs. Plus they get linked to a lot which also helps SEO.

    - Ditch any wordpress link that you can find. Save the trouble of folks going there looking for help and not getting any. Point them to the correct places to get help, your forums and your help site.

    - Docs Part 2. If you make any change in the code, include any sort of theme, plugin, module, wrap, widget, whatever, make a note of it. I have on one of my servers a wiki where all of the platforms we support and install each has a page that lists everything we have installed with the version number and a link to where we got it from. We also have the specific downloaded version number of the software, what ticket or trac or version number it matches up to in their tracking system and links to where we need to go for help or contact information for stuff we have a license with. This is for my benefit as well as the folks we contract out our tech support with. [And yes, it's password protected since a fair amount of our code is either internally written or non GPL'ed.]

    The big plus to that is once a week, we just do right clicks off of those pages to make sure we're running the most recently released versions of that software, those plugins, whatever. Just wpmu is easy. 60+ platforms is a pain.

    - Get a buddy. I wrote up an article on Andrea's tutorial site about this. In case of emergency or your kids graduate or you meet some really sexy redhead, you will have to take time off. Find someone ahead of time to watch over things and to do the same for their site. It comes in handy.

    - Play dummy. Sign out of your site, clear your cookies and cache and try out your site. (Or go to your local library and use one of their computers.) Can you figure out how to sign up on your site? Can you find the docs or help section? Where do you go if you get stuck? Is there email support in case you can't sign up? What if you never get any email? What if a comment never goes through? Pretend you're your ninety-year-old grandmother-in-law who still thinks computers and the net are a fad and Susan Lucci was robbed all those years, and give your site a try. Don't assume anything. How does it do?

    I'm sure I can come up with more stuff but I'm sure Andrea's going to start throwing things at me when she sees this. :slight_smile:
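
    To make the 256-database point above a bit more concrete, here is a minimal sketch of how a blog could be mapped to one of 256 databases. It assumes the database is picked from the first two hex characters of the MD5 hash of the blog ID; the `wpmu_` prefix and the function name are made up for illustration, so check your own multi-database plugin's docs for the real scheme and naming.

```python
import hashlib
from collections import Counter


def database_for_blog(blog_id: int, prefix: str = "wpmu_") -> str:
    """Return the (hypothetical) database name that holds a blog's tables."""
    # Two hex characters give 16 * 16 = 256 possible suffixes (00..ff).
    digest = hashlib.md5(str(blog_id).encode("ascii")).hexdigest()
    return prefix + digest[:2]


if __name__ == "__main__":
    # See how evenly 5000 blogs would spread across the databases.
    spread = Counter(database_for_blog(b) for b in range(1, 5001))
    print(len(spread), "databases in use for 5000 blogs")
    print("largest database holds", max(spread.values()), "blogs")
```

    Because the mapping depends only on the blog ID, a blog never moves between databases as the install grows, which is the argument for picking the final database count up front.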

  • Chris M.

    Thanks for the response, Mike. Definitely good suggestions. My question was meant to be on the technical side of things, but it's always good to discuss best practices in general. For those out there that come across this thread, I'll second the importance of A) Good internal documentation, B) Having help lined up when you need it (if you're doing this on your own) and C) Looking at everything you've built from a number of different perspectives.

    That said, I also agree with UTF8-everything. Even if you don't think you'll need it, it's just so much easier than trying to transition later.
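
    If you do end up having to transition later, something along these lines can at least produce the list of statements to review. It is only a sketch: it assumes you feed it (table name, collation) pairs, for example the Name and Collation columns from MySQL's SHOW TABLE STATUS, and the function name and sample table names are hypothetical. Back up first and read the generated SQL before running any of it.

```python
def utf8_conversion_statements(tables):
    """Build ALTER TABLE statements for tables whose collation is not utf8.

    `tables` is an iterable of (table_name, collation) pairs.
    """
    statements = []
    for name, collation in tables:
        # A missing collation or anything not starting with "utf8" needs converting.
        if collation is None or not collation.startswith("utf8"):
            statements.append(
                f"ALTER TABLE `{name}` CONVERT TO CHARACTER SET utf8 "
                "COLLATE utf8_general_ci;"
            )
    return statements


if __name__ == "__main__":
    sample = [
        ("wp_1_posts", "latin1_swedish_ci"),  # hypothetical table names
        ("wp_1_options", "utf8_general_ci"),
    ]
    for statement in utf8_conversion_statements(sample):
        print(statement)
```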

    I would love more thoughts on the number of databases to start with. I can think of a number of pros and cons with both less and more. Can you expand on your original comment, Mike? Anyone else want to agree (or disagree) with starting at 256?

    Thanks so much!

    Chris

  • drmike

    I haven't heard of a case of s3 losing files but they have gone down previously.

    but have a failover for that.

    Actually we've been thinking about that over on my forums. Thinking one cloud system as the main and another completely different system as the failover.

    We really don't have that large of an amount of uploads on our systems. None of our installs allow video files to be uploaded as we allow like every video system known to man to be embedded.

    Oh, and add backups to my list up there. :slight_smile:

  • Robert

    For what it's worth, we have about 25 Amazon EC2 instances running now, and we use a lot of S3 storage. Couldn't be happier with their services, and our hosting/support costs have never been lower. We've been growing to this point for about 18 months now as we get more comfortable with Amazon's reliability.

    We have our own load balancer instances in front of Web servers and scaling up/down is automated. I can't imagine going back to our own actual machines.

    Static IP addresses, persistent storage (database remains safe even if server crashes and has to be replaced) almost instant scaling, and pay by the hour. That's pretty cool stuff.

    We also still have about 40 Dell PowerEdge servers in various data centers, but those are being retired as they come out of service contracts and everything is moving to either Amazon or a company called Slicehost (recently acquired by RackSpace). Slicehost has been great to work with (excellent customer service) but they are much smaller so they don't offer the flexibility of Amazon.

    Amazon keeps adding new features as well. For example, they just announced beta availability for their own auto-scaling solution and their load balancer. More details here: http://bit.ly/aD5iI

    Read their forums carefully, especially if considering their beta services because they take "beta" seriously and things are not always stable.

  • Robert

    Just to elaborate on something I said last night, Amazon has been a great solution for us due to the scale we need, and can support. But if you are looking for really, really simple virtual hosting try Slicehost: http://www.slicehost.com/

    When I first stumbled on Slicehost about 18 months ago I read the information on their Web site and made myself an account. Less than 10 minutes after making an account I had a CentOS server (including Apache/PHP and MySQL) running a copy of my Web site.

    Their "control panel" couldn't be easier. One simple page to schedule daily and/or weekly backups. One click to launch a new server based on a previously-saved backup.

    If you are not a professional Linux system admin, these are the guys to use.

  • Aaron

    Ya Robert. I just built my first ubuntu server on slicehost and am working on migrating to them. New to Linux but they have great tutorials. Also added APC, which seems to have doubled the PHP speed.

    The idea is I can just increase the slice size as I need to scale, and eventually move the db to another slice.

    I'm really impressed with your amazon setup though. I looked at that first but decided it wasn't worth the effort for now. It would be nice to only pay for what you need though!

    Two questions: what software are you using to spin up new servers and do load balancing? Also how are you moving uploads to s3?

  • Robert

    Aaron, our hosting mostly is not Wordpress. We run both Wordpress and WPMU blogs but those are relatively small-scale projects for now. We manage a variety of sites, the largest gets about 100,000 visitors/day.

    For your new Ubuntu slice get a copy of s3sync (http://s3sync.net/) to copy files to/from S3. It's a set of Ruby scripts that use the same syntax as rsync. To access the S3 buckets from a Windows machine I recommend Bucket Explorer (http://www.bucketexplorer.com/).
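
    The rsync-style idea is to copy only what has changed. Here is a rough sketch of that comparison step in Python. To be clear, this is not s3sync's code: `local_manifest` and `plan_sync` are hypothetical names, and it assumes you already have a listing of the bucket as a dict of key to MD5 digest from whatever S3 client you use.

```python
import hashlib
from pathlib import Path


def local_manifest(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            manifest[key] = hashlib.md5(path.read_bytes()).hexdigest()
    return manifest


def plan_sync(local, remote):
    """Return (to_upload, to_delete) so the remote ends up mirroring local.

    Both arguments are dicts of key -> digest.
    """
    # Upload anything missing remotely or whose content differs.
    to_upload = sorted(k for k, digest in local.items() if remote.get(k) != digest)
    # Delete anything that no longer exists locally.
    to_delete = sorted(k for k in remote if k not in local)
    return to_upload, to_delete
```

    Unchanged files show up in neither list, which is what keeps repeat runs cheap.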

    We also have CNAMEs set up for S3 buckets so the image/media storage looks like it's in our domain. By the way, doing that also lets you take advantage of CloudFront which automatically distributes your files around the world so they are served up near the users.
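
    For anyone wiring this up, a small sketch of the URL side. It assumes the bucket is named exactly after the hostname you want to serve from, which is what S3 requires for CNAME access; `media.example.com` and both function names are placeholders, not anything from our setup.

```python
from urllib.parse import quote


def media_url(bucket_host, key):
    """Build the public URL for an object served through a CNAME'd bucket."""
    # quote() keeps "/" as-is and percent-encodes spaces and other unsafe characters.
    return f"http://{bucket_host}/{quote(key.lstrip('/'))}"


def cname_target(bucket_host):
    """DNS name that the CNAME record for bucket_host should point at."""
    return f"{bucket_host}.s3.amazonaws.com"


if __name__ == "__main__":
    print(media_url("media.example.com", "/files/2009/05/my photo.jpg"))
    print(cname_target("media.example.com"))
```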

    Take advantage of the chat system Slicehost provides for tech support. I've found their people to be very helpful, and patient with the newbies.

    Have you tried resizing a slice yet? As a guy who's spent many hours in a cold data center upgrading servers it's way fun to upgrade ram or disk storage just by pressing a button and drinking coffee :slight_smile: Only a minute of downtime during the process and your bigger slice is running.

    Launching new server instances at Amazon is controlled by our own scripts. But that should be easier now that some kind of auto-scaling is available. I haven't used that yet, but it should be a nice feature. For load balancing we mostly use HAProxy (http://haproxy.1wt.eu/) running on a couple of CentOS instances, but again this is another thing Amazon has recently made easier.

  • Aaron

    Thanks Robert! I'll have to check out the s3sync. I do run my main themes files off of s3 with a cname to spread the load already.

    What I've been wanting to do is move users' uploaded files to s3 after all resizing is done on the server. Slicehost doesn't give you very much storage space on their slices, and it would be nice to have an unlimited hard drive in the sky and offload image requests from my server. It would make my server backups very small as well. I just can't figure out how to make it possible to delete images on s3 through wpmu, and how to determine how much upload space they still have.
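
    One way to think about the space question is to keep your own record of what each blog has offloaded instead of asking S3 every time. A minimal sketch of that bookkeeping, with everything in it hypothetical: in a real install the record would live in the database rather than in memory, and the actual S3 upload and delete calls are left out.

```python
class UploadLedger:
    """Track bytes offloaded per blog so quota checks never need to list S3."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes
        self._objects = {}  # (blog_id, key) -> size in bytes

    def used(self, blog_id):
        """Total bytes currently recorded for one blog."""
        return sum(size for (owner, _), size in self._objects.items() if owner == blog_id)

    def remaining(self, blog_id):
        """Bytes the blog can still upload before hitting its quota."""
        return max(self.quota_bytes - self.used(blog_id), 0)

    def record_upload(self, blog_id, key, size):
        """Register an object; raise if it would push the blog over quota."""
        # Re-uploading an existing key replaces its old size rather than adding to it.
        previous = self._objects.get((blog_id, key), 0)
        if self.used(blog_id) - previous + size > self.quota_bytes:
            raise ValueError("upload would exceed quota")
        self._objects[(blog_id, key)] = size

    def record_delete(self, blog_id, key):
        """Forget an object after it has been deleted from the bucket."""
        return self._objects.pop((blog_id, key), None)
```

    The delete side then becomes: remove the object from the bucket, call `record_delete`, and the freed space shows up in `remaining` straight away.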

    Do you backup files stored on s3? And are your bandwidth charges for ec2/s3 pretty high?

  • Robert

    I'm not much of a programmer myself and don't know the Wordpress code anyway, but it probably wouldn't be that hard to find the lines that upload or access user files, media, or whatever. If you can find that, look at these PHP classes for S3: http://bit.ly/elUhH

    Just plug in your secret access keys and bucket names, and away you go. It might be a pretty simple matter to use those classes to make S3 access easy from the Wordpress PHP code.

    Going back to the Best Practices theme of this thread, your backup question is an interesting one and something that we discuss regularly. It's very tempting to throw everything up into S3 - it's really cheap, automatically backed up (by Amazon) and Amazon claims they have never lost even one bit of data from S3. I'm a big Amazon supporter and believe what they say, BUT I've been around a long time, and that has made me a bit (some might say, a lot) paranoid. So everything is in two or three places.

    Another thing to consider is that backup storage is amazingly cheap these days. My local Staples is selling terabyte-size drives for $149. (They also have 1.5 TB drives for $179, but the customer experience hasn't been good with those.) And I have Comcast cable at home that gives me 6+ Mb/s download speeds. I plugged a few of those drives into an old desktop computer that was reformatted with Ubuntu. Now it lives in the basement making backups overnight. Peace of mind. Priceless.
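
    If everything lives in two or three places, it is worth checking now and then that the copies really are the same. A small sketch of that check using only the Python standard library; the function names are made up for this post and the paths in the usage line are placeholders.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backups do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(paths):
    """True if there is at least one path and every file has the same SHA-256 digest."""
    digests = {sha256_of(p) for p in paths}
    return len(digests) == 1
```

    Usage would be something like `copies_match(["/backups/db.sql.gz", "/mnt/usb/db.sql.gz"])` run from the overnight job.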

    UPDATE:
    Almost forgot one more thing. There's an article on Amazon's AWS site called "Building a Small Business Backup System Using Amazon S3." The example is PHP code using the class I mentioned above. Take a look at this code for some good ideas: http://bit.ly/owqrC

  • Barry

    I agree with the majority of this. Particularly Dr Mike's "meet some really sexy redhead, you will have to take time off." When that happens (again) you won't see me for dust, so my support buddy would have to be a very patient person.

    I've got a number of slices over at slicehost and haven't had a problem with them at all. I was with them before rackspace took over, when getting an instance up and running took a bit longer (they had/have a policy of not overloading physical machines with too many instances), but now that they have near-instant provisioning they are even better.

    I'm extremely paranoid when it comes to "guarantees" and up time so I generally keep backups of backups of backups. So take the following with that in mind :slight_smile:

    1. Each slicehost host has regular backups with slicehost itself. The advantage being that you can bring up a new slice from a backup if your main "running" instance has a problem.

    2. Regular backups of the dbs, user uploaded data to amazon s3 (using s3sync mentioned above), regular backups from s3 back down to local drives.

    3. An Amazon EC2 instance that matches (roughly) the slicehost slice in software and setup, so if worst comes to worst and slicehost has a major problem, I can bring up an EC2 instance with relatively recent data and switch over my DNS until slicehost recovers. I've never yet needed to do this though.
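
    A setup like the three tiers above only helps if every tier is actually fresh. Here is a sketch of a check for that, assuming you can get the time of the newest backup in each place; the location names, the one-day default, and the function name are all just examples.

```python
from datetime import datetime, timedelta


def stale_locations(latest_backups, now, max_age=timedelta(days=1)):
    """Return the names of backup locations whose newest copy is too old.

    `latest_backups` maps a location name to the datetime of its most
    recent backup, or None if no backup has been seen there.
    """
    return sorted(
        name
        for name, taken_at in latest_backups.items()
        if taken_at is None or now - taken_at > max_age
    )


if __name__ == "__main__":
    seen = {
        "slicehost": datetime(2009, 6, 1, 3, 0),
        "s3": datetime(2009, 5, 30, 3, 0),
        "local": None,
    }
    print(stale_locations(seen, now=datetime(2009, 6, 1, 12, 0)))
```

    Anything this returns is a tier you could not rely on if you had to switch over today.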

  • drmike

    We're discussing cloud computing elsewhere. Just wanted to pass along these links:

    http://www.cloudsecurityalliance.org/topthreats.html

    http://broadcast.oreilly.com/2008/11/key-security-issues-for-the-am.html

    http://broadcast.oreilly.com/2008/11/20-rules-for-amazon-cloud-security.html

    I'm not a big fan of the first site as I feel there are some disclosure issues but it's still a good read.
