Blog

Wordpress load test part 2

NOTE: This post was updated after it was first published. Please click here for explanation.

This is the second part in a series of posts about Wordpress and performance. In part 1, we took a look at Wordpress in general. In this part, we'll continue the investigation and look at a few specific plugins that can help improve performance.

First things first: in part 1, we used an 8 GB quad-core server for the tests. From now on, we're using a KVM virtual server instead. The main reason is that we can change the machine configuration when something interesting is discovered. For instance, if we discover a performance problem and suspect RAM to be the bottleneck, we can add memory to the machine and rerun the test. The obvious downside is that the baseline established in part 1 isn't valid anymore. So the first task is to examine how this virtual machine handles the load described in part 1.

The base configuration for the virtual server is 2 CPU cores running at 2.1 GHz with 1024 MB of RAM. The OS is Ubuntu JeOS upgraded to 9.04. Apache2 is at version ___, PHP5 is up to version  . The MySQL server is located on the same machine and is running 5.xxx. Wordpress is upgraded to version 2.9.1.

The baselines: a simple PHP script that sends 10 bytes of data back to the user has an average load time of 85 ms when running 80 concurrent users. That's pretty much the same number as on the 8 GB quad-core machine, where we saw 80.9 ms.

The next thing we looked at in the first part was the average load time for a basic empty Wordpress install. On the quad-core box, we saw an average load time of 357 ms for 80 users. The virtual machine did not do as well. A ramp-up test going from 50 to 80 concurrent users shows load times of 691 ms for 50 users and more or less infinite at 60 users. At that load level, the kswapd process was eating a good 66% of all available CPU, meaning that the server spent most of its time swapping pages back and forth between RAM and disk. Even though nothing actually crashed, we aborted the test and concluded that the current config can't handle more than 50 concurrent users.

For the final baseline test we added 10 posts to the Wordpress install and made a new measurement. On our virtual machine, 50 users gave us a load time of 1220 ms; the same load on the quad-core machine gave us 470 ms response times. Clearly, taking away 2 processor cores and slashing the RAM to 1/8th hurts average load times badly, which is not surprising at all. Anyway, we now know that our current test environment is unlikely to handle more than 50 concurrent users, and we also know what happens if we add RAM and/or CPU cores.

 

Tweaking Wordpress performance

There are numerous ways to increase Wordpress performance, and we'll have a look at how the numbers are affected in this particular installation. Now, Wordpress wouldn't be Wordpress if the most interesting performance tweaks weren't already packaged as easy-to-use plugins. So instead of digging deep into the Wordpress core, we ended up evaluating a set of interesting plugins. Here they are:

wp-cache plugin

The wp-cache plugin has become a very popular way to add a cache to Wordpress. Wordpress used to have a built-in object cache, but that was removed in Wordpress 2.5. So today, the wp-cache plugin is one of the most obvious plugins that come to mind when you want to tweak Wordpress performance (and yes, we'll look at wp-super-cache as well). The test result with wp-cache is very good. As we've seen above, this server needs 85 ms to serve the simplest possible PHP script, and the wp-cache plugin gets us fairly close to that ideal number.

Average load time 50 users: 210 ms

Baseline difference: -1010 ms

Baseline difference %: -82.9%

 

batcache plugin

Batcache was written to help WordPress.com cope with the massive and prolonged traffic spike on Gizmodo's live blog during Apple events. Live blogs were famous for failing under the load of traffic. Gizmodo's live blog stays up because of Batcache. The developers of Batcache themselves refer to WP Super Cache as a better alternative, but in some cases, with multiple servers and where memcached is available, Batcache may be a better solution. The performance gains with Batcache are not on par with what wp-cache or WP Super Cache delivers, but it's still a lot better than a standard Wordpress install.

Average load time 50 users: 537 ms

Baseline difference: -683 ms

Baseline difference %: -56.0%

 

WP Super Cache plugin

The WP Super Cache plugin takes things a few steps further compared to the standard wp-cache. Most notably, by using a set of Apache2 mod_rewrite rules, WP Super Cache is able to serve most of your Wordpress content without ever invoking the PHP engine. Instead, the content is served at the same speed as static content such as graphics or JavaScript files. Installing this plugin is a little more complicated, and it requires both the mod_headers and mod_expires Apache2 modules to be enabled. But once installed, it really works, just look at the numbers! If the WP Super Cache plugin works on your server, it's probably the easiest and most powerful way to boost your Wordpress performance numbers. And if it doesn't work as intended on your server, the good thing is that it falls back to the functionality provided by the standard wp-cache plugin.

Average load time 50 users: 112 ms

Baseline difference: -1108 ms

Baseline difference %: -90.8%
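The idea of answering from a static file before the application ever runs can be sketched in a few lines of Python. This is only an illustration of the principle, not WP Super Cache's actual code or rewrite rules: the cache directory layout, the function names and the `render` callback are all hypothetical.

```python
import tempfile
from pathlib import Path


def cache_path(cache_root: Path, host: str, url_path: str) -> Path:
    """Map a request to the file a page cache would have written for it."""
    # Hypothetical layout: <cache_root>/<host>/<path>/index.html
    return cache_root / host / url_path.strip("/") / "index.html"


def serve(cache_root: Path, host: str, url_path: str, render):
    """Return (body, source): the cached file if present, else a fresh render."""
    cached = cache_path(cache_root, host, url_path)
    if cached.is_file():
        # Fast path: a plain file read, the application code never runs.
        return cached.read_text(encoding="utf-8"), "static"
    # Slow path: run the application once and store the result for next time.
    body = render(url_path)
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_text(body, encoding="utf-8")
    return body, "dynamic"


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    page = lambda path: f"<html><body>Rendered {path}</body></html>"
    print(serve(root, "blog.example", "/hello-world/", page)[1])  # first hit
    print(serve(root, "blog.example", "/hello-world/", page)[1])  # second hit
```

In the real plugin the fast path is taken by the web server itself, which is why PHP is not involved at all for cached pages.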

 

 

W3 Total Cache plugin

The W3 Total Cache plugin is a powerful plugin that takes the best from wp-cache and batcache and adds a few additional features to improve performance. W3 Total Cache allows the user to choose between disk-based and memory-based caching (using memcached). It also supports minifying HTML, JS and CSS files as well as the various types of HTTP compression (deflate, gzip etc.). Finally, W3 Total Cache supports placing content on a content delivery network (CDN), which can speed up loading of static content even further. W3 Total Cache has a lot of configuration options, and we did not take the time to fully investigate them all. We did test the performance difference between disk-based caching and memory-based caching, and the difference is actually noticeable. We enabled minifying and compression, but we've pretty much used everything else 'out of the box'.

Using disk cache:

Average load time 50 users: 256 ms

Baseline difference: -964 ms

Baseline difference %: -79.0%

 

Using memory cache:

Average load time 50 users: 367 ms

Baseline difference: -853 ms

Baseline difference %: -70.0%
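To give a feel for what minifying and HTTP compression do to a page, here is a small Python sketch. It is not how W3 Total Cache implements either feature: the `naive_minify` function is a deliberately simplistic stand-in (a real minifier has to respect things like preformatted text and inline scripts), and the sample page is made up.

```python
import gzip
import re


def naive_minify(html: str) -> str:
    """Collapse whitespace between tags and runs of whitespace. Illustrative only."""
    html = re.sub(r">\s+<", "><", html)
    return re.sub(r"\s{2,}", " ", html).strip()


# A made-up page with a lot of indentation and repeated markup.
page = (
    "<html>\n  <body>\n"
    + "    <p>Hello   Wordpress</p>\n" * 200
    + "  </body>\n</html>\n"
)

minified = naive_minify(page)
compressed = gzip.compress(minified.encode("utf-8"))

print("original bytes:  ", len(page.encode("utf-8")))
print("minified bytes:  ", len(minified.encode("utf-8")))
print("gzip'd bytes:    ", len(compressed))
```

Repetitive markup like this compresses far better than a typical page would, so the ratio printed here says nothing about what to expect on a real blog.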


Summary

Results

NOTE: This table was updated after it was first published. Please click here for explanation.

| Plugin | Avg. load time (ms) | Difference (ms) | Difference % |
|---|---|---|---|
| Standard Wordpress | 1220 | 0 | 0 % |
| wp-cache | 210 | -1010 | -83 % |
| batcache | 537 | -683 | -56 % |
| WP Super Cache | 112 | -1108 | -91 % |
| W3 Total Cache (disk basic) | 256 | -964 | -79 % |
| W3 Total Cache (disk enhanced) | 109 | -1111 | -91 % |
| W3 Total Cache (memcache) | 367 | -853 | -70 % |
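For anyone who wants to redo the arithmetic, the difference columns can be recomputed from the average load times with a few lines of Python. The numbers are the rounded values from the table, so a recomputed percentage may differ from a published one by a tenth of a percent.

```python
BASELINE_MS = 1220  # standard Wordpress with 10 posts, 50 concurrent users

# Average load times in ms at 50 users, copied from the results table.
results_ms = {
    "wp-cache": 210,
    "batcache": 537,
    "WP Super Cache": 112,
    "W3 Total Cache (disk basic)": 256,
    "W3 Total Cache (disk enhanced)": 109,
    "W3 Total Cache (memcache)": 367,
}


def difference(avg_ms, baseline_ms=BASELINE_MS):
    """Return (absolute difference in ms, relative difference in percent)."""
    diff = avg_ms - baseline_ms
    return diff, 100.0 * diff / baseline_ms


for name, avg in results_ms.items():
    diff, pct = difference(avg)
    print(f"{name:<32}{avg:>6} ms{diff:>+7} ms{pct:>+7.1f} %")
```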

 

Conclusions

The various performance-related plugins for Wordpress all revolve around caching. The most impressive results were achieved using WP Super Cache and W3 Total Cache. Among the other plugins, the choice is between disk-based caching and memcached-based caching. Our tests actually show that disk is faster, but that's something that needs to be explored further. The tests were done on a blog with very little data in it, and Linux does a fair amount of disk caching that is probably more effective with these particular amounts of data. Whenever WP Super Cache is not possible to use (or simply feels too exotic for you), we suspect that a perfectly tuned W3 Total Cache is the best choice. W3 Total Cache shows the most potential for tuning, and we like the overall 'look-and-feel' of it. UPDATE: Actually, after retesting W3 Total Cache, we think it may be an even better alternative than WP Super Cache. The one negative thing we've picked up so far is a potential compatibility issue with Wordpress Multi User (WPMU), but we have not been able to confirm that.

 

Feedback

We want to know what you think. Are there any other specific plugins that you want to see tested? Should we focus on tests with more users, more posts in the blog, more comments? Please comment on this post and tell us what you think.

 

 

 

Load testing Wordpress

This is the first part in a series of posts about Wordpress and performance. In part 1, we'll look at Wordpress in general. In later instalments, we'll look at how performance is affected by popular plugins and tweaks. (click here for part 2)

Wordpress is probably the most popular blogging platform out there today, powering a countless number of blogs and other web sites. Wordpress was first released back in 200x and quickly became a popular tool for bloggers. Part of its success is due to the fact that it's remarkably easy to install and configure. Another big hit with Wordpress is the community of users that contribute to the development by creating plugins. There are plugins for just about anything: displaying Google ads, search engine optimization, statistics and integration with social media, just to name a few.

There are also downsides to Wordpress, but the one that interests us the most is performance. Wordpress was once known to have lacklustre performance properties. It especially had big problems handling a lot of concurrent users. Imagine the disappointment of a young and aspiring blogger who writes endless amounts of excellent blog posts without being able to reach out to a bigger crowd. When he finally catches a break and gets that link from news.com, the Wordpress-powered blog dies under the pressure, and before the blog is back up again, that link from news.com is yesterday's news.

But Wordpress has gotten better. The default installation is faster out of the box, and there are numerous tips, tricks and guides on how to optimize Wordpress performance beyond what should be possible. And of course, there are also plugins that help Wordpress perform better. Our mission is to find out what the state of Wordpress performance is today. Let's start.

The tools

The tools we brought to the lab to do this series of tests are fairly simple. We have an out-of-the-box Wordpress 2.8.6 blog installed on a single server. The server runs Ubuntu Linux 9.04 on an Intel quad-core 2.1 GHz machine with 8 GB of RAM. The web server is the standard Apache2 that comes with Ubuntu, and the database server is MySQL 5.1 located on the same machine. PHP is up to version 5.2.10. And the most important piece we brought was naturally a LoadImpact.com account to run the tests.

 

Establish a baseline for the server

There are a lot of moving parts in a setup like this. We first want to establish a baseline that tells us the maximum possible throughput on a PHP page in this specific setup. To do that, we created a simple PHP script that echoes exactly 10 bytes of data back to the browser. By load testing this simple script, we get an understanding of how much of the page load time is spent just on sending requests back and forth over the Internet, how well Apache2 can fire up the PHP processes and how much time PHP needs to initialize itself.

The baseline test and all the other tests we will be running are ramp-ups from 50 to 80 concurrent users. This is what the graph from the test looks like:

Baseline test: the performance of the server itself

As you can see, the response times actually get better with more concurrent users (that's caching), and overall they stay at or under 100 ms. So before putting Wordpress into the picture, we see response times at just under 100 ms. That's the best possible response time we could ever achieve with PHP at this load level on this particular server located at this particular IP.
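For readers who want to try the ramp-up idea on a small scale, here is a rough Python sketch. It is not how Load Impact generates load: it runs from a single machine using threads, and the step size of 10 users, the function names and the `fetch` callback are all assumptions made for the example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(fetch):
    """Run fetch() once and return its duration in milliseconds."""
    start = time.perf_counter()
    fetch()
    return (time.perf_counter() - start) * 1000.0


def ramp_up(fetch, levels=(50, 60, 70, 80), requests_per_user=1):
    """Return {concurrent users: average load time in ms} for each load level."""
    averages = {}
    for users in levels:
        # One worker thread per simulated user at this load level.
        with ThreadPoolExecutor(max_workers=users) as pool:
            futures = [
                pool.submit(timed_call, fetch)
                for _ in range(users * requests_per_user)
            ]
            timings = [future.result() for future in futures]
        averages[users] = statistics.mean(timings)
    return averages


if __name__ == "__main__":
    # Stand-in for a real request; replace with an HTTP fetch of your own page.
    fake_request = lambda: time.sleep(0.01)
    for users, avg_ms in ramp_up(fake_request).items():
        print(f"{users} users: {avg_ms:.1f} ms average")
```

To point it at a real server, `fetch` could be something like `lambda: urllib.request.urlopen(url).read()` with a URL you are allowed to test.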

 

Establish a baseline for Wordpress

OK, the next step is to see what we get when we install Wordpress. A standard Wordpress install will first and foremost run a whole lot more code than the previous script. It also connects to the database and looks for blog posts, comments and a lot of metadata, such as which categories are in use. So naturally we expect to see longer response times. We placed the same amount of load on the front page of the Wordpress installation as we did on empty.php. Here's what an empty Wordpress blog looks like:

Performance when Wordpress is installed

 

The response times now begin at just over 300 ms at 50 concurrent users, and at 80 the response times are just over 350 ms. But that's not very interesting; we need to add a few posts so that the scripts and database get some actual work done. Here's what the graph looks like when 10 posts are added to the blog:

Wordpress performance with 10 posts added

That's a bit more interesting. The response times now start at 425 ms and dip down to 364 ms at 60 concurrent users (MySQL caching is our best guess here). At 70 and 80 concurrent users, the response times start rising quite sharply, to 439 ms and 601 ms respectively. That actually starts to look like a "knee", the point where performance starts to really degrade and the server risks grinding to a halt. Let's see what happens if we add even more load:

Wordpress load test with more clients

Yes indeed. With more clients, the response times increase even more, as expected.

In absolute numbers, we're still talking about very good response times here, even if this test is using a server with more CPU and RAM than the typical Wordpress installation has exclusive access to. And we are also looking at fairly high load levels. Getting 150 concurrent users on your blog is not going to happen to a lot of people, and maintaining a response time of well under 2 s is not bad at all.

The second thing to notice is that what we first suspected was a "knee" in the response time chart between 60 and 70 users does not look like a knee at all any more. The response times increase more or less proportionally to the load, which is quite good. A really, really high-performing web site out there would display a more or less flat line for these load levels, but our setup is nowhere near that k
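One rough way to look for a knee in numbers like these is to compare how many milliseconds each additional user costs from one load level to the next. The Python sketch below does that for the four measurements quoted above for the blog with 10 posts; the helper name is made up, and four points are of course too few to draw firm conclusions from.

```python
# (concurrent users, average response time in ms), as quoted in the text.
measurements = [(50, 425), (60, 364), (70, 439), (80, 601)]


def slopes(points):
    """Return [((users_from, users_to), ms per added user)] for each step."""
    return [
        ((u1, u2), (t2 - t1) / (u2 - u1))
        for (u1, t1), (u2, t2) in zip(points, points[1:])
    ]


for (u1, u2), ms_per_user in slopes(measurements):
    print(f"{u1} -> {u2} users: {ms_per_user:+.1f} ms per added user")
```

A slope that keeps growing from step to step hints at a knee, while a roughly constant slope means response time grows proportionally to the load.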

 

Conclusion

We've established a baseline for Wordpress performance. We're going to keep testing this setup with various types of tweaks and write about it. The next part of this series will look at different plugins and how they affect performance. We've already tested a few of the most popular ones, and some of them do affect performance quite a bit.

Feedback

We want to know what you think. Are there any other specific plugins that you want to see tested? Should we focus on tests with more users, more posts in the blog, more comments? Please comment on this post and tell us.

 


 

Scheduled tests

Keep your server awake while you are sleeping!

Today, we launched a new Load Impact premium feature - scheduled tests. You can now configure a test that is run at some specific time in the future, or you can configure it to run once every day, or once every week.

This functionality is useful to people who want to run a load test during low-traffic hours (often in the middle of the night, or early mornings) but who don't want to sit up and press the "start" button at the desired time.

It is also good if you want to run the same load test repeatedly, maybe once a week, to get a history of the performance of your site.

Scheduled tests are accessed through a new option in the main menu.

 

The test scheduling screen shows your currently scheduled tests and lets you schedule new ones. Note that if you schedule repeating tests, your account will accumulate test results over time. As an account can only have a certain number of test results stored, you may choose to have new scheduled tests delete old test results, if necessary, to make room on your account.

Test scheduling screen
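The "delete old results to make room" option behaves like a bounded store that drops its oldest entry when it is full. Here is a minimal Python sketch of that behaviour; the class name, its parameters and the limit of three results are invented for the example and say nothing about how the feature is actually implemented.

```python
from collections import deque


class ResultStore:
    """Keeps at most max_results test results for an account."""

    def __init__(self, max_results, delete_oldest=True):
        self.max_results = max_results
        self.delete_oldest = delete_oldest
        self.results = deque()

    def add(self, result):
        """Store a result; return False if the store is full and may not evict."""
        if len(self.results) >= self.max_results:
            if not self.delete_oldest:
                return False
            self.results.popleft()  # make room by dropping the oldest result
        self.results.append(result)
        return True


store = ResultStore(max_results=3)
for week in range(1, 6):
    store.add(f"weekly test, week {week}")
print(list(store.results))  # only the three most recent results remain
```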

 

Infinite slashdotting

How many times can you get slashdotted?

Slashdotting (or "the slashdot effect") is a term named after the site slashdot.org. It means that some big website, blog or news site writes an article about you, causing tons of their visitors to take a sudden interest in your website and your site to get overloaded with all the new visitors.

Slashdot has been around for a long time, but now there are many other sites that are big enough to cause a slashdot effect when they publish an article about something. One such site is the Russian web developer site habrahabr.ru where they call the effect the "habr effect".

We have had extensive experience with the "habr effect". We were first mentioned on habrahabr.ru in February 2009, which caused some traffic to come our way. However, it seems habrahabr.ru has increased its number of readers substantially since then. According to Alexa, they seem to have around 10x as much traffic today as they did in early 2009.

So, on Wednesday they published an article about the importance of load testing as a way to avoid the habr effect. They suggested people use Load Impact for their load testing (which we think is a splendid idea, of course). This resulted in a lot of people coming to our site to try out our load testing service.

This wouldn't have been a problem under normal circumstances, but an unknown bug in our frontend code had made it possible to start an unlimited number of load tests. We never noticed this under normal traffic conditions, but when several hundred new visitors arrived at the same time from habrahabr, and many of them tried to start a free load test, we suddenly found ourselves executing close to 200 load tests at the same time!

Our system was having problems: We had been habr'd (slashdotted) because of an article about how to avoid getting habr'd.

The effect the habrahabr article had on our (concurrent) site visitors

After some frantic bug-hunting, we found and fixed the frontend bug, and things started working much better.

Then we thought "hey, this was somewhat funny". We decided to write a blog article about the load testing people who got overloaded because of an article saying you should use their load testing service to avoid getting overloaded.

I wrote and published that article on our blog yesterday (Thursday).

Today (Friday) habrahabr picked it up and published a link to it. Guess what happened?

Yes, same thing (but a longer spike this time, so more traffic)

 

So today we have achieved something remarkable:

  • Today, we were habr'd because of an article about being habr'd because of an article about how to avoid being habr'd!

 

All programmers out there will love the recursion, I bet.

 

Note: to be honest we didn't have much problem with the traffic today, even though it peaked at more visitors per hour than we normally get in a whole day.

 

 

Habrahabr перегрузка! (Habrahabr overload!)

Russian site habr's Load Impact!

Yesterday, the Russian site habrahabr.ru wrote an article where they warned people about the habr effect (see slashdot effect) and suggested that it was prudent to use Load Impact to load test your site before getting swamped by traffic due to some popular blog or news site (like habrahabr.ru) publishing an article about you.

This, of course, caused us to get habr'd!

We found that our system was suddenly struggling to keep up, and even though we could see that we had a big traffic spike, we didn't at first understand why the machines were having such a hard time. This is what our concurrent (simultaneous) visitor graph for the past week looks like:

Now let's see, is it possible to determine when the habrahabr article was published? Tricky.

As can be seen, our average traffic this past week has been about 30 concurrent visitors, with a max of around 45 users on the site at the same time. When Habrahabr published the article we suddenly got close to 200 concurrent visitors.

Now, our system is designed to handle more visitors than that. We have had some 300 or so concurrent visitors in the past, when articles have been published about us, but it has not caused a very big problem for our servers. Yesterday, everything slowed down to a crawl, which was very strange.

It all turned out to be due to a malfunction in our test queueing system. As each load test can require quite a lot of system resources to run, we have a queueing system that makes sure we don't try to run too many load tests at the same time. Normally, we will only allow about a dozen load tests to run concurrently. But as it turned out, the queueing system was malfunctioning and let visitors run as many load tests as they pleased. Under normal traffic conditions, we didn't notice the problem, but when 200 habrahabr.ru visitors all started load tests at the same time, our system suddenly got quite busy.
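The principle behind such a queue, capping how many jobs run at once and making the rest wait, can be sketched with a semaphore. The Python below is only an illustration and not our actual queueing system: the class name is made up, and the limit of 12 simply stands in for "about a dozen".

```python
import threading
import time

MAX_CONCURRENT_TESTS = 12  # assumption: "about a dozen"


class LoadTestQueue:
    """Lets at most `limit` load tests run at once; the rest wait their turn."""

    def __init__(self, limit=MAX_CONCURRENT_TESTS):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def run(self, load_test):
        with self._slots:  # blocks while `limit` tests are already running
            with self._lock:
                self.running += 1
                self.peak = max(self.peak, self.running)
            try:
                return load_test()
            finally:
                with self._lock:
                    self.running -= 1


if __name__ == "__main__":
    queue = LoadTestQueue()
    fake_test = lambda: time.sleep(0.01)  # stand-in for a real load test
    visitors = [
        threading.Thread(target=queue.run, args=(fake_test,)) for _ in range(200)
    ]
    for thread in visitors:
        thread.start()
    for thread in visitors:
        thread.join()
    print("peak concurrent tests:", queue.peak)
```

With the limit in place, 200 simultaneous requests never push the number of running tests above 12. Without it, all 200 run at once, which is roughly what happened to us.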

At one point there were 180 load tests running at the same time. We were load testing close to 200 sites at once! (That must be some kind of new record.)

Luckily, practically all of these were small (free) tests and our load generator nodes were actually almost idling despite the excessive number of tests running. The loadimpact.com website, however, had problems. Especially the database had problems keeping up with all the writes caused by test results flowing in from so many concurrent load tests.

This situation went on between about 2 pm (Moscow time; no, we're not Russian, but most of the visitors from habrahabr.ru are) and 4 pm. Then we found and fixed the problem with the queueing system, causing the number of running tests to go down to normal levels again. So to any of you out there who tried to use Load Impact or run a load test yesterday between 2 and 4 (noon and 2 pm UTC, or early morning in the US), please excuse us and please try again!

 

Another update

Some people seem to have misunderstood the numbers and what actually happened. I'll try to describe it in other words.

What we initially thought was that we just had a website visitor spike of about 200 concurrent visitors (don't confuse this with visitors per hour, HTTP requests, or visitors per day - see this article for an explanation). We couldn't understand why our system was so slow when it has been designed for up to 300 or maybe even 400 concurrent visitors (10-20x our normal traffic).

As it turned out, it wasn't the number of visitors on our website that caused our system to slow down. It was the number of load tests we were running.

People can run free load tests from our start page, and each free load test we execute means that we start up to 50 concurrent simulated users that access an external website that is to be load tested. Those 50 simulated users might load thousands of objects/resources from the external site, and the load test continuously updates our master database with information about how fast different objects on the external site are delivered to the simulated users. We can get hundreds of such load time results per second from a single load test, all of which go into the database.

Normally, we allow about a dozen concurrent load tests, but in this case a software bug made it possible to start an unlimited number of load tests. As most of the visitors were web developers interested in load testing, most of them started a free load test for their site. This meant that we at one point had about 200 load tests running at the same time, generating probably tens of thousands of database updates per second. This was more than our database server could comfortably handle.
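A quick back-of-the-envelope calculation shows why the database struggled. The Python below assumes 100 results per second per load test, the low end of "hundreds"; that figure and the function name are assumptions for the sake of the example, not measured values.

```python
RESULTS_PER_TEST_PER_SECOND = 100  # assumption: low end of "hundreds"


def db_writes_per_second(concurrent_tests, per_test=RESULTS_PER_TEST_PER_SECOND):
    """Estimate database updates per second for a number of running load tests."""
    return concurrent_tests * per_test


normal = db_writes_per_second(12)     # the usual cap of about a dozen tests
incident = db_writes_per_second(200)  # roughly what ran during the spike

print("normal load:  ", normal, "writes/s")
print("during spike: ", incident, "writes/s")
print(f"that is about {incident / normal:.0f}x the normal write load")
```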

Like everyone else, we have to judge what performance levels our systems should be able to handle and build things accordingly. We try to make sure our system can handle at least 10 times the normal average traffic, which usually makes us able to handle a Habr or Slashdot effect. In this case, though, a silly little bug killed one of our most basic performance-protecting features, which put a spanner in the works, so to speak.