Slow sites, slow business

Many people today focus too narrowly on preventing service downtime in their QA efforts. It is still not uncommon to find managers, or even testers, who think that actual downtime (as in 100% service interruption) is the only thing that will noticeably hurt business.

Well, lately there has been a lot of data piling up that disputes this, and shows us that performance is pretty important too. A slow site means lost revenue due to users disappearing to other sites. As quality of Internet services in general is slowly rising, Internet users are getting less and less tolerant of slow sites and services.

Velocity 09

At this year's Velocity conference, Shopzilla, Google and Bing all reported findings that show how slowness can affect business. Shopzilla (presentation video, slides) had carried out a major performance redesign of their site that improved page load times from about 7 seconds to about 2 seconds. This resulted in a 25% increase in pageviews and a 7-12% increase in revenue.

Bing and Google (presentation video, slides) experimented the other way - they introduced delay to see what would happen. Bing found that slowing down searches by 2 seconds means 4.3% lower revenues, while Google saw a 400ms added search delay result in a drop in traffic that was just below 1%.

There are two interesting things to note about the Google findings. The first is that they noted that traffic remained lower even after the delay was removed, which implies that some users who experience bad performance might go away never to come back.

Earlier studies

Another interesting thing to note is that Google ran a similar experiment around 1999/2000, recounted by Google's Marissa Mayer first at the Seattle conference on scalability 2007 (video), and then also at Velocity 09 (video). She let a test group get 30 search results per page instead of the normal 10 in a Google search. Those 20 extra search results meant that the page loaded in 0.9 seconds instead of the usual 0.5 (i.e. also about a 400ms slowdown). This resulted in an astounding 25% decrease in usage of the service.

It is not clear why the decrease was so dramatic then, while the 400ms delay added in the later experiment only resulted in a <1% dropoff. The earlier result may also have been affected by things such as lower user satisfaction with the page layout/usability due to the extra information on the page. One other factor that just struck me as a possible contributor is brand recognition. Google wasn't very well known in 1999/2000, and users may have less patience when testing a new and unknown search engine than when using a service that they have already been using daily for years and have gotten to know and trust. If that is in any way true, it means that new brands and services have even more to gain from testing and optimizing performance.

But wait, there is more!

Strangeloop Networks recently ran an A/B test in which some website visitors were served an unoptimized version of the website, while others got an optimized version. They found that the optimized version of the site exhibited a 16% higher conversion rate and a 5.5% higher average order value than the unoptimized version. Here is the article.

Finally, I want to recommend this nice article: how slow websites impacts visitors and sales by the hosting provider Peer1. The article is full of statistics and references and argues, among other things, that users are more impatient and want faster sites today than they did 10 years ago. It suggests a mathematical model for calculating visitor loss as a function of page load time. Read it.

 

 

Introducing week passes

Buy 7-day premium accounts

Load Impact is proud to offer the week pass, which allows you to buy premium accounts for just one week. This means we now have three different options:

  • the day pass - this gives you 24 hours of access to a premium account level
  • the week pass - this gives you 7 days of access to a premium account level
  • the monthly subscription - this gives you 1 month's access to a premium account level and renews itself automatically unless cancelled

The prices are such that the more time you buy, the less expensive it gets per hour. There are three premium account levels you can purchase:

  1. Load Impact BASIC
    Allows you to run tests with up to 250 concurrent simulated clients
  2. Load Impact PROFESSIONAL
    Allows you to run tests with up to 1000 concurrent simulated clients
  3. Load Impact ADVANCED
    Allows you to run tests with up to 5000 concurrent simulated clients

 

For more information about account levels, go to our products page or see our FAQ entry: What do I get for free and what do I get if I buy a premium account?

 

Monthly visits and concurrent users

To determine how much traffic (how many visitors) your site can handle, you usually run a load test that simulates a certain number of users accessing your site at the same time, then reports how fast your site serves web pages to those simulated users. Many people run into a problem when they want to do this: most, if not all, load testing systems want you to specify how many concurrent simulated users should be used in the load test, but most people only know how many visitors their website has per day or per month. The question that comes up is then "how do I convert visits per month to concurrent users?"

If we start with the source of information about site visits, Google Analytics is very popular as a means of keeping track of website traffic. It shows you lots of interesting information about your visitors and what they do on your site, and is used by a lot of site administrators today. Because of this, I will use Google Analytics in my examples below.

One thing you can see on the Google Analytics dashboard is the number of visits your site has had in the last month ("Visits"). This is of course interesting as it gives you an idea of how much traffic your webserver has to handle in a month. However, it doesn't tell you the peak traffic load your server has to handle at some specific point during the month. Most people will want to make sure their server can handle the peak traffic load (and more) so as not to risk losing business because of slow page load times.

Peak traffic can vary a lot between days, depending on many different circumstances. To view the number of visits per day in Google Analytics, you just click the "Visits" link on the dashboard (circled on the image above). For our own site - loadimpact.com - we see that high-traffic days (for us typically Tuesday-Thursday) see 100-200% more traffic than low-traffic days (typically Saturday and Sunday). Your mileage may vary, of course. It all depends on the type of site and the type of users. Below we have selected a 7-day period and clicked on the "Visits" link for our example site.

We can see in this example that on Monday, January 26, the site had a traffic peak with 622 visits.

If you find a single day where you have a lot of traffic, you can then proceed to find out something that is even more interesting - namely what the peak traffic is for any single hour that day. This is a bit more complex in Google Analytics, however - you have to create a custom report. The custom report interface is available through a link in the left hand menu:

 

When you're in the custom report interface you have to create a new custom report, then select what you want your report to contain. You can specify different Metrics and Dimensions to use in the report. We will specify a single Metric and a single Dimension in our report. The Metrics and Dimensions are grouped in different categories. The categories we will use are "Site Usage" and "Visitors". You can see them circled on the image below.

 

First, we click on the Metric category "Site Usage", which brings up a range of metrics we can use. Scroll down until you find the "Visits" metric, then drag it over to the empty metric box to the right.

Then close the "Site Usage" metric category and instead open the "Visitors" dimension category. Under that category you'll find a dimension called "Hour of the day". Drag this dimension over to the empty dimension box to the right.

 

Now you have a report that can show you the number of visits distributed over the different hours of the day, for the time period you have selected. Save the report and try viewing it.

Generating this report for the day that had the highest number of visitors will likely show you the peak number of visits per hour for your site. Maybe you will see that at 3pm on that day, you had 1,000 visits. This value, plus a little margin, is very possibly the target traffic level you want your site to be able to support (while still responding reasonably fast, so users don't have to wait for pages to load).

On the screenshot above we can see that on the day selected, we got the most visits at 1600 hours (4:00 pm to 4:59 pm). This is the traffic for the peak hour of the peak day, so it is a fairly good approximation of the maximum traffic our site has seen.

When you have determined what number of visits per hour you want your site to be able to handle, you should of course test if your site can handle that level of traffic (with reasonably fast page load times). To do this, you have to run a load test. The problem then is that most, if not all, load testing systems will want you to specify how many concurrent ("simultaneous") users you want to subject your site to. They don't talk about users or visits per month, per day, or even per hour. So how do we translate those figures to concurrent users?

Converting visits per hour to concurrent users:

concurrent_users = (hourly_visits * time_on_site) / 3600

So, if you have 1,000 visits per hour, and each visitor stays on your site for 3 minutes (180 seconds) on average, that means you would have (1000 * 180) / 3600 = 50 concurrent users.
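The formula above can be sketched in a few lines of Python. The function and parameter names are our own choice for illustration. The arithmetic is exactly the formula given, with time on site expressed in seconds.

```python
def concurrent_users(hourly_visits, time_on_site_seconds):
    """Estimate concurrent users from visits per hour and average time on site.

    3600 is the number of seconds in an hour, as in the formula above.
    """
    return (hourly_visits * time_on_site_seconds) / 3600


# The example from the text: 1,000 visits per hour, 3 minutes (180 s) per visit.
print(concurrent_users(1000, 180))  # 50.0
```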

Note that you need to use "visits", and not "unique visitors", when calculating the number of concurrent users. A single physical person may visit a site twice during an hour, which will of course cause twice the load on the web server. So we want to count the number of times the site has been visited, and not the number of unique persons that visit the site.

Time on site

The time on site parameter is the average time a user spends on the site. This is also something Google Analytics will tell you. It is the "Avg. Time on Site" value shown on the dashboard.

To translate visits per month to concurrent users:

concurrent_users = (monthly_visits * time_on_site) / (3600 * 24 * 30)

So, you just divide by the number of seconds in a month, rather than by the number of seconds in an hour as before.
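As a rough sketch, the monthly variant looks like this in Python. The names are hypothetical, and the 720,000 visits figure is made up purely to give round numbers. Keep in mind that dividing by a whole month gives the average concurrency across the month, so peak hours will sit above this value, which is why the peak-hour figure is the better input for a load test.

```python
SECONDS_PER_DAY = 24 * 3600


def concurrent_users_from_monthly(monthly_visits, time_on_site_seconds, days_in_month=30):
    """Average concurrent users over a month (30 days assumed, as in the formula)."""
    return (monthly_visits * time_on_site_seconds) / (SECONDS_PER_DAY * days_in_month)


# Made-up example: 720,000 visits in a month, 180 seconds per visit.
print(concurrent_users_from_monthly(720_000, 180))  # 50.0
```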

Now, when you know how many concurrent users you want your site to be able to handle, you can set up your load test. If, for example, you calculate that your site experiences a maximum of 50 concurrent users, but you want to test that the site can handle occasional peaks of, say 50% more traffic, then you want to verify that your site can handle up to 75 concurrent users. A reasonable load test setup might then be a ramp-up load test that tests four different load levels, starting at 25 concurrent simulated users, ramping up to 50, 75 and finally 100 concurrent users. That test would show you what response times (page load times) users would experience in low (25 users) and high (50 users) traffic conditions. The load levels 75 and 100 simulated users would also show what happens when the site/service grows or you see an exceptional burst of traffic due to some external event (maybe a big news blog writing an article about your site, generating a lot of extra traffic).
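The load levels in that example can be derived mechanically from the calculated peak. Here is a minimal sketch, assuming the four levels are 50%, 100%, 150% and 200% of the peak. The multipliers and function names are our own illustration and not a setting in any particular load testing tool.

```python
import math


def target_with_margin(peak_users, margin=0.5):
    """Peak concurrent users plus a safety margin (0.5 means 50% extra)."""
    return math.ceil(peak_users * (1 + margin))


def ramp_levels(peak_users, multipliers=(0.5, 1.0, 1.5, 2.0)):
    """Load levels for a ramp-up test, as fractions of the calculated peak."""
    return [math.ceil(peak_users * m) for m in multipliers]


print(target_with_margin(50))  # 75
print(ramp_levels(50))         # [25, 50, 75, 100]
```

Rounding up with `math.ceil` is a deliberate choice here: if the peak does not divide evenly, it is safer to test with one user too many than one too few.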

 

Thanks to Ditlev at VPS.NET who gave us the idea for this article. He has also written one of his own, which you should look at if none of this made any sense. It is available here.

 

50,000 load tests!

Load Impact passes 50,000 executed load tests

When we launched Load Impact at the beginning of 2009, we could never have dreamed that the service would be so well received and so widely used. At the time of writing, 51,623 load tests have been executed on loadimpact.com, and that number increases at a rate of over 200 tests per day.

In total, we have made over a billion successful HTTP GET requests since we went live, fetching over a million unique URLs.

Our unique and brand new web page analyzer is also quickly becoming popular, with almost 100 analyses run per day, and 5,000 analyses in total so far. Keep an eye on this analyzer in the near future - it is already quite possibly the best one you will find anywhere, but we have plans for some additional major improvements that will make it simply outstanding.

All in all, we are incredibly happy with how we have been received by the Internet community since launch, and we hope that you will continue using Load Impact for your load- and performance testing needs in the future. We urge all our users to get in touch with us and tell us what they like or dislike about the service, so that we may make it even better.

  /The Load Impact team

Mysterious Glogotypes

Something strange about Google's logotype

Google's front page must be the most frequently loaded webpage on the Internet today. With Google's heavy focus on performance issues, one would suspect that this page has been obsessively optimized. If your page is loaded hundreds of millions of times per day, not only is there actual money to be saved by eliminating a single byte that needs transferring, but the small difference in user retention that a microscopic speedup in page load time can bring is also worth a lot of trouble. Marissa Mayer said some interesting things about how important page load time is for Google, in her 2007 presentation at the Seattle conference on scalability. It is a 1-hour presentation but well worth watching. If you don't feel like watching a long video presentation, you can check out Steve Souders' recent article Business impact of high performance.

Anyway, when testing our spanking new web page analyzer recently, we noticed that when we analyzed www.google.com and had our analyzer emulate different user agents, the Google logotype image on that page came in different versions, depending on what user agent we pretended to be. Nothing really surprising about this fact. Older web browsers don't support new image formats, for example, so sending the logotype as a GIF image makes sense when Google detects that an older (or unknown) browser is used, while a PNG image might be appropriate for a newer browser.

But, why on earth chop up the GIF logotype into four different pictures?

To someone who isn't interested in performance issues, this may seem like the most boring non-question since... well, ever actually. But to us others it is a bit perplexing. Or is it?  You be the judge. This is what you get when you load www.google.com with Firefox 3.5.

The image is clickable if you want to view the result interactively and learn more. It shows the page load diagram for www.google.com in our page analyzer, emulating Firefox 3.5 as the user agent. The first thing that happens is that we get redirected to www.google.se, because our request originated from an IP address Google correctly identified as Swedish. Then we load www.google.se/ (the HTML) and finally the Google logotype, which is called logo_plain.png and is 7.4KB in size. A single PNG image in this case. You can see this PNG being rendered in the bottom right corner of the screenshot.

 

Now, if we try and change our user agent emulation and instead tell Google who we really are (the "Load Impact Page Analyzer"), we get something different:

Look at this - Google sends us the logotype as a GIF, chopped up into four different images. The images are called hp0.gif, hp1.gif, hp2.gif and hp3.gif. On the screenshot below the hp0.gif image is shown on the lower right. You can see that it is the "Goo" part of the Google logo.
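If you want to try this kind of comparison yourself without a page analyzer, a minimal Python sketch using only the standard library could look like the one below. It is not how our analyzer works internally. The function names are hypothetical, and it only finds images referenced through `<img>` tags, so a logo delivered via CSS would be missed.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class ImageCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_sources(html):
    """Return the image URLs found in an HTML string, in document order."""
    parser = ImageCollector()
    parser.feed(html)
    return parser.sources


def fetch_html(url, user_agent):
    """Fetch a page while presenting the given User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Offline illustration with a hand-written snippet (not real Google markup):
sample = '<img src="hp0.gif"><img src="hp1.gif"><p>no image here</p>'
print(image_sources(sample))  # ['hp0.gif', 'hp1.gif']
```

To compare user agents, call `fetch_html` twice on the same URL with two different User-Agent strings of your choosing and pass each result to `image_sources`. What a site returns today may of course differ from what we saw when this was written.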

 

Why do they do this? Compatibility-wise it makes sense to send GIF images when you don't recognize the user agent, because we might be using some old browser that can't support e.g. PNG images. GIF is probably the most widely supported image file format in existence. But why not send a single GIF?  Why the four different parts?

Is it performance-related? Using more than one TCP connection to fetch objects can result in performance gains if there are many objects to fetch, as more objects can be requested concurrently. With HTTP pipelining this might be less of an issue, but older browsers that can only speak HTTP/1.0 do not support pipelining. On the other hand, the total number of objects on the Google front page is small. The objects themselves are small in size too. Old browsers might not use more than two concurrent connections. Maybe they will only use one. In that case, dividing an image into several parts is most likely bad for performance.

Also, if it makes sense from a performance, or any other, perspective to divide the logo into four separate images, why isn't logo_plain.png delivered as four separate images too, when we use Firefox 3.5 to retrieve the page? Firefox 3.5 is definitely multi-connection capable - actually, one could call it a multi-connection fetishist, just like most modern browsers. The same question applies if the split has something to do with Google's need to frequently change its logo (i.e. if it wants part of it to be cached on the client side, while still being able to alter other parts of it). Why not do that for all modern browsers too?

Then we started testing other Google sites. google.com (without the redirect to google.se), google.cn and google.jp, for instance, all deliver a single GIF image rather than a PNG image, no matter what browser emulation we use. Why is that? Why do we get PNGs here in Sweden, while Americans and Asians get GIFs?

We asked a very technically knowledgeable Google employee about all this, but he had no clue as to the reason, so now we're down to guessing. There must be some kind of strategy behind the decision to serve certain browsers a 4-GIF version of the logo, others a single GIF and yet others a PNG, but what is it? If it had been any site but Google, we would have guessed it was an artifact of how the different logotypes (different countries and regions have different Google logotypes on their localized Google sites) are manufactured, but considering how much effort Google spends on performance optimization, it just doesn't seem likely that this is the cause here. It should not be all that difficult to convert all logotype images to the same format, if it made sense from a performance and compatibility perspective.

Can someone who knows please help shed some light on this mystery?

 
