Optimizing Page Load Time

It is widely accepted that fast-loading pages improve the user experience. In recent years, many sites have started using AJAX techniques to reduce latency. Rather than making a round trip to the server for a completely new page on every click, the browser can often either alter the page's layout instantly or fetch a small amount of HTML, XML, or JavaScript from the server and update the existing page. In either case, this significantly reduces the time between a user's click and the browser finishing rendering the new content.
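
As a rough illustration of that second pattern (a hypothetical sketch only; the URL and element id are placeholders, not part of any particular site), a page can fetch a small HTML fragment and splice it into the current document instead of reloading everything:

function updatePanel() {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/fragment.html', true);  // placeholder URL for a small HTML fragment
  xhr.onreadystatechange = function() {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Update one part of the existing page rather than rendering a new page.
      document.getElementById('panel').innerHTML = xhr.responseText;
    }
  };
  xhr.send(null);
}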

However, for many sites that reference dozens of external objects, the majority of the page load time is spent in separate HTTP requests for images, JavaScript, and stylesheets. AJAX probably could help, but speeding up or eliminating these separate HTTP requests might help more, yet there isn't a common body of knowledge about how to do so.

While working on optimizing page load times for a high-profile AJAX application, I had a chance to investigate how much I could reduce latency due to external objects. Specifically, I looked into how the HTTP client implementation in common browsers and characteristics of common Internet connections affect page load time for pages with many small objects.

I found a few things to be interesting:

Using these, I came up with a model to guesstimate the effective bandwidth of users of various flavors of network connections when loading various object sizes. It assumes that each HTTP request is 500 bytes and that the HTTP reply includes 500 bytes of headers in addition to the object requested. It is simplistic and only covers connection limits and asymmetric bandwidth, and doesn't account for the TCP handshake of the first request of a persistent (keepalive) connection, which is amortized when requesting many objects from the same connection. Note that this is best-case effective bandwidth and doesn't include other limitations like TCP slow-start, packet loss, etc. The results are interesting enough to suggest avenues of exploration but are no substitute for actually measuring the difference with real browsers.
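
To make that concrete, here is a minimal sketch of this kind of calculation. It is my own reconstruction under the assumptions above (500-byte requests, 500-byte reply headers, bandwidth shared evenly across parallel connections, one extra round trip per request when keepalives are off, two connections per hostname), not the simulator that produced the graphs below:

// A rough reconstruction of this kind of model, not the actual simulator.
// It charges each object one round trip, the upload time of its request,
// and the download time of its reply, with link bandwidth shared evenly
// across all parallel connections.
function effectiveBandwidth(objectBytes, connections, keepalive) {
  var downBps = 1500000;    // 1.5Mbit downstream
  var upBps = 384000;       // 384Kbit upstream
  var rtt = 0.1;            // 100ms away
  var requestBytes = 500;   // assumed HTTP request size
  var headerBytes = 500;    // assumed HTTP reply header size

  // Time to fetch one object on one connection: an extra round trip to open
  // the connection when keepalives are off, one round trip for the request
  // itself, plus serialization time upstream and downstream.
  var perObject = (keepalive ? 0 : rtt) + rtt +
    requestBytes * 8 / (upBps / connections) +
    (headerBytes + objectBytes) * 8 / (downBps / connections);

  // All connections fetch in parallel; count only payload bits as useful.
  return connections / perObject * objectBytes * 8;
}

// e.g., many 2KB objects over eight connections (four hostnames, two each):
// effectiveBandwidth(2048, 8, true) vs. effectiveBandwidth(2048, 8, false)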

To show the effect of keepalives and multiple hostnames, I simulated a user on a connection offering 1.5Mbit down/384Kbit up who is 100ms away, with 0% packet loss. This roughly corresponds to medium-speed ADSL on the other side of the U.S. from your servers. Shown here is the effective bandwidth while loading a page with many objects of a given size, with effective bandwidth defined as total object bytes received divided by the time to receive them:

[1.5megabit 100ms graph]

Interesting things to note:

Perhaps more clearly, the following is a graph of how much faster pages could load for an assortment of common access speeds and latencies with many external objects spread over four hostnames and keepalives enabled. Baseline (0%) is one hostname and keepalives disabled.

[Speedup of 4 hostnames and keepalives on]

Interesting things from that graph:

One more thing I examined was the effect of request size on effective bandwidth. The above graphs assumed 500-byte requests and 500 bytes of reply headers in addition to the object contents. How does changing that affect performance for our 1.5Mbit down/384Kbit up user who is 100ms away, assuming we're already using four hostnames and keepalives?

[Effective bandwidth at various request sizes]

This shows that at small object sizes, we're bottlenecked on upstream bandwidth. The browser sending larger requests (such as ones laden with lots of cookies) appears to slow page loads down by up to 40% in the worst case for this user.
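
As a rough illustration of that upstream bottleneck: at 384Kbit/s up, a 500-byte request takes 500 bytes * 8 bits per byte / 384,000 bits per second = about 10ms just to transmit, while a request bloated to 1500 bytes by cookies takes about 31ms, roughly tripling the upload cost paid for every object before any of the reply can arrive.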

As I've said, these graphs are based on a simulation and don't account for a number of real-world factors. But I've unscientifically verified the results with real browsers on a real network and believe them to be a useful gauge. I'd like to find the time and resources to reproduce them using real data collected from real browsers over a range of object sizes, access speeds, and latencies.

Measuring the effective bandwidth of your users

You can measure the effective bandwidth of your site's users relatively easily, and if it is substantially below their available downstream bandwidth, it might be worth attempting to improve.

Before giving the browser any external object references (<img src="...">, <link rel="stylesheet" href="...">, <script src="...">, etc.), record the current time. After the page load is done, subtract that start time from the current time and include the difference in the URL of an image you request from your server.

Sample JavaScript implementing this:

<html>
<head>
<title>...</title>
<script type="text/javascript">
<!--
var began_loading = (new Date()).getTime();

function done_loading() {
 // Request a tiny image, passing the page URL and the elapsed time in
 // seconds as query parameters so they show up in the web server logs.
 (new Image()).src = '/timer.gif?u=' + self.location + '&t=' +
  (((new Date()).getTime() - began_loading) / 1000);
}
// -->
</script>
<!--
Reference any external JavaScript or stylesheets after the above block.
-->
</head>
<body onload="done_loading()">
<!--
Put your normal page content here.
-->
</body>
</html>

This will produce web log entries of the form:
10.1.2.3 - - [28/Oct/2006:13:47:45 -0700] "GET /timer.gif?u=http://example.com/page.html&t=0.971 HTTP/1.1" 200 49 ...

In this case, it shows that loading the rest of http://example.com/page.html took this user 0.971 seconds. If you know that the combined size of everything referenced from that page is 57842 bytes, then 57842 bytes * 8 bits per byte / 0.971 seconds = 476556 bits per second of effective bandwidth for that page load. If this user should be getting 1.5Mbit of downstream bandwidth, there is substantial room for improvement.
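
Automating that calculation is straightforward. Here is a minimal sketch, assuming a Node.js environment, an access log named access.log in the format above, and that you already know the combined referenced bytes for each page (the pageBytes values below are placeholders):

var fs = require('fs');

// Assumed combined size of objects referenced by each page (placeholder data).
var pageBytes = { 'http://example.com/page.html': 57842 };

var lines = fs.readFileSync('access.log', 'utf8').split('\n');
for (var i = 0; i < lines.length; i++) {
  var m = lines[i].match(/GET \/timer\.gif\?u=([^&]+)&t=([0-9.]+)/);
  if (!m) continue;                  // not a timer.gif hit
  var url = m[1];
  var seconds = parseFloat(m[2]);
  var bytes = pageBytes[url];
  if (!bytes || !(seconds > 0)) continue;
  // Effective bandwidth = object bytes * 8 bits per byte / seconds to load.
  console.log(url + ': ' + Math.round(bytes * 8 / seconds) + ' bits/sec');
}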

Tips to reduce your page load time

After you gather some page-load times and effective bandwidth for real users all over the world, you can experiment with changes that will improve those times. Measure the difference and keep any that offer a substantial improvement.

Try some of the following:

The above list covers improving the speed of communication between the browser and the server, and can be applied generally to many sites, regardless of what web server software they use or what language their code is written in. There is, unfortunately, a lot that it doesn't cover.

While the tips above are intended to improve your page load times, a side benefit of many of them is a reduction in the server bandwidth and CPU needed for the average page view. Reducing your costs while improving your user experience seems like it should be worth spending some time on.

-- Aaron Hopkins