Making a mobile connection

I just returned from Breaking Development Conference, an amazing gathering of many of the brightest minds in mobile web development. On the flight home I watched the video ($$) and slides from Rajiv Vijayakumar’s talk on Understanding Mobile Web Browser Performance at Velocity 2011. Rajiv works at Qualcomm where his team has done extensive performance analysis of the Android browser. Some of their findings include:

Android 2.2 has a max of only 4 HTTP connections which limits parallel downloads. (This was increased to 8 in Android 2.3 and 35 in Android 3.1 according to Browserscope.)
It supports pipelining for reduced HTTP overhead.
Android’s cache eviction is based on expiration date. This is a motivation for setting expiration dates 10+ years in the future.
Android closes TCP sockets after 6 seconds of inactivity.

This last bullet leads to an interesting discussion about the tradeoffs between power consumption and web performance.

Radio link power consumption

3G devices surfing the Web (do people still say “surfing”?) establish a radio link to the carrier’s cell tower. Establishing and maintaining the radio link consumes battery power. The following graph from Rajiv’s slides shows power consumption for an Android phone while loading a web page. It rises from a baseline of 200 mA to ~400 mA as the radio link is initialized. After the page is loaded the phone drops to 300 mA while the network is inactive. After 10 seconds of inactivity, the radio link reaches an idle state and power consumption returns to the 200 mA baseline level.

The takeaway from this graph is that closing the radio link sooner consumes less battery power. This graph shows that the radio link continues to consume battery power until 10 seconds of inactivity have passed. The 10 second radio link timer begins once the web page has loaded. But there’s also a 6 second countdown after which Android closes the TCP connection by sending a FIN packet. When Android sends the FIN packet the radio link timer resets and continues to consume battery power for another 10 seconds, resulting in a total of 16 seconds of higher battery consumption.

One of the optimizations Rajiv’s team made for the Android browser running on Qualcomm chipsets is to close the TCP connections after the page is done loading. By sending the FIN packet immediately, the radio link is closed after 10 seconds (instead of 16 seconds) resulting in longer battery life. Yay for battery life! But how does this affect the speed of web pages?

Radio link promotion & demotion

The problem with aggressively closing the phone’s radio link is that it takes 1-2 seconds to reconnect to the cell tower. The way the radio link ramps up and then drops back down is shown in the following figure from an AT&T Labs Research paper. When a web page is actively loading, the radio link is at max power consumption and bandwidth. After the radio link is idle for 5 seconds, it drops to a state of half power consumption and significantly lower bandwidth. After another 12 seconds of inactivity it drops to the idle state. From the idle state it takes ~2 seconds to reach full power and bandwidth.

These inactivity timer values (5 seconds & 12 seconds in this example) are sent to the device by the cell tower and thus vary from carrier to carrier. The “state machine” for promoting and demoting the radio link, however, is defined by the Radio Resource Control protocol with the timer values left to the carrier to determine. (The protocol dubs these timer values “T1″, “T2″, and “T3″. I just find that funny.) If the radio link is idle when you request a web page, you have to wait ~2 seconds before that HTTP request can be sent. Clearly, the inactivity timer values chosen by the carrier can have a dramatic impact on mobile web performance.

What’s your carrier’s state machine?

There’s an obvious balance, sort of a yin and yang, between power consumption and web performance for 3G mobile devices. If a carrier’s inactivity timer values are set too short, users have better battery life but are more likely to encounter a ~2 second delay when requesting a web page. If the carrier’s inactivity timer values are set too long, users might have a faster web experience but shorter battery life.

This made me wonder what inactivity timer values popular carriers used. To measure this I created the Mobile State Machine Test Page. It loads a 1 kB image repeatedly with increasing intervals between requests: 2, 4, 6, 11, 13, 16, and 20 seconds. The image’s onload event is used to measure the load time of the image. For each interval the image is requested three times, and the median load time is the one chosen. The flow is as follows:

choose the next interval i (e.g, “2″ seconds)
wait i seconds
measure t_start
request the image
measure t_end using the image’s onload
record t_end - t_start as the image load time
repeat steps 2-6 two more times and choose the median as the image load time for interval i
goto step 1 until all intervals have been tested

The image should take about the same time to load on every request for a given phone and carrier. Increasing the interval between requests is intended to see if the inactivity timer changes the state of the radio link. By watching for a 1-2 second increase in image load time we can reverse engineer the inactivity timer values for a given carrier.

I tweeted the test URL about 10 days ago. Since then people have run the test 460+ times across 71 carriers. I wrote some code that maps IP addresses to known carrier hostnames so am confident about 26 of the carriers; the others are self-reported. (Max Firtman recommended werwar for better IP-to-carrier mapping.) I’d love to keep gathering data so:

I encourage you to run the test!

The tabular results show that there is a step in image load times as the interval increases. (The load time value shown in the table is the median collected across all tests for that carrier. The number of data points is shown in the rightmost column.) I generated the chart below from a snapshot of the data from Sept 12.

The arrows indicate a stepped increase in image load time that could be associated with the inactivity timer for that carrier. The most pronounced one is for AT&T (blue) and it occurs at the 5 second mark. T-Mobile (yellow) appears to have an inactivity timer around 3 seconds. Vodafone is much larger at 15 seconds. Sprint and ~~AT&T~~ Verizon have similar profiles but the step is less pronounced.

There are many caveats about this study:

This is a small sample size.
The inactivity timer could be affected by other apps on the phone doing network activity in the background. I asked people to close all apps, but there’s no way to verify they did that.
A given carrier might have different kinds of networks (3G, 4G, etc.). Similarly, they might have different inactivity timer values in different regions. All of those different conditions would be lumped together under the single carrier name.

What’s the point?

Hats off to Rajiv’s team at Qualcomm for digging into Android browser performance. They don’t even own the browser but have invested heavily in improving the browser user experience. In addition to closing TCP connections once the page is loaded, they increased the maximum number of HTTP connections, improved browser caching, and more.

I want to encourage this holistic approach to mobile performance and will write about that in more depth soon. This post is pretty technical, but it’s important that mobile web developers have greater insight into the parts of the mobile experience that go beyond HTML and JavaScript – namely the device, carrier network, and mobile browser.

For example, in light of this information about inactivity timers, mobile web developers might choose to do a 1 pixel image request at a set interval that keeps the radio link at full bandwidth. This would shorten battery life, so an optimization would be to only do a few pings after which it’s assumed the user is no longer surfing. Another downside is that doing this would use more dedicated channels at the cell tower, worsening everyone’s experience.

The right answer is to determine what the tradeoffs are. What is the optimal value for these inactivity timers? Is there a sweet spot that improves web performance with little or no impact on battery life? How did the carriers determine the current inactivity timer values? Was it based on improving the user’s web experience? I would bet not, but am hopeful that a more holistic view to mobile performance is coming soon.