I like to make things simple. Ok, if you’ve skimmed ahead you’re already raising an eyebrow. Please bear with me. I promise that I’ll show you another way to easily convince the business to invest in performance.
When the business asks us why we should care about performance we always point them to the research done by the commerce giants. Performance means revenue! Yea!
The benefits to the business don’t have to stop at revenue. Performance can also improve the bottom line. This is especially important in situations where it is harder to prove revenue impacts from performance improvements. Focusing on infrastructure and operational savings can make it easier to convince VIPs to invest in performance (with the added benefit of happier and more productive users).
I’d like to share with you a financial model that I’ve used to show how performance can impact the cost of doing business.
As with all things, there are always footnotes, caveats and provisos.
First, this is just another tool to demonstrate why performance matters – a tool that compares before & after. Using financial modeling can easily lead you into a rathole. There are always details that you will need to defend. Your objective should be to show directionality, not absolute position. (Let the financial experts in your organization compute the actual numbers.)
Second, I’m going to use shortcuts and generalities. This is based on my experience owning a business, my years managing Infrastructure and Operations and the many conversations I’ve had with other managers of I&O. I use these shortcuts to, again, show directionality. Don’t mistake a shortcut for something inferior. To the contrary! If anything, using the shortcuts will give conservative numbers.
The root premise of this financial model is that improving performance per webpage (or per transaction) is generally accomplished by:
Number 3 is often tightly connected with #1 and #2. That is, you are either building out more hardware or you are optimizing what you have. Building out more hardware per interaction doesn’t scale with user growth.
Therefore, this model will work best when you are optimizing backend processes, adding caching layers (back-end, cdn, client) or optimizing user workflows.
In contrast, this model will likely fall down if you try to use it to argue for optimizations such as leveraging the GPU for client rendering or adding WebP support.
Of course, this is just the beginning. There are many other financial models that you should consider. I’ll leave those for another post! For example:
The financial model I use boils down to a basic equation:
Cash_Flow = Capital_Expenses + Operational_Expenses
That is, how much money do you have to spend to buy new hardware (CapEx) and how much money do you have to spend to keep the hardware working and the electricity flowing (OpEx).
Each year in the model we will add new hardware, which will increase our operational costs.
Later on I might use Max PageView as a proxy for load and will compute the peak CashFlow/PV:
(OpEx + CapEx) / Max_PageView
Once we have these three data points projected, it is as simple as comparing different scenarios. Did your improvements slow the rate of new hardware purchased? Reduce the operating costs? What are the projected costs with and without improved performance?
The tricky part is computing the OpEx and CapEx. Here are the equations that I will be using:
CapEx = Number_of_New_Servers * Average_Server_Cost
OpEx = Number_of_Servers * Average_Server_KVA * CoLo_Cost_per_KVA
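As a minimal sketch, the two equations and the cash flow sum translate directly into Python. The function names and the sample numbers are illustrative assumptions, not figures from a real deployment. Pass a colo price that covers the same period as the CapEx you are comparing it with.

```python
def capex(new_servers: int, avg_server_cost: float) -> float:
    """CapEx = Number_of_New_Servers * Average_Server_Cost."""
    return new_servers * avg_server_cost


def opex(servers: int, avg_server_kva: float, colo_cost_per_kva: float) -> float:
    """OpEx = Number_of_Servers * Average_Server_KVA * CoLo_Cost_per_KVA."""
    return servers * avg_server_kva * colo_cost_per_kva


def cash_flow(capital: float, operational: float) -> float:
    """Cash_Flow = Capital_Expenses + Operational_Expenses."""
    return capital + operational


# Hypothetical inputs: 10 new servers at $5,000 each, a fleet of 100 servers
# drawing 0.5 KVA each, and a colo price of $0.50 per KVA.
capital = capex(10, 5000.0)
operational = opex(100, 0.5, 0.50)
print(capital, operational, cash_flow(capital, operational))
```

Keeping the three pieces as separate functions makes it easy to swap in a different OpEx formula later without touching the rest of the comparison.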
You’re probably thinking about a million variables and inputs I should be using in the numbers above. As I mentioned above, I’m going to stick with generalizations and avoid all the particulars. However, one thing I’m going to stress is that I’m avoiding all the funny numbers – soft costs, contract renegotiations, etc. This will allow you to bypass a 7-week discussion with your procurement team about the true cost of your enterprise agreement.
For these reasons, I will intentionally avoid:
(This is truest in funny money. Users and staff will be as productive with the time available. You will not get this money back – but you might be able to invest this time in other activities.)
(the true cost of an enterprise agreement could power an improbability drive – calculating the savings will require the power from a small star)
Capital Expenditure is the easiest item to calculate.
We want to make sure we are just capturing the cost of procuring the hardware and getting it installed. Once procured, it is a “Sunk Cost” and can’t be recouped. A more sophisticated model could turn this into an amortization schedule – but we want to show Cash Flow impacts instead.
CapEx = Number_of_New_Servers * Average_Server_Cost
To be clear, when we talk about the Average Cost of hardware we should think of it in two ways:
Some good numbers I’ve used are:
Virtual Servers also require CapEx. You have two options. One is to translate the number of VMs per server into actual hardware (don’t forget to factor in a vMotion buffer). The other, if your IT group has the cost already computed, is to use the cost per VM. Bottom line: be consistent.
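The first option can be sketched as below. This is a rough illustration: `vms_per_host` and `spare_hosts` are hypothetical parameters, and treating the vMotion buffer as whole spare hosts is an assumption rather than the only way to size it.

```python
def hosts_for_vms(vms: int, vms_per_host: int, spare_hosts: int = 1) -> int:
    """Translate a VM count into physical hosts, plus spare hosts for migrations."""
    hosts = -(-vms // vms_per_host)  # ceiling division without floats
    return hosts + spare_hosts


def vm_capex(new_vms: int, vms_per_host: int, avg_host_cost: float,
             spare_hosts: int = 1) -> float:
    """CapEx for new VMs, expressed through the hardware underneath them."""
    return hosts_for_vms(new_vms, vms_per_host, spare_hosts) * avg_host_cost


# Hypothetical: 100 new VMs, 12 VMs per host, one spare host, $8,000 per host.
print(hosts_for_vms(100, 12), vm_capex(100, 12, 8000.0))
```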
There are many ways to calculate the cost of operations for your application. The easiest way is to look at it in an aggregate view. That is, how many servers in total are used to deliver your app – regardless of the role.
The assumption we start with is that your current infrastructure is necessary to deliver the current level of performance. Increased user traffic will likewise require a proportional increase in your infrastructure.
Most Co-Location providers these days use a simple billing model of charging only for energy used. The beauty of this model is that it usually includes everything you need for hosting as well. Functionally you can assume for the price of energy you get all the cooling, floor space and bandwidth you need.
Of course, each datacenter is a special snowflake. Don’t get bogged down in the details and keep the formula general. It is better to underestimate than to overestimate or worse, spend 6 weeks and arrive at a similar number.
The equation works out to:
OpEx = Number_of_Servers * KVA_per_Server * KVA_Price
The KVA per server can be the most challenging to calculate. Some hardware manufacturers provide a power calculator for server configurations. Many provide a range of potentials. Here is my recommendation:
KVA = Watts / 900
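In code the conversion is a one-liner. One way to read the divisor (an assumption, not something stated above) is that it converts watts to kilowatts and assumes a power factor of roughly 0.9. The wattage in the example is hypothetical.

```python
def watts_to_kva(watts: float, divisor: float = 900.0) -> float:
    """KVA = Watts / 900, the rule of thumb from the text."""
    return watts / divisor


# Hypothetical server drawing 450 W under load.
print(watts_to_kva(450))
```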
As I mentioned, most colo providers charge by electricity used and bundle all the other amenities into this price. Like all colo solutions there is a range of offerings from high-end ($0.70/KVA/mo) to low-end ($0.20/KVA/mo). In my experience, I’ve found that the cost to run your own datacenter can be pretty close to the average cost of renting colo space.
You’ll probably have a hard time getting procurement to offer up the price you pay for colo, so to save you the time I’d recommend using a number around $0.50/KVA/mo. This should also be sufficiently padded to account for any MPLS lines or dedicated circuits that your data center might need.
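If you can’t get the real price, a quick sensitivity check across the quoted range shows how much the choice matters. This is only a sketch: the prices are the three figures quoted above, while the fleet size and KVA per server are hypothetical.

```python
# Colo prices quoted in the text, in dollars per KVA per month.
COLO_PRICES = {"low-end": 0.20, "recommended": 0.50, "high-end": 0.70}


def monthly_opex(servers: int, kva_per_server: float, kva_price: float) -> float:
    """OpEx = Number_of_Servers * KVA_per_Server * KVA_Price."""
    return servers * kva_per_server * kva_price


def opex_by_price(servers: int, kva_per_server: float) -> dict:
    """Monthly OpEx for the same fleet at each quoted price point."""
    return {label: monthly_opex(servers, kva_per_server, price)
            for label, price in COLO_PRICES.items()}


# Hypothetical fleet: 200 servers at 0.5 KVA each.
for label, cost in opex_by_price(200, 0.5).items():
    print(f"{label}: ${cost:,.2f}/mo")
```

Whichever price you pick, use the same one in every scenario, since the goal is to compare before and after rather than to pin down the absolute number.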
So far this model assumes you own or lease your infrastructure. However, if you use IaaS this model will fall short since you don’t own capital and it is pure operational expenses. The tricky part is that the cost of operations is not based on hardware procured but based on utilization. Savings can still be realized and modeled but it requires a slightly different formula and ultimately requires more insight into your cost of operations. This is worthy of a different talk.
A useful model to measure user load on a system is to look at the maximum Page Views per second. Consider a Page View to be any request that returns Content-Type: text/html. The principle is that each user ‘interaction’ or ‘transaction’ with your website will return html.
Using PageView isn’t always perfect – especially for single-URL apps. The goal is to find a metric that everyone can agree on and that consistently represents the volume of user activity on your site.
Your current configuration of infrastructure is designed to meet a peak in volume of traffic. That is, you have built it to sustain the peak traffic throughout the year and get by with the least number of user complaints. This peak could be Black Friday or it could be annual performance review time.
Using the maximum page view per second will tell us how much money is spent to maintain this peak traffic:
Interaction_Cost = (OpEx + CapEx) / Page_View_per_Second
Each year, you expect to grow the business. As you grow, you will build more hardware in lock step. If you do nothing to improve the performance, then you should expect the Interaction Cost to remain constant, year over year.
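A small projection makes the “remains constant” claim concrete. This is a sketch under stated assumptions: all inputs are hypothetical, servers and traffic grow at exactly the same rate, and fractional servers are allowed so rounding doesn’t blur the result.

```python
def interaction_cost(opex, capex, page_views_per_second):
    """Interaction_Cost = (OpEx + CapEx) / Page_View_per_Second."""
    return (opex + capex) / page_views_per_second


def project_interaction_cost(servers, peak_pv, growth, years,
                             server_cost, opex_per_server):
    """Grow traffic and servers in lock step; return one cost per year."""
    costs = []
    for _ in range(years):
        new_servers = servers * growth      # hardware bought this year
        servers += new_servers              # fleet after the build-out
        peak_pv *= 1 + growth               # traffic grows at the same rate
        capex = new_servers * server_cost
        opex = servers * opex_per_server
        costs.append(interaction_cost(opex, capex, peak_pv))
    return costs


# Hypothetical: 100 servers, 1,000 peak PV/s, 20% yearly growth,
# $5,000 per new server and $300 yearly OpEx per server.
costs = project_interaction_cost(100, 1000.0, 0.20, 3, 5000.0, 300.0)
print([round(c, 2) for c in costs])
```

Every projected year comes out at the same cost per peak page view, which is the baseline that a performance improvement has to beat.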
Let me share an example use of this model – based on real life events.
In this example, let’s assume a retailer, with this configuration:
The problem is that the home page and one key category page account for 40% of the site traffic – all of which cannot be cached by the local varnish or cdn layers and must go back to the datacenter. This is because:
You’re probably shaking your head. I know. But this kind of web application is all too common.
Even a small TTL (e.g., 1 min) will increase the offload from the datacenter by 40%. Assuming we don’t turn off any previously commissioned hardware, we can delay expanding the data center footprint by one year!
The results are more than enough to justify the cost of investment. The best part is that these projected savings don’t include any costs by the infrastructure teams to maintain the growing footprint. Your I&O teams will approve!
Plugging the numbers in, we can see the cash spent this year (year 0) and the projected cash flow for next year if we don’t make the changes. In contrast, we have now shown how our performance improvement will impact the total cost of ownership.
Looking at it another way, we can see that the cost per interaction also decreases.
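The comparison can be sketched in a few lines of Python. Every input below is hypothetical (the retailer’s real configuration is not reproduced here), and the sketch assumes the required number of servers scales linearly with the traffic that reaches the datacenter.

```python
# All inputs are hypothetical placeholders, not the retailer's real numbers.
SERVERS = 100                        # servers in production today (year 0)
PEAK_PV = 1000                       # peak page views per second today
PV_PER_SERVER = PEAK_PV // SERVERS   # capacity implied by today's fleet
GROWTH_PCT = 20                      # expected yearly traffic growth
OFFLOAD_PCT = 40                     # traffic the new TTL keeps off the datacenter
SERVER_COST = 5000                   # CapEx per new server
OPEX_PER_SERVER = 300                # yearly OpEx per server


def next_year(offload_pct):
    """Return (cash_flow, cost_per_peak_pv) projected for next year."""
    peak_pv = PEAK_PV * (100 + GROWTH_PCT) // 100
    origin_pv = peak_pv * (100 - offload_pct) // 100
    needed = -(-origin_pv // PV_PER_SERVER)     # ceiling division
    new_servers = max(0, needed - SERVERS)      # never decommission hardware
    fleet = SERVERS + new_servers
    cash_flow = new_servers * SERVER_COST + fleet * OPEX_PER_SERVER
    return cash_flow, cash_flow / peak_pv


baseline = next_year(0)
with_caching = next_year(OFFLOAD_PCT)
print("without caching:", baseline)
print("with caching:   ", with_caching)
```

With these placeholder inputs the cached scenario buys no new hardware next year, so both the projected cash flow and the cost per interaction drop. That is the direction the model is meant to show, not an exact figure.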
The example I gave shows how increasing caching impacts operations but it doesn’t stop there. Any change you make that makes applications more efficient, takes advantage of caches, reduces number of requests or makes requests smaller will impact the cost to the business. Using this simple model we can project the financial impact of those performance improvements.
This is only the beginning. I believe that there are many other financial models that can be used to help convince the business that performance matters!
A copy of my financial model can be found on Google Docs.