Caveat: this is a Blink / Chrome view of the world. Most of the main thread tasks are “shared” in some form by all vendors, like layout or style calcs, but this overall architecture may not be.
It really does, so let’s start with one of those:
That’s a lot of content in a small space, so let’s define things a little more. It can be helpful to have the diagram above alongside these definitions, so maybe fire that image up next to this post or, for retro-old-skool points, you could, you know, print it out. Sorry. Forget I mentioned it… Sorry.
Let’s start with the processes:
Now let’s look at the threads in the Renderer Process.
In many ways you should consider the Compositor Thread as the “big boss”. While it doesn’t run the JavaScript, Layout, Paint or any of that, it’s the thread that is wholly responsible for initiating main thread work, and then shipping frames to screen. If it doesn’t have to wait on input event handlers, it can ship frames while waiting for the Main thread to complete its work.
You can also imagine Service Workers and Web Workers living in this process, though I’m leaving them out because it makes things way more complicated.
Let’s step through the flow, from vsync to pixels, and talk about how things work out in the “full-fat” version of events. It’s worth remembering that a browser need not execute all of these steps, depending on what’s necessary. For example, if there’s no new HTML to parse, then Parse HTML won’t fire. In fact, oftentimes the best way to improve performance is simply to remove the need for parts of the flow to be fired!
It’s also worth noting those red arrows just under styles and layout that seem to point towards requestAnimationFrame. It’s perfectly possible to trigger both by accident in your code. This is called Forced Synchronous Layout (or Styles, depending), and it’s often bad for performance.
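To make the Forced Synchronous Layout point concrete, here is a minimal sketch that simulates it rather than measuring a real browser. FakeDocument and FakeElement are hypothetical stand-ins, not browser APIs: a write marks layout as dirty, and a layout read while dirty counts as one forced layout. Under that assumption, interleaving reads and writes forces layout repeatedly, while doing all the reads first and all the writes afterwards does not.

```typescript
// Simulated document: tracks whether layout is dirty and how many times a
// read forced layout to run early. A hypothetical stand-in, not a browser API.
class FakeDocument {
  layoutDirty = false;
  forcedLayouts = 0;
}

class FakeElement {
  private doc: FakeDocument;
  private width: number;

  constructor(doc: FakeDocument, width: number) {
    this.doc = doc;
    this.width = width;
  }

  // Models reading el.offsetWidth: if a write is pending, layout must run now.
  readOffsetWidth(): number {
    if (this.doc.layoutDirty) {
      this.doc.forcedLayouts += 1;
      this.doc.layoutDirty = false;
    }
    return this.width;
  }

  // Models writing el.style.width: the change invalidates layout.
  writeWidth(px: number): void {
    this.width = px;
    this.doc.layoutDirty = true;
  }
}

// Read, write, read, write: every read after the first follows a write.
function halveInterleaved(els: FakeElement[]): void {
  for (const el of els) {
    el.writeWidth(el.readOffsetWidth() / 2);
  }
}

// All reads first, then all writes: no read ever sees dirty layout.
function halveBatched(els: FakeElement[]): void {
  const reads = els.map((el) => ({ el, width: el.readOffsetWidth() }));
  for (const { el, width } of reads) {
    el.writeWidth(width / 2);
  }
}

function makeElements(doc: FakeDocument, widths: number[]): FakeElement[] {
  return widths.map((w) => new FakeElement(doc, w));
}

const interleavedDoc = new FakeDocument();
halveInterleaved(makeElements(interleavedDoc, [100, 200, 300]));
console.log(interleavedDoc.forcedLayouts); // 2

const batchedDoc = new FakeDocument();
halveBatched(makeElements(batchedDoc, [100, 200, 300]));
console.log(batchedDoc.forcedLayouts); // 0
```

In a real page the batched writes would typically sit inside a requestAnimationFrame callback, leaving the browser to run style and layout once, later in the frame.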
Input event handlers (such as touchmove, scroll, click) should fire first, once per frame, but that’s not necessarily the case; a scheduler makes best-effort attempts, the success of which varies between Operating Systems. There’s also some latency between the user interaction and the event making its way to the main thread to be handled.

requestAnimationFrame. This is the ideal place to make visual updates to the screen, since you have fresh input data, and it’s as close to vsync as you’re going to get. Other visual tasks, like style calculations, are due to come after this task, so it’s ideally placed to mutate elements. If you mutate, say, 100 classes, this won’t result in 100 style calculations; they will be batched up and handled later. The only caveat is that you don’t query any computed styles or layout properties (like getComputedStyle(el).backgroundImage or el.offsetWidth). If you do you’ll bring recalc styles, layout, or both, forward, causing forced synchronous layouts or, worse, layout thrashing.

… appendChild.

… will-change, overlapping elements, and any hardware accelerated canvases.

requestIdleCallback can fire. This is a great opportunity to do non-essential work, like beaconing analytics data. If you’re new to requestIdleCallback, there is a primer for it on Google Developers that gives a bit more of a breakdown.

There are two versions of depth sorting that crop up in the workflow.
Firstly, there are Stacking Contexts, like if you have two absolutely positioned divs that overlap. Update Layer Tree is the part of the process that ensures that z-index and the like are heeded.
Secondly, there are Compositor Layers, which come later in the process and apply more to the idea of painted elements. An element can be promoted to a Compositor Layer with the null transform hack, or will-change: transform, after which it can be transformed around the place cheaply (good for animation!). But the browser may also have to create additional Compositor Layers to preserve the depth order specified by z-index and the like if there are overlapping elements. Fun stuff!
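As a rough illustration of the will-change: transform route, here is a small sketch. The Promotable interface and the two helper names are assumptions made up for this example. A real HTMLElement should satisfy the interface structurally, but whether the browser actually creates a Compositor Layer is its own decision, and this code cannot guarantee it.

```typescript
// Minimal structural type: only the two style properties this sketch touches.
// Hypothetical name; a plain object with the same shape also fits.
interface Promotable {
  style: { willChange: string; transform: string };
}

// Hint that transform is about to change, which is the signal the text
// describes for promoting an element to its own Compositor Layer.
function hintLayerPromotion(el: Promotable): void {
  el.style.willChange = "transform";
}

// Move the element by changing transform only, leaving layout properties
// such as top and left untouched.
function moveTo(el: Promotable, x: number, y: number): void {
  el.style.transform = `translate(${x}px, ${y}px)`;
}
```

A typical use would be to call hintLayerPromotion once, ahead of the animation, and then call moveTo from each requestAnimationFrame callback.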
Virtually all of the process outlined above is done on the CPU. Only the last part, where tiles are uploaded and moved, is done on the GPU.
On Android, however, the pixel flow is a little different when it comes to Rasterization: the GPU is used far more. Instead of Compositor Tile Workers doing the rasterization, the draw calls are executed as GL commands on the GPU in shaders.
This is known as GPU Rasterization, and it’s one way to reduce the cost of paint. You can find out if your page is GPU rasterized by enabling the FPS Meter in Chrome DevTools:
There’s a ton of other stuff that you might want to dive into, like how to avoid work on the Main Thread, or how this stuff works at a deeper level. Hopefully these will help you out: