javascript - Why does Google prepend while(1); to their JSON responses?

It prevents disclosure of the response through JSON hijacking.

In theory, Google's JSON responses are protected by the Same Origin Policy: pages from one domain cannot get any information from pages on another domain (unless explicitly allowed).

An attacker can request pages on other domains on your behalf, e.g. by using a <script src=...> or <img> tag, but cannot get any information about the result (headers, contents).

Thus, an attacker's page could not read your email while you are visiting it.

Except that when using a script tag to request JSON content, the JSON is executed as Javascript in an attacker-controlled environment. If the attacker can replace the Array or Object constructor, or some other method used during object construction, anything in the JSON passes through the attacker's code and is disclosed.

Note that this happens at the time the JSON is executed as Javascript, not at the time it's parsed.
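To make the hook concrete, here is a minimal sketch of the kind of setter an attacker's page might install before loading the response with a script tag. The property name `secret`, the sample payload, and the use of `eval` as a stand-in for the script tag are all assumptions for illustration. Engines that follow ES5 or later define the properties of a literal directly and do not call such a setter, so the sketch records nothing there; the disclosure described above relied on older browser behaviour.

```javascript
// Values the attacker's hook manages to observe.
const captured = [];

// Hypothetical hook: a setter on Object.prototype for a property name the
// attacker expects to find in the response.
Object.defineProperty(Object.prototype, "secret", {
  set(value) {
    captured.push(value);
  },
  configurable: true,
});

// Stand-in for <script src=...>: the response body is executed as Javascript.
const executed = eval('[{"secret": "token-123"}]');

// Remove the hook so it does not affect any other code.
delete Object.prototype.secret;

console.log(captured.length); // 0 in an ES5+ engine; older engines could call the setter
console.log(executed[0].secret);
```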

There are multiple countermeasures:

By placing a while(1); statement before the JSON data, Google makes sure that the JSON data is never executed as Javascript.

Only a legitimate page could actually get the whole content, strip the while(1);, and parse the remainder as JSON.
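As a rough sketch of that round trip, assuming the server prepends exactly the string `while(1);` and the legitimate page has the raw response text available: the helper names `guardJson` and `parseGuardedJson` and the sample payload are hypothetical and not taken from any Google library.

```javascript
// Guard prefix described above: an infinite loop if the body is run as a script.
const GUARD_PREFIX = "while(1);";

// Server side (hypothetical helper): serialize a payload and prepend the guard.
function guardJson(payload) {
  return GUARD_PREFIX + JSON.stringify(payload);
}

// Client side (hypothetical helper): a page that received the raw text strips
// the guard and parses the remainder as JSON. Nothing is ever executed.
function parseGuardedJson(body) {
  if (!body.startsWith(GUARD_PREFIX)) {
    throw new Error("Response is missing the expected guard prefix");
  }
  return JSON.parse(body.slice(GUARD_PREFIX.length));
}

const guardedBody = guardJson([["inbox", 3], ["drafts", 1]]);
console.log(guardedBody); // while(1);[["inbox",3],["drafts",1]]
console.log(parseGuardedJson(guardedBody));
```

Rejecting a body without the prefix, rather than parsing it anyway, makes an unguarded endpoint fail loudly during development.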

Similarly, adding invalid tokens before the JSON, like &&&START&&&, makes sure that it is never executed.
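The effect of an invalid prefix can be checked with a small sketch. It approximates "would this text load as a script?" by compiling the text as a function body without running it, which is an assumption that holds for these inputs but is not identical to how a script tag parses. The helper names are hypothetical.

```javascript
// Invalid-token prefix mentioned above.
const INVALID_PREFIX = "&&&START&&&";

// Compiles the text as a function body without running it.
// Returns false if the engine reports a SyntaxError.
function compilesAsScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err && err.name === "SyntaxError") return false;
    throw err;
  }
}

// Hypothetical client helper: strip the prefix, then parse the rest as JSON.
function parsePrefixedJson(text) {
  if (!text.startsWith(INVALID_PREFIX)) {
    throw new Error("Response is missing the expected prefix");
  }
  return JSON.parse(text.slice(INVALID_PREFIX.length));
}

const rawArray = "[1,2,3]";
const prefixedArray = INVALID_PREFIX + rawArray;

console.log(compilesAsScript(rawArray));      // true: a bare array is a valid script
console.log(compilesAsScript(prefixedArray)); // false: the prefix is a syntax error
console.log(parsePrefixedJson(prefixedArray));
```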

Always returning JSON with an Object at the top level is the OWASP-recommended way to protect against JSON hijacking, and is the least intrusive one.

Similarly to the previous countermeasures, it makes sure that the JSON is never executed as Javascript.

A valid JSON object, when not enclosed by anything, is not valid in Javascript:

eval('{"foo":"bar"}') // SyntaxError: Unexpected token :

It is, however, valid JSON:

JSON.parse('{"foo":"bar"}') // Object {foo: "bar"}

So, always returning an Object at the top level of the response ensures that the JSON is not valid Javascript, while still being valid JSON.
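One way to enforce this in a single place on the server is sketched below. The function name `serializeResponse` and the wrapper key `data` are assumptions, not part of any OWASP text; the idea is only that anything that would not serialize to a top-level Object gets wrapped in one.

```javascript
// Hypothetical server-side helper: guarantees that the top level of every
// JSON response is an Object. The wrapper key "data" is an assumption.
function serializeResponse(payload) {
  const json = JSON.stringify(payload);
  // Already an Object at the top level: send it unchanged.
  if (typeof json === "string" && json.startsWith("{")) {
    return json;
  }
  // Arrays, strings, numbers, null, etc. are wrapped in an Object.
  return JSON.stringify({ data: payload });
}

console.log(serializeResponse([1, 2, 3]));      // {"data":[1,2,3]}
console.log(serializeResponse({ foo: "bar" })); // {"foo":"bar"}
```

Checking the serialized text rather than the input type also covers values such as Date, which are objects in Javascript but serialize to a top-level string.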

The OWASP way is less intrusive, as it needs no client library changes and transfers valid JSON. It is unclear, however, whether past or future browser bugs could defeat it.

Google's way requires a client library that supports automatic de-serialization, and can be considered safer with regard to browser bugs.

Both methods require server-side changes to prevent developers from accidentally sending vulnerable JSON.