ECMA-262-3 in detail. Chapter 6. Closures.

In this article we will talk about one of the most discussed topics in JavaScript: closures. The topic is not new and has been discussed many times. However, we will try to discuss and understand it more from a theoretical point of view, and will also look at how closures are implemented in ECMAScript internally.

It may be useful to read the two previous chapters, devoted to the scope chain and the variable object, first, since this chapter relies on material discussed there.

Before discussing closures in ECMAScript directly, we need to define a number of terms from the general theory of functional programming.

As is known, in functional languages (and ECMAScript supports this paradigm and style), functions are data: they can be assigned to variables, passed as arguments to other functions, returned from functions, etc. Such functions have special names and structure.

A functional argument (“funarg”) is an argument whose value is a function.

Example:

function exampleFunc(funArg) {
  funArg();
}

exampleFunc(function () {
  alert('funArg');
});

The actual parameter corresponding to the “funarg” in this case is the anonymous function passed to the exampleFunc function.

In turn, a function which receives another function as an argument is called a higher-order function (HOF).

Another name for a HOF is a functional or, closer to mathematics, an operator. In the example above, the exampleFunc function is a higher-order function.

As noted, a function can be not only passed as an argument, but also returned as a value from another function.

Functions which return other functions are called functions with functional value (or function-valued functions).

(function functionValued() {
  return function () {
    alert('returned function is called');
  };
})()();

Functions which can participate as normal data, i.e. be passed as arguments, receive functional arguments or be returned as functional values, are called first-class functions.

In ECMAScript all functions are first-class.
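
The three roles named above (a function as a stored value, as a functional argument, and as a functional value) can be put side by side in one small sketch. The names twice, applyToAll and makeMultiplier are only illustrative and do not come from any library:

```javascript
// A function stored in a variable, like any other value.
var twice = function (n) {
  return n * 2;
};

// A higher-order function: receives a function as an argument (a funarg).
function applyToAll(list, fn) {
  var result = [];
  for (var i = 0; i < list.length; i++) {
    result.push(fn(list[i]));
  }
  return result;
}

// A function-valued function: returns a new function.
function makeMultiplier(factor) {
  return function (n) {
    return n * factor;
  };
}

var triple = makeMultiplier(3);

var doubled = applyToAll([1, 2, 3], twice);  // [2, 4, 6]
var tripled = applyToAll([1, 2, 3], triple); // [3, 6, 9]
```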

A function which receives itself as an argument is called an auto-applicative (or self-applicative) function:

(function selfApplicative(funArg) {

  if (funArg && funArg === selfApplicative) {
    alert('self-applicative');
    return;
  }

  selfApplicative(selfApplicative);

})();

A function which returns itself is called an auto-replicative (or self-replicative) function. Sometimes the name self-reproducing is used in the literature:

(function selfReplicative() {
  return selfReplicative;
})();

One interesting pattern for self-replicative functions is a declarative form that works with a single element of a collection at a time instead of accepting the collection itself:

// imperative function
// which accepts collection

function registerModes(modes) {
  modes.forEach(registerMode, modes);
}

// usage
registerModes(['roster', 'accounts', 'groups']);

// declarative form using
// self-replicating function

function modes(mode) {
  registerMode(mode); // register one mode
  return modes; // and return the function itself
}

// usage: we just *declare* modes

modes
  ('roster')
  ('accounts')
  ('groups')

However, in practice working with the collection itself can be more efficient and intuitive.
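
To try both forms, registerMode needs some body. The sketch below assumes a hypothetical registry that is just an array; the real registerMode is not shown in this article and may do something quite different:

```javascript
// Hypothetical registry: an assumption made only for this sketch.
var registeredModes = [];

function registerMode(mode) {
  registeredModes.push(mode);
}

// Imperative form: accepts the whole collection.
function registerModes(modes) {
  modes.forEach(function (mode) {
    registerMode(mode);
  });
}

// Declarative form: registers one mode and returns the function itself,
// so that the result can be called again immediately.
function modes(mode) {
  registerMode(mode);
  return modes;
}

registerModes(['roster', 'accounts']);
modes('groups')('settings');

// registeredModes is now ['roster', 'accounts', 'groups', 'settings']
```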

Local variables defined in the passed functional argument are of course accessible when this function is activated, since the variable object which stores the data of the context is created every time on entering the context:

function testFn(funArg) {

  // activation of the funarg, local
  // variable "localVar" is available

  funArg(10); // 20
  funArg(20); // 30

}

testFn(function (arg) {

  var localVar = 10;
  alert(arg + localVar);

});

However, as we know from chapter 4, functions in ECMAScript may be enclosed in parent functions and use variables from parent contexts. The so-called funarg problem is related to this feature.

In stack-oriented programming languages, local variables of functions are stored on a stack: these variables and the function arguments are pushed onto it every time the function is called.

On return from the function the variables are removed from the stack. This model is a big restriction on using functions as functional values (i.e. returning them from parent functions). The problem mostly appears when a function uses free variables.

A free variable is a variable which is used by a function, but is neither a parameter, nor a local variable of the function.

Example:

function testFn() {

  var localVar = 10;

  function innerFn(innerParam) {
    alert(innerParam + localVar);
  }

  return innerFn;
}

var someFn = testFn();
someFn(20); // 30

In this example the localVar variable is free for the innerFn function.

If this system used a stack-oriented model for storing local variables, then on return from the testFn function all its local variables would be removed from the stack. This would cause an error when innerFn is activated from the outside.

Moreover, in this particular case, in a stack-oriented implementation, returning the innerFn function would not be possible at all, since innerFn is also local to testFn and is therefore also removed on returning from testFn.

Another problem with functional objects is related to passing a function as an argument in a system with a dynamic scope implementation.

Example (pseudo-code):

var z = 10;

function foo() {
  alert(z);
}

foo(); // 10 – with both static and dynamic scope

(function () {

  var z = 20;
  foo(); // 10 – with static scope, 20 – with dynamic scope

})();

// the same with passing foo
// as an argument

(function (funArg) {

  var z = 30;
  funArg(); // 10 – with static scope, 30 – with dynamic scope

})(foo);

We see that in systems with dynamic scope, variable resolution is managed with a dynamic (active) stack of variables. Thus, free variables are looked up in the dynamic chain of the current activation, that is, in the place where the function is called, and not in the static (lexical) scope chain which is saved at function creation.

And this can lead to ambiguity. Even if z exists (in contrast with the previous example, where local variables would be removed from the stack), there is a question: which value of z (i.e. z from which context, from which scope) should be used in the various calls of the foo function?
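
Since ECMAScript itself has no dynamic scope, the difference can only be emulated. The following sketch models the dynamic chain as an explicit stack of frames; dynamicStack, lookupDynamic and callWithFrame are hypothetical helpers written for this illustration, not part of any engine:

```javascript
// Emulated dynamic scope: a stack of binding frames,
// searched from the most recent frame down to the oldest.
var dynamicStack = [{z: 10}]; // the bottom frame plays the role of the global context

function lookupDynamic(name) {
  for (var i = dynamicStack.length - 1; i >= 0; i--) {
    if (Object.prototype.hasOwnProperty.call(dynamicStack[i], name)) {
      return dynamicStack[i][name];
    }
  }
  throw new ReferenceError(name + ' is not defined');
}

// Runs fn with an extra frame pushed, as a function call would do.
function callWithFrame(frame, fn) {
  dynamicStack.push(frame);
  try {
    return fn();
  } finally {
    dynamicStack.pop();
  }
}

// Dynamic variant: "z" is resolved at call time, through the caller's frames.
function fooDynamic() {
  return lookupDynamic('z');
}

// Static variant: plain ECMAScript, "z" is resolved where the function is defined.
var z = 10;
function fooStatic() {
  return z;
}

var dynamicOutside = fooDynamic();                      // 10
var dynamicInside = callWithFrame({z: 20}, fooDynamic); // 20
var staticInside = (function () {
  var z = 20; // local to this function, invisible to fooStatic
  return fooStatic();
})();                                                   // 10
```

The same function gives different answers in the emulated dynamic model depending on who calls it, which is exactly the ambiguity described above; the static variant always answers 10.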

The described cases are two types of the funarg problem, depending on whether we deal with a functional value returned from a function (upward funarg) or with a functional argument passed to a function (downward funarg).

To solve this problem (and its subtypes) the concept of a closure was proposed.

A closure is a combination of a code block and the data of the context in which this code block is created.

Let’s see an example in pseudo-code:

var x = 20;

function foo() {
  alert(x); // free variable "x" == 20
}

// Closure for foo
var fooClosure = {
  call: foo, // reference to the function
  lexicalEnvironment: {x: 20} // context for searching free variables
};

In the example above, fooClosure is of course pseudo-code, whereas in ECMAScript the foo function already contains, as one of its internal properties, the scope chain of the context in which it was created.

The word “lexical” is often omitted, since it goes without saying; here it emphasizes that a closure saves its parent variables at the lexical place in the source code, that is, where the function is defined. On subsequent activations of the function, free variables are looked up in this saved (closured) context, as we can see in the examples above, where the variable z should always be resolved as 10 in ECMAScript.

In the definition we used a generalized concept, “the code block”; however, the term “function” is usually used. Still, not all implementations associate closures only with functions: for example, in the Ruby programming language a closure may be a procedure object, a lambda expression or a code block.

As for implementations, a stack-based implementation is no longer suitable for storing local variables after the context is destroyed (because this contradicts the definition of a stack-based structure). Therefore, in this case the closured data of the parent context are saved in dynamically allocated memory (on the “heap”, i.e. heap-based implementations), with the use of a garbage collector (GC) and reference counting. Such systems are slower than stack-based systems. However, implementations may always optimize this: at the parsing stage they can find out whether free variables are used in a function and, depending on this, decide whether to place the data on the stack or on the “heap”.

Having discussed the theory, we have at last reached closures in ECMAScript itself. It should be noted here that ECMAScript uses only static (lexical) scope (whereas in some languages, for example in Perl, variables can be declared with either static or dynamic scope).

var x = 10;

function foo() {
  alert(x);
}

(function (funArg) {

  var x = 20;

  // variable "x" for funArg is saved statically
  // from the (lexical) context, in which it was created
  // therefore:

  funArg(); // 10, but not 20

})(foo);

Technically, the variables of a parent context are saved in the internal [[Scope]] property of the function. So if you completely understand the [[Scope]] and scope chain topics, which were discussed in detail in chapter 4, the question of understanding closures in ECMAScript will resolve itself.

Referring to the algorithm of function creation, we see that all functions in ECMAScript are closures, since all of them save the scope chain of a parent context at creation. The important point here is that, regardless of whether a function will be activated later or not, the parent scope is already saved to it at the moment of creation:

var x = 10;

function foo() {
  alert(x);
}

// foo is a closure
foo: <FunctionObject> = {
  [[Call]]: <code block of foo>,
  [[Scope]]: [
    global: {
      x: 10
    }
  ],
  ... // other properties
};

As we mentioned, for optimization purposes, when a function does not use free variables, implementations may not save a parent scope chain. However, the ECMA-262-3 specification says nothing about this; therefore, formally (and by the technical algorithm), all functions save the scope chain in the [[Scope]] property at the moment of creation.

Some implementations allow direct access to the closured scope. For example, in Rhino the [[Scope]] property of a function corresponds to a non-standard property __parent__, which we discussed in the chapter about the variable object:

var global = this;
var x = 10;

var foo = (function () {

  var y = 20;

  return function () {
    alert(y);
  };

})();

foo(); // 20
alert(foo.__parent__.y); // 20

foo.__parent__.y = 30;
foo(); // 30

// we can move through the scope chain further to the top
alert(foo.__parent__.__parent__ === global); // true
alert(foo.__parent__.__parent__.x); // 10

It should be noted that the closured [[Scope]] in ECMAScript is the same object for all inner functions created in the same parent context. This means that modifying a closured variable from one closure is reflected when reading this variable in another closure.

That is, all inner functions share the same parent scope.

var firstClosure;
var secondClosure;

function foo() {

  var x = 1;

  firstClosure = function () { return ++x; };
  secondClosure = function () { return --x; };

  x = 2; // affects AO["x"], which is in the [[Scope]] of both closures

  alert(firstClosure()); // 3, via firstClosure.[[Scope]]
}

foo();

alert(firstClosure()); // 4
alert(secondClosure()); // 3
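
The shared parent scope is what makes the following common arrangement work. This is only a sketch with a hypothetical makeCounter function: the three methods created in one activation share one count variable, while a second activation gets its own activation object and therefore its own count:

```javascript
function makeCounter(start) {
  var count = start; // one variable, shared by all three closures below

  return {
    increment: function () { return ++count; },
    decrement: function () { return --count; },
    current: function () { return count; }
  };
}

var counterA = makeCounter(0);
var counterB = makeCounter(100); // a separate activation, so a separate "count"

counterA.increment(); // 1
counterA.increment(); // 2
counterA.decrement(); // 1
counterB.increment(); // 101

var currentA = counterA.current(); // 1
var currentB = counterB.current(); // 101
```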

There is a widespread mistake related to this feature. Programmers often get an unexpected result when they create functions in a loop and try to associate the loop’s counter variable with every function, expecting that every function will keep its “own” needed value.

var data = [];

for (var k = 0; k < 3; k++) {
  data[k] = function () {
    alert(k);
  };
}

data[0](); // 3, but not 0
data[1](); // 3, but not 1
data[2](); // 3, but not 2

The previous example explains this behavior: the scope of the context which creates the functions is the same for all three functions. Every function refers to it through the [[Scope]] property, and the variable k in this parent scope can easily be changed.

Schematically:

activeContext.Scope = [
  ... // higher variable objects
  {data: [...], k: 3} // activation object
];

data[0].[[Scope]] === Scope;
data[1].[[Scope]] === Scope;
data[2].[[Scope]] === Scope;

Accordingly, at the moment the functions are activated, the last assigned value of the k variable, i.e. 3, is used.

This relates to the fact that all variables are created before the code execution, i.e. on entering the context. This behavior is also known as “hoisting”.
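
The effect of creating variables on entering the context can be observed directly. In this small sketch (hoistingDemo is just an illustrative name) the variable k is read before its declaration is reached and again after the loop has finished:

```javascript
function hoistingDemo() {
  var log = [];

  // "k" already exists here, although its declaration comes later;
  // only the assignment has not happened yet, so no ReferenceError is thrown.
  log.push(k); // undefined

  for (var k = 0; k < 3; k++) {
    // the loop body does not create a new scope for a "var" variable
  }

  // the same single "k" is still visible after the loop
  log.push(k); // 3

  return log;
}

var hoistingLog = hoistingDemo(); // [undefined, 3]
```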

Creating an additional enclosing context may help to solve the issue:

var data = [];

for (var k = 0; k < 3; k++) {
  data[k] = (function _helper(x) {
    return function () {
      alert(x);
    };
  })(k); // pass "k" value
}

// now it is correct
data[0](); // 0
data[1](); // 1
data[2](); // 2

Let’s see what has happened in this case.

First, the function _helper is created and immediately activated with the argument k.

Then, the return value of the _helper function is also a function, and it is exactly this function that is saved to the corresponding element of the data array.

This technique has the following effect: each time it is activated, _helper creates a new activation object which has the argument x, and the value of this argument is the passed value of the k variable.

Thus, the [[Scope]] of the returned functions is the following:

data[0].[[Scope]] === [
  ... // higher variable objects
  AO of the parent context: {data: [...], k: 3},
  AO of the _helper context: {x: 0}
];

data[1].[[Scope]] === [
  ... // higher variable objects
  AO of the parent context: {data: [...], k: 3},
  AO of the _helper context: {x: 1}
];

data[2].[[Scope]] === [
  ... // higher variable objects
  AO of the parent context: {data: [...], k: 3},
  AO of the _helper context: {x: 2}
];

We see that now the [[Scope]] property of each function has a reference to the needed value, via the x variable which is captured by the additionally created scope.

Notice that from the returned functions we may of course still reference the k variable, with the same value 3 for all functions.

JavaScript closures are often incompletely reduced to just the pattern shown above, with the creation of an additional function to capture the needed value. From the practical viewpoint, this pattern is indeed well known; however, from the theoretical viewpoint, as we noted, all functions in ECMAScript are closures.

The described pattern is not the only one, though. It is also possible to get the needed value of the k variable using, for example, the following approach:

var data = [];

for (var k = 0; k < 3; k++) {
  (data[k] = function () {
    alert(arguments.callee.x);
  }).x = k; // save "k" as a property of the function
}

// also everything is correct
data[0](); // 0
data[1](); // 1
data[2](); // 2
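
One more option lies outside ECMA-262-3, which this series covers: in engines that support ECMAScript 2015 or later, a let declaration in the loop header gives every iteration its own binding, so no helper function is needed. A minimal sketch, assuming such an engine (the array is called letData here only to keep it apart from the examples above):

```javascript
var letData = [];

// With "let", each iteration gets a fresh binding of "k"
// (ES2015+; this is not available in ES3 engines).
for (let k = 0; k < 3; k++) {
  letData[k] = function () {
    return k;
  };
}

var letResults = [letData[0](), letData[1](), letData[2]()]; // [0, 1, 2]
```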

Another feature is returning from closures. In ECMAScript, a return statement in a closure returns the control flow to the calling context (the caller). In other languages, for example in Ruby, various forms of closures are possible which process the return statement differently: it may return to the caller or, in other cases, fully exit from the active context.

ECMAScript standard return behavior:

function getElement() {

  [1, 2, 3].forEach(function (element) {

    if (element % 2 == 0) {
      // return to "forEach" function,
      // but not return from the getElement
      alert('found: ' + element); // found: 2
      return element;
    }

  });

  return null;
}

alert(getElement()); // null, but not 2

However, in ECMAScript throwing and catching a special “break” exception may help in such a case:

var $break = {};

function getElement() {

  try {

    [1, 2, 3].forEach(function (element) {

      if (element % 2 == 0) {
        // "return" from the getElement
        alert('found: ' + element); // found: 2
        $break.data = element;
        throw $break;
      }

    });

  } catch (e) {
    if (e === $break) {
      return $break.data;
    }
    throw e; // rethrow anything that is not our "break" signal
  }

  return null;
}

alert(getElement()); // 2
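
When the only goal is to stop the iteration early, an exception is not strictly needed. As an alternative sketch (not the technique shown above), the some method of arrays, standardized in ES5 together with forEach, stops as soon as the callback returns a true value, and the result can be passed out through a closured variable. The names getElementWithSome and visited are illustrative:

```javascript
var visited = []; // records which elements the callback has actually seen

function getElementWithSome() {
  var found = null;

  [1, 2, 3].some(function (element) {
    visited.push(element);
    if (element % 2 == 0) {
      found = element; // pass the result out through the closured variable
      return true;     // a true value stops the iteration
    }
    return false;
  });

  return found;
}

var someResult = getElementWithSome(); // 2; the element 3 is never visited
```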

As we noted, programmers often incompletely reduce closures to inner functions returned from a parent context. An even more incomplete reduction is to treat only anonymous functions as closures.

Let’s note again that all functions, regardless of their type (anonymous, named, function expression or function declaration), are closures because of the scope chain mechanism.

An exception to this rule is functions created via the Function constructor, whose [[Scope]] contains only the global object.
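
The difference is easy to check. In this sketch a function expression and a function built with the Function constructor are created in the same context; localOnlyValue is a hypothetical name assumed not to exist as a global:

```javascript
function scopeDemo() {
  var localOnlyValue = 20; // assumed not to be defined globally

  // ordinary function expression: its [[Scope]] includes the activation object of scopeDemo
  var viaExpression = function () {
    return typeof localOnlyValue;
  };

  // Function constructor: its [[Scope]] contains only the global object
  var viaConstructor = new Function('return typeof localOnlyValue;');

  return [viaExpression(), viaConstructor()];
}

var scopeResult = scopeDemo(); // ["number", "undefined"]
```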

And to clarify this question, let’s provide two correct definitions of closures for ECMAScript:

Closures in ECMAScript are:

- from the theoretical viewpoint: all functions, since all of them save the variables of a parent context at creation. Even a simple global function that references a global variable refers to a free variable, and therefore the general scope chain mechanism is used;
- from the practical viewpoint: those functions are interesting which:
  - continue to exist after their parent context has finished, e.g. inner functions returned from a parent function;
  - use free variables.

In practice closures may create elegant designs, allowing customization of various calculations defined by a “funarg”. An example is the sort method of arrays, which accepts a sort-condition function as an argument:

[1, 2, 3].sort(function (a, b) {
  ... // sort conditions
});
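
As one possible way to fill in such a sort condition, a function-valued function can build the comparator and keep its settings as free variables. The helper byProperty and the users data below are assumptions made only for this sketch:

```javascript
// Returns a comparator that closes over "key" and "direction".
function byProperty(key, descending) {
  var direction = descending ? -1 : 1;

  return function (a, b) {
    if (a[key] < b[key]) return -1 * direction;
    if (a[key] > b[key]) return 1 * direction;
    return 0;
  };
}

// Hypothetical sample data.
var users = [
  {name: 'Carol', age: 35},
  {name: 'Alice', age: 30},
  {name: 'Bob', age: 25}
];

// slice() copies the array first, because sort works in place
var byNameAsc = users.slice().sort(byProperty('name'));      // Alice, Bob, Carol
var byAgeDesc = users.slice().sort(byProperty('age', true)); // Carol, Alice, Bob
```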

Or, for example, so-called mapping functionals such as the map method of arrays, which maps an array to a new array by the condition of the functional argument:

[1, 2, 3].map(function (element) {
  return element * 2;
}); // [2, 4, 6]

Often it is convenient to implement search functions using functional arguments, which define almost unlimited conditions for the search:

someCollection.find(function (element) {
  return element.someProperty == 'searchCondition';
});

Also, we may note applying functionals such as the forEach method, which applies a function to each element of an array:

[1, 2, 3].forEach(function (element) {
  if (element % 2 != 0) {
    alert(element);
  }
}); // 1, 3

By the way, the apply and call methods of function objects also originate in the applying functionals of functional programming. We already discussed these methods in a note about the this value; here we see them in the role of applying functionals: a function is applied to arguments (to a list of arguments in apply, and to positional arguments in call):

(function () {
  alert([].join.call(arguments, ';')); // 1;2;3
}).apply(this, [1, 2, 3]);
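
The two methods can be compared on the same function. In this sketch greet and user are hypothetical names; Math.max is the standard built-in:

```javascript
function greet(greeting, punctuation) {
  return greeting + ', ' + this.name + punctuation;
}

var user = {name: 'Alice'}; // hypothetical object used as the this value

var viaCall = greet.call(user, 'Hello', '!');     // positional arguments
var viaApply = greet.apply(user, ['Hello', '!']); // arguments as a list

// apply is convenient when the arguments are already collected in an array
var largest = Math.max.apply(null, [3, 7, 5]); // 7
```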

Another important application of closures is deferred calls:

var a = 10;
setTimeout(function () {
  alert(a); // 10, after one second
}, 1000);

And also callback functions:

...
var x = 10;
// only for example
xmlHttpRequestObject.onreadystatechange = function () {
  // callback, which will be called later,
  // when the data is ready;
  // variable "x" is available here,
  // even though the context in which
  // it was created has already finished
  alert(x); // 10
};
...

Or, e.g., the creation of an encapsulated scope for the purpose of hiding auxiliary objects:

var foo = {};

// initialization
(function (object) {

  var x = 10;
  
  object.getX = function _getX() {
    return x;
  };

})(foo);

alert(foo.getX()); // get closured "x" – 10
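
The same technique can hide not only data but also auxiliary functions. In the following sketch the account object, the balance variable and the isValidAmount helper are all hypothetical; only the two exported methods can reach the hidden state:

```javascript
var account = {};

(function (object) {

  var balance = 0; // not reachable from outside this function

  // auxiliary helper, also hidden
  function isValidAmount(amount) {
    return typeof amount == 'number' && amount > 0;
  }

  object.deposit = function (amount) {
    if (isValidAmount(amount)) {
      balance += amount;
    }
    return balance;
  };

  object.getBalance = function () {
    return balance;
  };

})(account);

account.deposit(50);    // 50
account.deposit(-10);   // rejected by the hidden helper, still 50
account.balance = 1000; // creates an unrelated property; the closured variable is untouched

var finalBalance = account.getBalance(); // 50
```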

This article has turned out to be more about the general theory than about the ECMA-262-3 specification; however, I think that this general theory can help to clarify some aspects and bring us closer to understanding ECMAScript functions. If you have questions, I will answer them with pleasure in the comments.

Translated by: Dmitry Soshnikov.
Published on: 2010-02-28

Originally written by: Dmitry Soshnikov [ru]
Originally published on: 2009-07-20 [ru]

Tags: Closure, ECMA-262-3, ECMAScript, First-class objects, Funarg, Functional programming