Converting a project from AMD to CJS with Recast

The source code for this tutorial is available at: https://github.com/skookum/recast-to-cjs

Many of our teams have bought into React and the approach it brings to writing UIs and organizing applications. Early on, a group of our teams decided to go with an AMD implementation using RequireJS, but we’ve since learned how much there is to gain from using CommonJS (CJS) and the npm ecosystem instead.

It was in this context that I recently took it upon myself to help convert these projects from AMD to CJS. I had some experience working with a CSS AST through Rework, so when I came across Ben Newman’s Recast project I needed a problem to try it out on.

The reason for Recast from the README:

What I hope to eliminate are the brain-wasting tasks, the tasks that are bottlenecked by keystrokes, the tasks that can be expressed as operations on the syntactic structure of your code. Specifically, my goal is to make it possible for you to run your code through a parser, manipulate the abstract syntax tree directly, subject only to the constraints of your imagination, and then automatically translate those modifications back into source code, without upsetting the formatting of unmodified code.

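Converting from one module format to another is the perfect problem to try this out on. In this tutorial we will compare the two syntaxes, run an identity transform through Recast, build the CommonJS nodes we need, detect AMD definitions in the AST, and then transform the module body and its dependencies.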

The Syntactical Differences

AMD has a couple different function signatures that we were using (and a couple that we can ignore because we weren’t using them). They are documented at http://requirejs.org/docs/api.html#define

// with a list of dependencies
define(['react'], function(React) {
  // optional return value which is the module itself;
  return React.createClass({});
});

// with a variable listing dependencies. This is not recommended.
var DEPENDENCIES = ['react'];
define(DEPENDENCIES, function(React) {
  return React.createClass({});
});

// with no dependencies
define(function() {
  return {};
});

Each of these is expressed in CommonJS as follows:

// with a list of dependencies
var React = require('react');
module.exports = React.createClass({});

// with a variable listing dependencies. This is not recommended.
var React = require('react');
module.exports = React.createClass({});

// with no dependencies
module.exports = {};

Hello world

What we need to do is transform every file from one format to the other. Let’s begin by writing a few scripts that read a file and print the output. At this stage we want to run the following in a terminal and get the file back unchanged:

$ tocjs test/cases/identity.js

> define(function() {
>   return 'Hello world';
> });

There are a couple of boilerplate files to give us CLI and Node interfaces.

In both cases, you give the function a glob and it runs an identity transform over the matching files.

This is where our introduction to Recast begins.

lib/transformers/identity.js

var recast = require('recast');
module.exports = function identity(code) {
  var ast = recast.parse(code);
  return recast.print(ast).code;
};

recast.parse returns an abstract syntax tree (AST) that is compatible with the Mozilla Parser API. We want to detect a define or require call in that tree and apply a transformation to it.

You can view the full AST of the simple require statement at this Gist: gist.github.com/iamdustan/7454050b765643085d57
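
To make the shape concrete, here is a hand-written sketch of roughly what the parser produces for the identity test case. It is trimmed to the fields this tutorial relies on (real output also carries location data, and Recast wraps the Program in an outer File node, which is omitted here). It uses plain object literals so it runs without Recast installed.

```javascript
// Trimmed, hand-written AST for:
//   define(function() { return 'Hello world'; });
var ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'define' },
      arguments: [{
        type: 'FunctionExpression',
        id: null,
        params: [],
        body: {
          type: 'BlockStatement',
          body: [{
            type: 'ReturnStatement',
            argument: { type: 'Literal', value: 'Hello world' }
          }]
        }
      }]
    }
  }]
};

// The call we want to detect is the expression of the first statement,
// and the AMD factory is its last (here: only) argument.
var call = ast.body[0].expression;
var factory = call.arguments[call.arguments.length - 1];

console.log(call.callee.name);                    // define
console.log(factory.body.body[0].argument.value); // Hello world
```

Everything that follows is a matter of finding nodes like call and swapping them for differently shaped nodes.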

Let’s begin by writing the builder functions first to get a feel for how to create the AST objects we need, then write the detection visitors.

Builder Objects

tl;dr: You can view the already completed work in this commit: Skookum/recast-to-cjs#0111362451a43d5c6f8378a7c9f38460f806e920

Recast includes the ast-types project, which serves as our type system. There is a builder for everything you see on the MDN Parser API page.

Variable Assignment

// generate the following variable declaration:
//   var i = 0;
var b = require('ast-types').builders;

var program = b.variableDeclaration('var', [
  b.variableDeclarator(
    b.identifier('i'),
    b.literal(0)
  )
]);

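Let’s inspect this inside out. b.identifier('i') creates the name i and b.literal(0) creates the value 0. b.variableDeclarator pairs the two as i = 0, and b.variableDeclaration wraps a list of declarators in a var statement.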

Simple enough, right?

Generating the CommonJS require statements is only slightly more complex: the variable declarator receives a call expression that invokes the require function. See lib/generators/cjsrequire.js.
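
As a rough sketch of the node shape involved (this is not the contents of lib/generators/cjsrequire.js), the helper below builds the same structure from plain object literals so that it runs without dependencies. The name cjsRequire is hypothetical. With ast-types, the equivalent pieces would presumably come from b.callExpression, b.variableDeclarator, b.variableDeclaration, and b.expressionStatement.

```javascript
// Hypothetical helper: builds the AST for `var <localName> = require('<id>');`
// or, when no local name is given, the bare statement `require('<id>');`.
function cjsRequire(id, localName) {
  var call = {
    type: 'CallExpression',
    callee: { type: 'Identifier', name: 'require' },
    arguments: [{ type: 'Literal', value: id }]
  };
  if (!localName) {
    return { type: 'ExpressionStatement', expression: call };
  }
  return {
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: localName },
      init: call // the declarator is initialised with the require call
    }]
  };
}

var withName = cjsRequire('react', 'React');
var bare = cjsRequire('b');

console.log(withName.declarations[0].id.name);       // React
console.log(withName.declarations[0].init.callee.name); // require
console.log(bare.type);                              // ExpressionStatement
```

The bare form matters for dependencies that are loaded only for their side effects and never bound to a local variable.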

Member Assignment

To create the module.exports = right; code, we need to learn about a few more items. Rather than assigning to a local variable, we are assigning to an object member.

b.expressionStatement(b.assignmentExpression(
  '=', // any assignment operator, such as = += >>>=
  b.memberExpression(
    b.identifier('module'),
    b.identifier('exports'),
    false // isComputed ? `module[exports]` : `module.exports`
  ),
  value
));

Hopefully, that is pretty self-explanatory after looking at the previous example. We have to create an assignment to a member expression. If you read it inside out, you’ll see that we create the member expression module.exports and assign a value node to it.

Traversing the AST for AMD Nodes

Now that we have some familiarity with node types, we can begin visiting them. This is done using the visitor pattern. Generally, this looks like the following:

var ast = recast.parse(string);
recast.visit(ast, {
  visitNode: function(path) {
    // Visitor methods receive a NodePath
    // (https://github.com/benjamn/ast-types#nodepath) parameter, which has
    // various useful methods and properties, most importantly path.node.
    var node = path.node;

    // When you define a visitor method, you get to decide when and how
    // its children should be recursively visited, by calling this.traverse:
    this.traverse(path);
  },
  // all visitor functions are optional. The method name follows the pattern:
  // ['visit' + ASTType].
  visitFunctionDeclaration: function() { },
  visitExpressionStatement: function() { },
  // ....
});

var output = recast.print(ast).code;
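
To show what a visitor does conceptually, here is a minimal hand-rolled traversal over plain AST objects. It is only an illustration of the pattern and assumes nothing about Recast internals: unlike recast.visit, it has no NodePath, cannot replace nodes, and always descends into children. The names walk and tree are made up for this sketch.

```javascript
// Minimal depth-first walk. `visit` is called for every object that has a
// string `type` property, parents before children.
function walk(node, visit) {
  if (Array.isArray(node)) {
    node.forEach(function(child) { walk(child, visit); });
    return;
  }
  if (!node || typeof node !== 'object') return;
  if (typeof node.type === 'string') visit(node);
  Object.keys(node).forEach(function(key) {
    walk(node[key], visit);
  });
}

// Hand-written AST for: define(['react'], function(React) {});
var tree = {
  type: 'ExpressionStatement',
  expression: {
    type: 'CallExpression',
    callee: { type: 'Identifier', name: 'define' },
    arguments: [
      { type: 'ArrayExpression', elements: [{ type: 'Literal', value: 'react' }] },
      {
        type: 'FunctionExpression',
        params: [{ type: 'Identifier', name: 'React' }],
        body: { type: 'BlockStatement', body: [] }
      }
    ]
  }
};

// Collect the callee name of every call expression in the tree.
var calls = [];
walk(tree, function(node) {
  if (node.type === 'CallExpression') calls.push(node.callee.name);
});
console.log(calls); // [ 'define' ]
```

A per-type method such as visitCallExpression is the same idea with the type check moved into the method name.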

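Based on our earlier exploration of the AMD function signatures, we know we need to detect define calls with an inline dependency array, with a variable holding the dependencies, and with no dependencies at all.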

Visiting AMD Definitions

When you call a function such as define or require, you are using a CallExpression. As such, we need to visit each CallExpression and, if it is an AMD definition, transform it. You can see the full commit at Skookum/recast-to-cjs#2f21464a5f9524df2d9991db831a4e8cc93ec4e5.

var recast = require('recast');
var n = recast.types.namedTypes;

recast.visit(ast, {
  visitCallExpression: function(path) {
    var node = path.node;
    if (this.isAMDDefinition(node)) {
      // visitAMDDefinition traverses the children itself, so return here
      // to avoid traversing them a second time
      return this.visitAMDDefinition(path);
    }
    return this.traverse(path);
  },
  visitAMDDefinition: function(path) {
    // TODO: transform this to commonjs
    return this.traverse(path);
  },
  isAMDDefinition: function(node) {
    return isNamed('require') || isNamed('define');
    function isNamed(name) {
      return n.CallExpression.check(node) &&
        name === node.callee.name;
    }
  }
});
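
One caveat with matching on the callee name alone is that an ordinary CommonJS-style call such as require('react') is also reported as an AMD definition. If your codebase mixes both styles, one possible refinement is to also look at the last argument. The sketch below works on plain AST objects, and looksLikeAMDDefinition is a hypothetical name, not something from the repository.

```javascript
// Hypothetical stricter check: the callee must be a bare `define` or
// `require` identifier, and the last argument must be a factory function
// or an object literal.
function looksLikeAMDDefinition(node) {
  if (!node || node.type !== 'CallExpression') return false;
  var callee = node.callee;
  if (callee.type !== 'Identifier') return false;
  if (callee.name !== 'define' && callee.name !== 'require') return false;
  var last = node.arguments[node.arguments.length - 1];
  return Boolean(last) &&
    (last.type === 'FunctionExpression' || last.type === 'ObjectExpression');
}

// define(function() {})
var amdCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'define' },
  arguments: [{
    type: 'FunctionExpression',
    params: [],
    body: { type: 'BlockStatement', body: [] }
  }]
};

// require('react')
var cjsCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'require' },
  arguments: [{ type: 'Literal', value: 'react' }]
};

console.log(looksLikeAMDDefinition(amdCall)); // true
console.log(looksLikeAMDDefinition(cjsCall)); // false
```

The trade-off is that this misses a factory passed by reference, such as define(deps, factory), so whether the stricter check is worth it depends on the code being converted.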

Transforming the module definition

Now that we have a module definition, we need to transform the factory function or object. The two function signatures we care about are:

define({my: 'object'});
// module.exports = {my: 'object'};

define([], function() {
  return 'my module';
});
// {
//   module.exports = 'my module';
// }

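We’re intentionally leaving the body in an anonymous block. Even though Recast does non-destructive transformations, we would like to minimize reindentation of code so that the resulting diff is easier to read.

Let’s break down our new requirements: an object passed to define becomes the value assigned to module.exports, and for a factory function the return statements become module.exports assignments while the function body is kept as a block.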

This commit solves for these requirements: Skookum/recast-to-cjs#46dd11252ad910343ea9a90aec8ffb705a0788d5

We already have the function to generate an exports expression, so now we just need to create a few helpers to transform the module.


// this is called with an AMD definition
transformedModuleBody: function(path) {
  var node = path.node;

  // `extractModuleBody` pulls out the last argument to the AMD node
  var module = this.extractModuleBody(path);
  if (module) {
    // if it's an object, we return the new `module.exports = {};` to the
    // visitor
    if (n.ObjectExpression.check(module)) {
      return generateExports(module);
    }
    // if it's an AMD factory function, then we traverse the body to ensure we
    // visit any child ReturnStatements and transform them, then we return the
    // function body
    else if (n.FunctionExpression.check(module)) {
      this.traverse(path);
      return module.body;
    }
  }
  return path;
},
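
The ReturnStatement handling mentioned in the comments is not shown in this excerpt. As a minimal sketch of the idea, written against plain AST objects rather than Recast visitors, the code below rewrites the top-level return statements of a factory body into module.exports assignments. The generateExports here is a plain-object stand-in for the builder version from earlier, and rewriteTopLevelReturns is a hypothetical name.

```javascript
// Builds the AST for `module.exports = <value>;` from plain objects.
function generateExports(value) {
  return {
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: {
        type: 'MemberExpression',
        object: { type: 'Identifier', name: 'module' },
        property: { type: 'Identifier', name: 'exports' },
        computed: false
      },
      right: value
    }
  };
}

// Hypothetical helper: rewrites `return x;` statements that sit directly in
// the factory's BlockStatement. Returns inside nested functions or control
// flow are deliberately left alone in this sketch.
function rewriteTopLevelReturns(block) {
  block.body = block.body.map(function(statement) {
    if (statement.type === 'ReturnStatement' && statement.argument) {
      return generateExports(statement.argument);
    }
    return statement;
  });
  return block;
}

// Body of: function() { return 'my module'; }
var factoryBody = {
  type: 'BlockStatement',
  body: [{
    type: 'ReturnStatement',
    argument: { type: 'Literal', value: 'my module' }
  }]
};

var result = rewriteTopLevelReturns(factoryBody);
console.log(result.body[0].expression.left.property.name); // exports
console.log(result.body[0].expression.right.value);        // my module
```

Limiting the rewrite to top-level returns avoids touching returns that belong to nested functions. Note that an early return followed by more statements would need extra care, because an assignment does not stop execution the way a return does.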

Extracting Dependencies

Now that we have our AMD definition, we need to extract any dependencies and the module itself.

The following covers the use cases we are going to handle:

define(['a', 'b'], function(a) {
  return a.init();
});

// var a = require('a');
// require('b');
// module.exports = a.init();

Most of it is done in this commit: Skookum/recast-to-cjs#d1dc01c30f160172378cf662c0868cbd6ffe19be

Our transformedDependencies method returns an array of CommonJS expressions or undefined. It uses our previously written CommonJS expression builder together with our extractAMDDependencies method, which looks up the dependency array and returns an array of tuples of the form [dependencyIdentifier, optionalLocalVariableName].
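
The tuple idea can be sketched without any AST at all. The example below pairs dependency ids with the factory’s parameter names and renders each pair as source text. Both function names are hypothetical, and the real methods work on AST nodes rather than strings.

```javascript
// Hypothetical stand-in for the pairing step: zip dependency ids with the
// factory's parameter names. Dependencies without a matching parameter get
// undefined as their local name.
function pairDependencies(ids, params) {
  return ids.map(function(id, i) {
    return [id, params[i]];
  });
}

// Render one tuple as CommonJS source text (a string, not an AST node).
function toRequireSource(pair) {
  var call = "require('" + pair[0] + "')";
  return pair[1] ? 'var ' + pair[1] + ' = ' + call + ';' : call + ';';
}

// define(['a', 'b'], function(a) { ... })
var pairs = pairDependencies(['a', 'b'], ['a']);
var lines = pairs.map(toRequireSource);

console.log(lines.join('\n'));
// var a = require('a');
// require('b');
```

Keeping the unnamed dependency as a bare require call preserves any side effects it was loaded for.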

I can Recast, and so can you

Ben Newman writes, “Instead of typing yourself into a nasty case of RSI, gaze upon your new wells of free time and ask yourself: what next?”

With all the time I saved automating our module system transformation, I had the opportunity to write this tutorial. Being aware that “this is a thing” and that you have the capability to use it is 90% of the solution.

Writing code is one thing, but writing code to write your code enables another dimension of power.


A huge thanks to Ben Newman and Mark Pedrotti for reviewing this article.
