Combining JavaScript Arrays

This is a quickie simple post on JavaScript techniques. We're going to cover different methods for combining/merging two JS arrays, and the pros/cons of each approach.

Let's start with the scenario:

var a = [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ];
var b = [ "foo", "bar", "baz", "bam", "bun", "fun" ];

The simple concatenation of a and b would, obviously, be:

[
   1, 2, 3, 4, 5, 6, 7, 8, 9,
   "foo", "bar", "baz", "bam" "bun", "fun"
]

concat(..)

The most common approach is:

var c = a.concat( b );

a; // [1,2,3,4,5,6,7,8,9]
b; // ["foo","bar","baz","bam","bun","fun"]

c; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

As you can see, c is a whole new array that represents the combination of the two a and b arrays, leaving a and b untouched. Simple, right?

What if a is 10,000 items, and b is 10,000 items? c is now 20,000 items, which basically doubles the memory usage of a and b.

"No problem!", you say. We just unset a and b so they are garbage collected, right? Problem solved!

a = b = null; // `a` and `b` can go away now

Meh. For only a couple of small arrays, this is fine. But for large arrays, or for repeating this process many times, or for working in memory-limited environments, it leaves a lot to be desired.

Looped Insertion

OK, let's just append one array's contents onto the other, using Array#push(..):

// `b` onto `a`
for (var i=0; i < b.length; i++) {
    a.push( b[i] );
}

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

b = null;

Now, a contains both its original contents and the contents of b.

Better for memory, it would seem.

But what if a was small and b was comparatively really big? For both memory and speed reasons, you'd probably want to push the smaller a onto the front of b rather than the longer b onto the end of a. No problem, just replace push(..) with unshift(..) and loop in the opposite direction:

// `a` into `b`:
for (var i=a.length-1; i >= 0; i--) {
    b.unshift( a[i] );
}

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

a = null;

Functional Tricks

Unfortunately, for loops are ugly and harder to maintain. Can we do any better?

Here's our first attempt, using Array#reduce:

// `b` onto `a`:
a = b.reduce( function(coll,item){
    coll.push( item );
    return coll;
}, a );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b = a.reduceRight( function(coll,item){
    coll.unshift( item );
    return coll;
}, b );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

Array#reduce(..) and Array#reduceRight(..) are nice, but they are a tad clunky. ES6 arrow functions will slim them down slightly, but they still require a function call per item, which is unfortunate.
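
For illustration, here's a rough sketch of those same two operations with ES6 arrow functions (assuming an ES6-capable environment); the comma operator keeps each reducer as a one-liner:

// `b` onto `a`:
a = b.reduce( (coll,item) => ( coll.push( item ), coll ), a );

// or `a` into `b`:
b = a.reduceRight( (coll,item) => ( coll.unshift( item ), coll ), b );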

What about:

// `b` onto `a`:
a.push.apply( a, b );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

// or `a` into `b`:
b.unshift.apply( b, a );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

That's a lot nicer, right!? Especially since the unshift(..) approach here doesn't need to worry about the reverse ordering as in the previous attempts. ES6's spread operator will be even nicer: a.push( ...b ) or b.unshift( ...a ).

But, things aren't as rosy as they might seem. In both cases, passing either a or b to apply(..)'s second argument (or via the ... spread operator) means that the array is being spread out as arguments to the function.

The first major problem is that we're effectively doubling the size (temporarily, of course!) of the thing being appended, by essentially copying its contents to the stack for the function call. Moreover, different JS engines have different implementation-dependent limitations on the number of arguments that can be passed.

So, if the array being added on has a million items in it, you'd almost certainly exceed the maximum stack size allowed for that push(..) or unshift(..) call. Ugh. It'll work just fine for a few thousand elements, but you have to be careful not to exceed a reasonably safe limit.
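
If you're curious, you can probe this failure mode with something like the following sketch; the threshold and the exact error are engine-specific, so the million-item figure is just an illustrative guess:

var big = new Array( 1000000 ); // a million (empty) slots
var target = [];

try {
    target.push.apply( target, big );
}
catch (err) {
    // many engines throw a RangeError here ("Maximum call stack
    // size exceeded" or similar); the actual limit varies by engine
    console.log( err.name );
}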

Note: You can try the same thing with splice(..), but you'll reach the same conclusions as with push(..) / unshift(..).
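
For reference, a splice(..) version of the trick might look like this sketch; it spreads b out as arguments in exactly the same way, so it shares the same stack-size limits:

// `b` onto the end of `a`, via `splice(..)`:
a.splice.apply( a, [ a.length, 0 ].concat( b ) );

a; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]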

One option would be to use this approach, but batch up segments at the max safe size:

function combineInto(a,b) {
    // work backwards through `a` in chunks of 5000, so the
    // unshifted chunks land in `b` in their original order
    for (var i=a.length; i > 0; i=i-5000) {
        b.unshift.apply( b, a.slice( Math.max( i-5000, 0 ), i ) );
    }
}
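
With the sample arrays from above, usage looks like this (the 5000-item batch is just a guess at a safe chunk size, not a verified limit):

combineInto( a, b );

b; // [1,2,3,4,5,6,7,8,9,"foo","bar","baz","bam","bun","fun"]

a = null;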

Wait, we're going backwards in terms of readability (and perhaps even performance!). Let's quit before we give up all our gains so far.

Summary

Array#concat(..) is the tried and true approach for combining two (or more!) arrays. But the hidden danger is that it's creating a new array instead of modifying one of the existing ones.

There are options which modify-in-place, but they have various trade-offs.

Given the various pros/cons, perhaps the best of all of the options (including others not shown) is reduce(..) and reduceRight(..).

Whatever you choose, it's probably a good idea to critically think about your array merging strategy rather than taking it for granted.

Kyle Simpson