Stacked area charts in d3

Published
13 June 2019

You can make stacked area charts in d3 using d3.stack, which works on wide data (one row per time period, one column per series). It's pretty cool:

d3.csv("wide-data.csv").then(function(data) {
  let svg = d3.select(".ex1");

  // Formatting and scaling code, and then...

  let stacker = d3.stack()
    .keys(["Apples", "Bananas", "Canteloupes"]);

  let stackedData = stacker(data);

  let areaGen = d3.area()
    .x((d,i) => x(i))
    .y0(d => y(d[0]))
    .y1(d => y(d[1]));

  let fillColours = ["red", "blue", "green"];

  svg.selectAll("path")
    .data(stackedData)
    .enter()
      .append("path")
      .attr("d", series => areaGen(series))
      .style("fill", (d,i) => fillColours[i]);
});

d3.stack isn’t the easiest thing to use, though, especially if you’re a worshipper at the altar of tidy data. So how do you turn a long array of data, with categories encoded in their own column, into a stacked area chart?

The good news is that it’s not hard. It’s just a bit long-winded.
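To make the two shapes concrete, here is a small sketch with made-up numbers. Only the column names (TimePeriod, Fruit, Value) follow the data used in this post, and toLong is a hypothetical helper, not something d3 provides.

```javascript
// Illustrative, made-up values; only the column names follow the post.
// Wide: one row per time period, one column per fruit (the shape d3.stack works on).
const wideRows = [
  { TimePeriod: 1, Apples: 3, Bananas: 5, Canteloupes: 2 },
  { TimePeriod: 2, Apples: 4, Bananas: 6, Canteloupes: 1 }
];

// Long ("tidy"): one row per (time period, fruit) pair.
// toLong is a hypothetical helper, not part of d3.
function toLong(rows, idColumn) {
  const out = [];
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (key === idColumn) continue; // skip the identifier column itself
      out.push({ TimePeriod: row[idColumn], Fruit: key, Value: row[key] });
    }
  }
  return out;
}

const longRows = toLong(wideRows, "TimePeriod");
console.log(longRows.length); // 6
```

The rest of this post starts from data that already looks like longRows.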

First, find your areas...

A stacked area chart is basically a set of polygons, which we can handily plot using d3's area generator. But first, we need to determine the "stacking order" of our chart: which series should go at the bottom, which should go on top, etc. This is a bit subjective, depending on what you're showing, but an adequate rule of thumb (at least for right now) is that the biggest areas should be on the bottom.

// This data has columns: TimePeriod, Fruit, Value
d3.csv(
  "long-data.csv",
  function(row){
    //Convert to numeric values
    row.Value = +row.Value;
    row.TimePeriod = +row.TimePeriod;
    return row;
  }).then(function(data) {

    // Sum up totals for each fruit
    let fruitSums = d3.nest()
      .key(d => d.Fruit)
      .rollup(d => d.map(e => e.Value).reduce((a,b) => a + b))
      .entries(data);

    // > [
    // >   0: {key: "Canteloupes", value: 182.99031444}
    // >   1: {key: "Apples", value: 314.94202225000004}
    // >   2: {key: "Bananas", value: 327.70480420999996}
    // > ]

    // We do b - a because we wish to sort in descending order.
    let fruitOrder =
      fruitSums
      .sort((a,b) => b.value - a.value)
      .map(a => a.key);

    // > ["Bananas", "Apples", "Canteloupes"]
});

Now we know the order we want: Bananas on the bottom, Canteloupes on the top. But we're still faced with the issue of how we stack the areas on top of one another.

This bit is kind of clever, even while being pretty straightforward. For each time period, we calculate two values for every series:

  • The floor, which is the sum of all values "below" this value on the stack.
  • The ceiling, which is the sum of the floor and this series' value at this point.
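As a quick worked example of those two definitions, here is a minimal sketch for a single time period. The values are made up and assumed to be already sorted into stacking order (bottom first); stackOnePeriod is a hypothetical name.

```javascript
// Values for one time period, already sorted into stacking order
// (bottom of the stack first). Made-up numbers for illustration.
const sortedValues = [5, 3, 2];

// stackOnePeriod is a hypothetical helper name.
function stackOnePeriod(values) {
  let runningTotal = 0;
  return values.map(function (value) {
    const floor = runningTotal;   // sum of everything below this series
    runningTotal += value;
    const ceiling = runningTotal; // floor plus this series' value
    return { floor, ceiling };
  });
}

const stackedPeriod = stackOnePeriod(sortedValues);
console.log(stackedPeriod);
// [ { floor: 0, ceiling: 5 }, { floor: 5, ceiling: 8 }, { floor: 8, ceiling: 10 } ]
```

Each ceiling is the next series' floor, which is what makes the areas sit flush on top of one another.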

It sounds easy, but it's a bit fiddly. We're going to take some shortcuts, but we still have to group the data by time period, and then ungroup it again:

// Inside the csv call above...
let stackData = d3.nest()
  .key(d => d.TimePeriod)
  .rollup(function(d) {
    let sortedData = d.sort((a,b) => fruitOrder.indexOf(a.Fruit) - fruitOrder.indexOf(b.Fruit));
    let sortedValues = sortedData.map(d => d.Value);

    return sortedData.map(function(row, i) {
      let floor = i == 0 ? 0 : sortedValues.slice(0, i).reduce((a,b) => a + b);
      let ceiling = sortedValues.slice(0, i + 1).reduce((a,b) => a + b);

      return {
        "Fruit": row.Fruit,
        "TimePeriod": row.TimePeriod,
        "Floor": floor,
        "Ceiling": ceiling
      };
    });
  })
  .entries(data);

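// Ungroup data! Because we used rollup, each entry is {key, value},
// so the rolled-up array lives on kv.value (not kv.values).
stackData = stackData.map(kv => kv.value).reduce((x,y) => x.concat(y), []);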

What's happening here? First, we nest the data by time period (because we're stacking up data for each time period). We take the data for each time period, and sort it according to the order we specified above. We then make an (ordered) list of values to graph and, going through the data, create the floor and ceiling variables.

Before the ungrouping step, the nested result is an array of key/value entries that looks like the following:

[
  {
    key: "1",
    value: [
      0: {Fruit: "Bananas", TimePeriod: 1, Floor: 0, Ceiling: 17.44394906}
      1: {Fruit: "Apples", TimePeriod: 1, Floor: 17.44394906, Ceiling: 28.056471180000003}
      2: {Fruit: "Canteloupes", TimePeriod: 1, Floor: 28.056471180000003, Ceiling: 36.370769893}
    ]
  },
  {
    key: "2",
    ...
  }
]

In the last line of the code above, we extract the value array from each entry and concatenate those arrays together, to give our final un-nested array of data.
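The ungrouping step on its own can be sketched without d3. The grouped array below only mirrors the shape shown above, with the rows trimmed down to two made-up fields.

```javascript
// Shape mirrors the grouped output above; contents are trimmed for illustration.
const grouped = [
  { key: "1", value: [{ Fruit: "Bananas", TimePeriod: 1 }, { Fruit: "Apples", TimePeriod: 1 }] },
  { key: "2", value: [{ Fruit: "Bananas", TimePeriod: 2 }, { Fruit: "Apples", TimePeriod: 2 }] }
];

// Concatenate each entry's value array onto one flat array.
// Supplying [] as the initial value keeps reduce safe when `grouped` is empty.
const flat = grouped.reduce((rows, entry) => rows.concat(entry.value), []);

console.log(flat.length); // 4
```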

Then stack them!

This part is pretty easy: we use the above Floor and Ceiling values to produce an area chart.

let svg = d3.select(".ex2");

svg.attr("viewBox", "0 0 40 20");

let x = d3.scaleLinear()
  .domain(d3.extent(stackData, d => d.TimePeriod))
  .range([0, 40]);

let y = d3.scaleLinear()
  .domain([d3.min(stackData, d => d.Floor), d3.max(stackData, d => d.Ceiling)])
  .range([20, 0]);

let areaGen = d3.area()
  .x (d => x(d.TimePeriod))
  .y0(d => y(d.Floor))
  .y1(d => y(d.Ceiling));

let fillColours = ["red", "blue", "green"];

// Add data, nested by fruit
let nestedData = d3.nest()
  .key(d => d.Fruit)
  .entries(stackData);

svg.selectAll("path")
  .data(nestedData, d => d.key)
  .enter()
    .append("path")
    .attr("d", series => areaGen(series.values))
    .style("fill", (d,i) => fillColours[i]);

Let's see it in action!

The graph below is generated using all the code we wrote above. Want to see what it looks like? Check out the source.

What next?

The example I've shown here is pretty specific - if you change the format of the data, you're going to have to change the code. We're nowhere near the d3.longStack() example posted above. What we have, though, is a proof-of-concept: we've gone from an idea to an actual implementation.

The process of making a general function for the long-data stacked area chart is left as an exercise to the reader.
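For anyone attempting that exercise, here is one possible starting point, not a finished answer. It is a sketch in plain JavaScript under some loud assumptions: longStack is a hypothetical name (it is not a d3 function), the column names are passed in as strings, every value is numeric, and no attempt is made to handle series that are missing from some time periods. It also avoids d3.nest, which was dropped from the default d3 bundle in version 6.

```javascript
// longStack is a hypothetical helper, not a d3 function.
// xCol, keyCol and valueCol name the columns to read from each row.
function longStack(rows, xCol, keyCol, valueCol) {
  // 1. Total per series, to decide the stacking order (largest at the bottom).
  const totals = new Map();
  for (const row of rows) {
    totals.set(row[keyCol], (totals.get(row[keyCol]) || 0) + row[valueCol]);
  }
  const order = Array.from(totals.keys())
    .sort((a, b) => totals.get(b) - totals.get(a));

  // 2. Group rows by their x value.
  const byX = new Map();
  for (const row of rows) {
    if (!byX.has(row[xCol])) byX.set(row[xCol], []);
    byX.get(row[xCol]).push(row);
  }

  // 3. Within each group, sort into stacking order and keep a running total.
  const out = [];
  for (const group of byX.values()) {
    const sorted = group.slice().sort(
      (a, b) => order.indexOf(a[keyCol]) - order.indexOf(b[keyCol]));
    let runningTotal = 0;
    for (const row of sorted) {
      const floor = runningTotal;
      runningTotal += row[valueCol];
      out.push({
        [xCol]: row[xCol],
        [keyCol]: row[keyCol],
        Floor: floor,
        Ceiling: runningTotal
      });
    }
  }
  return out;
}

// Usage with made-up numbers:
const exampleRows = [
  { TimePeriod: 1, Fruit: "Apples", Value: 3 },
  { TimePeriod: 1, Fruit: "Bananas", Value: 5 },
  { TimePeriod: 2, Fruit: "Apples", Value: 4 },
  { TimePeriod: 2, Fruit: "Bananas", Value: 6 }
];
const stacked = longStack(exampleRows, "TimePeriod", "Fruit", "Value");
console.log(stacked[1]);
// { TimePeriod: 1, Fruit: 'Apples', Floor: 5, Ceiling: 8 }
```

The output has the same Floor and Ceiling columns as stackData above, so it could be grouped by series and passed to the same area generator.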