Aaron Hipple

Last week, while helping another developer through the process of building a web component using Riot.js, I found myself speaking about a lot of restrictions that I tend to impose on my own programming that sounded arbitrary as I said them, particularly in the context of a small component.

One thing that seemed to both stick out and bind these restrictions together was the concept that almost everything you write when programming represents an API and that the general principles of API design are always worth considering. This can be a bit daunting, particularly to web developers who are accustomed to thinking of an API as, say, "a REST API": some large-scale piece of software architecture.

We needn't concern ourselves with that kind of scale – in the general sense, an API is the "shape" of a unit of software (a web service, a library, or even just one function), particularly as viewed from another unit of software. This "shape" encompasses both the inputs to the software in question as well as its outputs. And it's worth getting right on a small scale, because that's key to getting it right on the large.

General Principles

There are a huge number of "principles" that guide API design – I can't begin to list them all exhaustively. Just for a taste, I'll list a few that I'll discuss again in a moment:

The Single Responsibility Principle

that a given unit of software should be responsible for just one thing and not overlap in its responsibilities with other units any more than necessary.

The Principle of Least Astonishment

that a given unit of software should act exactly as expected and not surprise us with unexpected results (direct or otherwise).

The Fail-Fast Principle

that a given unit of software should fail explicitly before producing unexpected results.

The Principle of Loose Coupling

that units of software should be isolated from one another and interact with clarity through few interconnected channels. There shouldn't be a lot of moving parts between them.

In the real world, we often find ourselves weighing one principle against another – often we must simply do the best we can. They're not hard-and-fast rules, but they serve as really useful heuristics at all levels of software design.

An Example: Web Components

A web component may be thought of as an object with its own API. Lots of variations exist, but the abstract concept generally holds. Let's make a simple Riot.js tag.

<counter>
  <div>{ count }</div>
  <button onclick="{ addOne }">+1</button>
  <button onclick="{ subOne }">-1</button>

  <script>
    this.count = 0;
    this.addOne = () => ++this.count;
    this.subOne = () => --this.count;
  </script>
</counter>

We've defined here three primary endpoints to our component's API.

  • count is a number property representing the current count.

  • addOne is a function allowing us to increase the count by one.

  • subOne is a function allowing us to decrease the count by one.
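
Stripped of Riot's templating, the same three-member API can be sketched as a plain object. This is only an illustration of the "shape" being discussed; createCounter is a hypothetical name and not part of Riot.js or the tag above.

```javascript
// Hypothetical factory mirroring the <counter> tag's public API.
const createCounter = () => {
  const counter = { count: 0 };
  counter.addOne = () => ++counter.count; // increase the count by one
  counter.subOne = () => --counter.count; // decrease the count by one
  return counter;
};

const counter = createCounter();
counter.addOne();
counter.addOne();
counter.subOne();
console.log(counter.count); // 1
```

Anything that holds a reference to this object can read count directly, which is exactly the detail that becomes awkward once requirements change.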

A Change in Requirements

At first glance this seems like a simple and consumable API – but is it robust to change? Imagine that we're now required to communicate the count to and from a server. How do we do this with a minimum of fuss?

Often we'll do this by simply wrapping the remote API with our own. We might first refactor our view to read the count through a "getter", which helps isolate our component's API from changes in the remote API (embracing the principles of least astonishment and loose coupling).

this.addOne = () => this.setCount(this.getCount() + 1);
this.subOne = () => this.setCount(this.getCount() - 1);
this.getCount = () => apiConnection.getCount();
this.setCount = n => apiConnection.setCount(n);

From the perspective of our users (who we may imagine are the API consumers) our original API is unchanged save for count, which is now read through the getCount method rather than as a property. Had we exposed it as a method in the first place, we wouldn't have been required to alter our markup at all.
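
To try the wrapper without a server, here is a sketch that substitutes an in-memory stand-in for apiConnection. Both makeApiConnection and makeCounterApi are hypothetical names introduced for illustration, and the stand-in is synchronous only to keep the example short; a real remote API would almost certainly be asynchronous.

```javascript
// Hypothetical in-memory stand-in for the remote API (an assumption, not a real service).
const makeApiConnection = () => {
  let stored = 0;
  return {
    getCount: () => stored,
    setCount: n => { stored = n; },
  };
};

// The component-side wrapper: consumers only ever see these four functions.
const makeCounterApi = apiConnection => {
  const api = {};
  api.getCount = () => apiConnection.getCount();
  api.setCount = n => apiConnection.setCount(n);
  api.addOne = () => api.setCount(api.getCount() + 1);
  api.subOne = () => api.setCount(api.getCount() - 1);
  return api;
};

const counterApi = makeCounterApi(makeApiConnection());
counterApi.addOne();
counterApi.addOne();
counterApi.subOne();
console.log(counterApi.getCount()); // 1
```

Swapping the stand-in for a different backend would leave every call to addOne, subOne and getCount untouched.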

A Rabbit Hole: Functions

Functions have an API as well. The API of a function is most immediately described by its arguments and its return value, though it may be embellished in other ways. For example, in many languages functions can modify external state, or they may be bound to an execution context (this).
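
A small sketch of those two embellishments, using made-up names (log, doubleAndLog, reading) purely for illustration: in both cases the signature alone doesn't tell the whole story.

```javascript
// External state: the function's real API includes `log`, though the signature hides it.
const log = [];
const doubleAndLog = x => {
  log.push(x); // side effect on state outside the function
  return x * 2;
};

// Execution context: the result depends on what `this` is at call time.
const reading = {
  value: 5,
  read() { return this.value; },
};

console.log(doubleAndLog(4));                 // 8
console.log(log);                             // [ 4 ]
console.log(reading.read());                  // 5
console.log(reading.read.call({ value: 7 })); // 7
```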

Even without these complicating factors, a function's API can be complex. Let's imagine a couple of pure arithmetical functions defined like so:

const sum = (x, y) => x + y;
const sumAll = (...xs) => xs.reduce(sum, 0);

These functions are called in similar fashion, and under some circumstances return the same thing.

sum(1, 2);    // 3
sumAll(1, 2); // 3

There are key differences, though. sumAll will still return a numerical value when given zero or one argument(s).

sum();    // NaN
sumAll(); // 0

sum(1);    // NaN
sumAll(1); // 1

sumAll will also operate on more than two arguments.

sum(1, 2, 3);    // 3
sumAll(1, 2, 3); // 6

Both functions will operate on non-numerical values.

sum("foo", "bar");    // "foobar"
sumAll("foo", "bar"); // "0foobar"

sum(3, null, x => x);    // 3
sumAll(3, null, x => x); // "3x => x"

Clearly we've got some problems.

  • They don't fail-fast. On being given invalid inputs they return nonsensical (astonishing) results or worse yet, simply wrong results that are still usable. They're likely to create logical errors down the line.

  • sum, being a function that can concatenate strings as well as add numbers, seems to be in violation of the single responsibility principle. In addition, the responsibilities of sum and sumAll overlap, and the boundary between them isn't clearly drawn.

  • The functions are tightly coupled. sumAll's definition depends directly on sum, which we know to be unreliable.

Patching the Leaks

We can address some of these issues by firming up sum's API. We'll introduce some type checking prior to executing the function and throw an exception if it fails.

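// True only for actual numbers. Unlike the global isNaN, this doesn't coerce
// null, "" or numeric strings into numbers before checking them.
const isNumber = n => typeof n === "number" && !Number.isNaN(n);

const sum = (x, y, ...zs) => {
  if (!isNumber(x) || !isNumber(y) || zs.length > 0) {
    throw new Error(
      "sum requires exactly two numerical arguments."
    );
  }
  return x + y;
};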

By throwing errors on receiving any inputs other than two numbers, we've resolved a couple of major problems with sum:

  • It is now more fail-fast than it was – it throws a runtime error rather than producing invalid output. This doesn't help us when we write the code, but it does help us debug incorrect usages and prevent a cascading error that manifests as a harder-to-find bug.

  • It can no longer be used to concatenate strings. We're better obeying the single responsibility principle.
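
A few calls make the new behavior concrete. The sketch below uses a variant named strictSum (a hypothetical name) that checks typeof explicitly, so values such as null or numeric strings, which a coercing check like the global isNaN would let through, are rejected too. The throwsOn helper exists only for this demo.

```javascript
// Hypothetical variant of sum with explicit, non-coercing type checks.
const isNumber = n => typeof n === "number" && !Number.isNaN(n);

const strictSum = (x, y, ...rest) => {
  if (!isNumber(x) || !isNumber(y) || rest.length > 0) {
    throw new TypeError("strictSum requires exactly two numerical arguments.");
  }
  return x + y;
};

// Demo helper: report whether a call throws instead of letting it crash the script.
const throwsOn = (...args) => {
  try {
    strictSum(...args);
    return false;
  } catch (e) {
    return true;
  }
};

console.log(strictSum(1, 2));        // 3
console.log(throwsOn(1));            // true
console.log(throwsOn(1, 2, 3));      // true
console.log(throwsOn("foo", "bar")); // true
console.log(throwsOn(3, null));      // true
```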

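With that done, we'll tune up sumAll to address one of JavaScript's own surprises: Array.prototype.reduce passes four arguments to its callback (the accumulator, the current value, the index, and the array), which is somewhat surprising given the type signature of a typical reduce function. Since sum now rejects extra arguments, we wrap it so that only the first two are passed along.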

const sumAll = (...xs) => xs.reduce((x, y) => sum(x, y), 0);
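
The extra arguments are easy to observe directly. In this sketch, argCounts and exactlyTwo are names made up for the demonstration; exactlyTwo stands in for any callback that refuses more than two arguments.

```javascript
// Record how many arguments reduce hands to its callback on each step.
const argCounts = [];
const total = [10, 20].reduce((...args) => {
  argCounts.push(args.length); // accumulator, current value, index, source array
  return args[0] + args[1];
}, 0);

console.log(total);     // 30
console.log(argCounts); // [ 4, 4 ]

// A callback that rejects extra arguments therefore has to be wrapped.
const exactlyTwo = (x, y, ...rest) => {
  if (rest.length > 0) throw new Error("too many arguments");
  return x + y;
};
const wrappedTotal = [10, 20].reduce((x, y) => exactlyTwo(x, y), 0);
console.log(wrappedTotal); // 30
```

Passing exactlyTwo to reduce without the wrapper would throw on the first element, which is why the arrow function is there.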

We didn't resolve the coupling issue with sumAll, but now that sum presents a more consistent and predictable API, the ramifications of tight coupling are of less concern. For now we'll prefer that to duplicating the error checking code.

Overkill? Maybe, but software tends to grow and be used in ways that make the extra effort of really designing and documenting even small functions worthwhile.

Help Along The Way: Types

What if we had tools to help us clarify our APIs? As developers, isn't it our business to find ways to automate problem solving? Why should we have to solve these fundamental problems again and again?

In general, these are the sorts of problems that a strong, expressive type system aims to resolve. Such a system can embed a lot of useful information about an API into the software such that the compiler or interpreter can make certain guarantees about the software's behavior.

TypeScript

const sum = (x: number, y: number): number => x + y;
const sumAll = (...xs: number[]): number => xs.reduce(sum, 0);

TypeScript restores much of the original elegance of our functions while still providing safety by use of type annotations. The types of the arguments to sum are required to be number. These arguments aren't optional, and they're the only arguments TypeScript will permit. With this, we can gain confidence that the sum function will continue behaving as expected.

Haskell

sum = (+)
sumAll = foldl (+) 0

Haskell allows us to take things further still without sacrificing the guarantees we've sought for our API. It's able to infer the types of arguments with little or no intervention on our part – and if we violate the assumptions of said inference, it'll provide us with useful error messages (fail-fast).

Here's the same code with optional type annotations intact and explicit passing of arguments.

sum :: Num a => a -> a -> a
sum x y = x + y

sumAll :: (Foldable t, Num b) => t b -> b
sumAll xs = foldl sum 0 xs

There's a lot of information packed into this code, and the type annotations are particularly interesting. Here is sum's, read piece by piece.

sum ::        -- sum...
  Num a =>    -- ...for any `a` which is numerical...
  a -> a -> a -- ...takes two arguments of
              -- type `a` and returns a value of type `a`.

And here is sumAll's.

sumAll ::                 -- sumAll...
  (Foldable t, Num b) =>  -- ...for any `t` which is foldable
                          -- and any `b` which is numerical...
  t b -> b                -- ...takes a `t` containing type `b`
                          -- and returns a value of type `b`.

Foldable here is an example of a type class. We're primarily interested in lists, which do implement Foldable, but Haskell provides for the use of this function on other data structures with the same guarantees about its behavior (least astonishment).

It still makes me giddy when I get to write code that's concise, elegant, and safe to use all at once.