The Elements of Code

Recipe

Programs must be written for people to read, and only incidentally for machines to execute.

Hal Abelson and Gerald Sussman, Structure and Interpretation of Computer Programs

Rule: Structure Code Sequentially

Programs should read like a recipe. Start at the top, and work your way down line by line. Error conditions should be indented so that they stand out as error cases.

Patterns and Instructions

Good computer programs follow the same pattern that recipes have used since the 18th century: list ingredients at the top, followed by step-by-step instructions on how to interact with those ingredients to prepare the final product. Sections for related steps are broken into paragraphs.

Recipes are organized this way to improve readability and comprehension. Before they were written according to a standard, recipes would often skip steps and intermix ingredient quantities with the usage of those ingredients. The lack of formalism meant recipes were rarely used except by professional chefs who already had the context necessary to interpret the steps.

This is similar to where we find ourselves as programmers. Many projects and code bases are structured inconsistently, with object construction embedded in application logic (see Chapter 4, “New”), run-on functions, and complex conditional logic. These projects are understandable to their creators and those who have spent significant time acquiring the necessary context, but not to anyone else.

We can avoid this by following the lessons learned by recipe-makers who needed to create instructions that could be readily understood by everyone.

Bake An Apple Pie

The standard recipe format has two sections: ingredients, and directions.

Ingredients are analogous to the “building” (construction graph, see Chapter 4, “New”) part of a program. They are the dependencies, the prerequisites that must be met before the program can be executed.

Directions are the “doing” part of a program, and are responsible for the primary work that results in the important output and behavior.

To examine this approach for creating clear instructions, let’s take a look at a recipe to bake an apple pie:

Ingredients:

1 pastry for a 9 inch double crust pie
½ cup unsalted butter
3 tablespoons all-purpose flour
¼ cup water
½ cup white sugar
½ cup packed brown sugar
8 Granny Smith apples - peeled, cored and sliced

Directions:

Step 1
Preheat oven to 425 degrees F (220 degrees C).
Melt the butter in a saucepan. Stir in flour to form a paste.
Add water, white sugar and brown sugar, and bring to a boil.
Reduce temperature and let simmer.

Step 2
Place the bottom crust in your pan.
Fill with apples, mounded slightly.
Cover with a lattice work crust.
Gently pour the sugar and butter liquid over the crust.
Pour slowly so that it does not run off.

Step 3
Bake 15 minutes in the preheated oven.
Reduce the temperature to 350 degrees F (175 degrees C).
Continue baking for 35 to 45 minutes, until apples are soft.

Since a program is executed by computers, it must be specific and detailed, but the idea is essentially the same as a recipe. Writing the apple pie recipe as code would look something like this:

import appliances
import pantry
import recipes
import time
import utensils

def main():
    kitchen_range = appliances.Range()
    kitchen_range.oven.preheat(425)

    saucepan = utensils.Saucepan()

    ingredients = pantry.fetch_ingredients(
        "butter", "flour", "water",
        "white_sugar", "brown_sugar", "apples"
    )

    fridge = appliances.Refrigerator()
    rolling_pin = utensils.RollingPin()

    crust_dough = prepare_dough(
        recipes.Dough(
            ingredients['butter'], 
            ingredients['flour'], 
            ingredients['water'],
        ),
        fridge,
        rolling_pin
    )
    lattice_dough = prepare_dough(
        recipes.Dough(
            ingredients['butter'], 
            ingredients['flour'], 
            ingredients['water'],
        ),
        fridge,
        rolling_pin
    )

    cook_filling(
        saucepan,
        kitchen_range.front_right_burner,
        ingredients['butter'],
        ingredients['flour'],
        ingredients['water'],
        ingredients['white_sugar'],
        ingredients['brown_sugar']
    )

    pie_pan = utensils.PiePan()

    unbaked_pie = fill_pie(
        pie_pan,
        crust_dough,
        lattice_dough,
        ingredients['apples'],
        saucepan
    )

    pie = bake(kitchen_range.oven, unbaked_pie)
    return pie

def prepare_dough(dough, fridge, rolling_pin):
    fridge.chill(dough)
    rolling_pin.roll(dough)
    return dough

def cook_filling(
    saucepan, burner, butter, flour, 
    water, white_sugar, brown_sugar
):
    burner.temperature("medium")

    saucepan.add(butter)
    saucepan.wait_until("melted")
    saucepan.add(flour)
    saucepan.stir_until("paste")
    saucepan.add(water, white_sugar, brown_sugar)

    burner.temperature("high")
    saucepan.wait_until("boil")

    burner.temperature("low")
    return saucepan


def fill_pie(pie_pan, crust, lattice, apples, saucepan):
    pie_pan.add(crust)
    pie_pan.add(apples)
    pie_pan.add(lattice)
    # As in the recipe, the liquid is poured over the lattice crust.
    pie_pan.add(saucepan.pour())

    return pie_pan

def bake(oven, pie_pan):
    oven.bake(pie_pan, 15)

    oven.set_temp(350)

    oven.bake(pie_pan, 35)

    # Check once a minute for up to 10 more minutes (60 minutes in total).
    for _ in range(10):
        if pie_pan.is_baked():
            return pie_pan
        time.sleep(60)

    raise Exception("Pie not baked after 60m in oven!")

The code reads as a sequence of clear, linear steps that must be taken to accomplish the task. Rather than a single function, it uses multiple functions acting like paragraphs.

Note the use of whitespace and blank lines. Having blank lines between sections allows the reader room to focus on those sections, and indicates which behaviors are conceptually related.

Program Structure

First, programs should list the “ingredients” necessary to accomplish the task.

This is the “building” phase, which we will cover in Chapter 4, “New”.

def main():
    a = A()
    b = B()
    c = C()

Next, a high-level function should be created that runs the primary steps. This is similar to the way directions are broken down into “Step 1”, “Step 2”, etc.

def run(a, b, c):
    step1(a)
    step2(a, b)
    step3(b, c)

Finally, each “step” function is similar to a “paragraph”. It is a short sequence of instructions necessary to accomplish a sub-goal within the overall task.

def step1(a):
    translate(a)
    rotate(a)
    a.format()
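Putting the three pieces together, a complete program might look like the minimal sketch below. The Order class, the tax rate, and the step names are hypothetical, chosen only to give the placeholders something concrete to do.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical "ingredient": item prices in cents.
    prices: list
    total: int = 0
    receipt: str = ""


def main():
    # Building: gather the ingredients.
    order = Order(prices=[250, 400, 125])
    tax_rate_percent = 10

    # Doing: hand the ingredients to the directions.
    run(order, tax_rate_percent)
    return order


def run(order, tax_rate_percent):
    # Each call is one "Step N" of the recipe.
    add_up(order)
    apply_tax(order, tax_rate_percent)
    write_receipt(order)


def add_up(order):
    order.total = sum(order.prices)


def apply_tax(order, tax_rate_percent):
    # Integer arithmetic keeps the total in whole cents.
    order.total += order.total * tax_rate_percent // 100


def write_receipt(order):
    order.receipt = f"Total due: {order.total} cents"
```

Reading main tells you what the program needs, reading run tells you what it does, and each step function fills in one paragraph of detail.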

Often, those step functions need their own steps, and should be broken down further. A good rule of thumb is to split nested elements into their own function. In the example below we have a few such elements:

def step1(a):
    for e in a:
        sub1(e)
        sub2(e)
        sub3(e)
        for i in e.items():
            sub1(i)
            sub4(i)

Applying our approach, this becomes:

def step1(a):
    for e in a:
        process_e(e)

def process_e(e):
    sub1(e)
    sub2(e)
    sub3(e)
    for i in e.items():
        sub1(i)
        sub4(i)

As code becomes more indented, it becomes more difficult to reason about the layers of indentation and their interactions, increasing the program’s MTC.

Let's Get Technical

Indentation refers to all nested code blocks. In many languages, indentation can be avoided through the use of semicolons to end statements and odd formatting. This does not make the code any less confusing.

Breaking those indented code blocks out into their own functions can clarify what is actually happening within the blocks, and which variables are used by which blocks. This is not about the length of the function, and there are no targets with regard to length. This is about the impact that indentation has on MTC, and ways to keep that impact low. If putting indented blocks into their own functions results in many tiny functions, it may increase MTC. In such cases, it is better to leave those blocks in a single longer function.
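To illustrate how extraction exposes which variables a block uses, here is a small sketch. The account and invoice fields are hypothetical. In the nested version the innermost block can see every local variable. In the extracted version, the signature of overdue_for_account states exactly what the block depends on.

```python
def total_overdue_nested(accounts, today):
    # Four levels of indentation; the inner block can touch any local.
    total = 0
    for account in accounts:
        if account["active"]:
            for invoice in account["invoices"]:
                if invoice["due"] < today:
                    total += invoice["amount"]
    return total


def total_overdue(accounts, today):
    total = 0
    for account in accounts:
        total += overdue_for_account(account, today)
    return total


def overdue_for_account(account, today):
    # The signature shows this block needs only one account and the date.
    if not account["active"]:
        return 0

    overdue = [
        invoice["amount"]
        for invoice in account["invoices"]
        if invoice["due"] < today
    ]
    return sum(overdue)
```

Both versions compute the same result. Whether the second is easier to read depends on the surrounding code, which is why this is a judgment about MTC rather than a rule about function length.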

Nested Function Calls

There is a common pattern in software where some functionality must be chained together to accomplish a task. This can result in nested function calls:

def handle_user_name_update(user_id, name):
    publish(update_name(get_user_by_id(user_id), name))

Let's Get Technical

Functional programming languages often have a "pipe" operator to chain function calls without needing to nest them. This maintains readability without creating additional variables.

In the above example, we retrieve a user, update the name property, and then publish the result, all on a single line.

We read from left to right, but in nested function calls, the leftmost call is the last one to execute, not the first, which is confusing. Even working out which arguments belong to which function is cumbersome. The result is code that is difficult to reason about, and increases MTC.

Instead, split the calls into multiple lines:

def handle_user_name_update(user_id, name):
    user = get_user_by_id(user_id)
    result = update_name(user, name)
    publish(result)

By performing every operation on its own line, it takes less time to understand the order of function execution and how the parameters are associated. One level of nesting is usually acceptable, but it is situational. In general, avoid nested function calls.
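Python has no pipe operator, but the chaining style mentioned in the sidebar above can be approximated with a small helper. This is a sketch only: pipe is a hypothetical helper, not a standard library function, and the three user functions are stand-ins so that the example runs.

```python
from functools import reduce


def pipe(value, *functions):
    """Pass value through each function in turn, left to right."""
    return reduce(lambda result, function: function(result), functions, value)


# Hypothetical stand-ins for the functions used in the example above.
published = []


def get_user_by_id(user_id):
    return {"id": user_id, "name": "old name"}


def update_name(user, name):
    return {**user, "name": name}


def publish(user):
    published.append(user)
    return user


def handle_user_name_update(user_id, name):
    # Steps are listed top to bottom in the order they execute.
    return pipe(
        user_id,
        get_user_by_id,
        lambda user: update_name(user, name),
        publish,
    )
```

The lambda needed for the two-argument update_name shows the trade-off. In a language without a built-in pipe operator, named intermediate variables are often the clearer choice.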

Error Cases

A useful pattern, popularized by idiomatic Go, is to left-align the happy-path code, and indent the error-path code.

Visually, this results in code where error cases are trivial to spot, and can be understood rapidly.

However, this is rarely our train of thought when programming. Often, we think about problems in terms of what we can do, and then our code follows those thought patterns. For example, without using the indented error principle:

def process(event):
    if is_valid(event):
        log.info(event)
        event.execute()
    else:
        raise InvalidEvent(event)

We initially check to see if we can do a thing, and then we proceed to do it if the check was successful. The error case is listed at the bottom of the function as an afterthought.

If we follow the indented error principle, we would first check the error condition at the top of the function, and thus ensure the error case is indented and the happy-path code is left-aligned.

def process(event):
    if not is_valid(event):
        raise InvalidEvent(event)

    log.info(event)
    event.execute()

This model scales wonderfully, particularly in cases where there are multiple error conditions throughout a function. A function can be rapidly scanned, and all the error conditions identified. If there is a known bug, it is often much simpler to spot it in functions that follow this pattern, since the misbehaving case can quickly be located.
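A sketch of the same principle with several error conditions follows. The Event fields and the exception classes are hypothetical, invented for illustration. Every indented block is an error case, and the happy path runs down the left margin.

```python
from dataclasses import dataclass


class InvalidEvent(Exception):
    pass


class ExpiredEvent(Exception):
    pass


class UnauthorizedEvent(Exception):
    pass


@dataclass
class Event:
    kind: str
    expires_at: int
    authorized: bool


def process(event, now):
    if not event.kind:
        raise InvalidEvent("event has no kind")

    if event.expires_at <= now:
        raise ExpiredEvent(f"event expired at {event.expires_at}")

    if not event.authorized:
        raise UnauthorizedEvent("event is not authorized")

    # Happy path: reached only when every check above has passed.
    return f"executed {event.kind}"
```

Written with nested if/else blocks instead, the same three checks would push the happy path three levels deep and separate each error from the condition that triggers it.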

Conclusion

These structural rules result in consistent code that can be easily understood and maintained.

In fact, because comprehension is enhanced to such a degree when following this structure, bugs can often be identified simply by reading the code, without any execution required by a computer. Given the general complexity of most programs these days, that in itself is an astounding benefit.

Keep in mind that the goal is readability. If splitting functions increases MTC and complexity, consider the following ways to make the code still read like a recipe:

  1. Check for unnecessary nesting. Perhaps there are ways to rewrite the function to reduce the indentation, such as following the indented error principle.
  2. See if the arguments are related enough that they could be bundled into clearer data structures.
  3. Follow the guidance in Chapter 6, “If”, to reconstruct the program with a more linear flow.
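As one illustration of the second suggestion, the seven-argument cook_filling function from the apple pie example could take its ingredients as a single bundle. This is a sketch under assumptions: FillingIngredients is a hypothetical name, the Saucepan and Burner classes are minimal stand-ins so the code runs, and the waiting and stirring steps are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FillingIngredients:
    # Hypothetical bundle of the five ingredients the filling needs.
    butter: str
    flour: str
    water: str
    white_sugar: str
    brown_sugar: str


class Saucepan:
    """Stand-in that records what was added, in order."""

    def __init__(self):
        self.contents = []

    def add(self, *items):
        self.contents.extend(items)


class Burner:
    """Stand-in that remembers the most recent setting."""

    def __init__(self):
        self.setting = "off"

    def temperature(self, setting):
        self.setting = setting


def cook_filling(saucepan, burner, filling):
    burner.temperature("medium")

    saucepan.add(filling.butter)
    saucepan.add(filling.flour)
    saucepan.add(filling.water, filling.white_sugar, filling.brown_sugar)

    burner.temperature("low")
    return saucepan
```

The call site shrinks from seven arguments to three, and the bundle gives the related values a name that can be passed around as a unit.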