Testing the boundaries of collaboration

It’s 2030. A programmer in Lagos extracts a helper method. Seconds later, the code of every developer working on the program around the world updates to reflect the change. Seconds later, each of the thousands of servers running the software updates. Seconds later, the device in my pocket in Berlin updates, along with hundreds of millions of other devices across the globe.

Perhaps the most absurd assumption in this story is that I’ll still have a pocket in 10 years. (Or that I’ll “carry” a “device.”) But the instant deployment of tiny changes from thousands of developers is just the continuation of decades-old trends in software development collaboration.

Welcome to collaborative software development, Limbo-style: a future of software development collaboration. (I’m not arrogant enough to call it the future—I’ve seen too much by now to assume that I know where all this is going.)

The hidden cost of code review

It’s 2016. I’m in my fifth year at Facebook coaching programmers. I’ve seen the engineering organization grow from 700 to 5,000 developers. Scale is the hard part about Facebook. We took a leading role in Mercurial development just so that we could be sure it could handle all our code in one repository.

I want to make an impact, but I know that I’ll have trouble doing so if I do what everyone else is doing. Plenty of people in the organization are looking at how to get from 5,000 to 10,000 developers. However, I can find no one trying to figure out how to handle 100,000 developers working on the same system. (Protip: When looking for juicy problems, follow an established trend further than any sensible person would.) Ideas need time to ripen, so I stick the 100,000-developer problem in the back of my mind and get back to coaching engineers.

While coaching, I notice just how much time students lose because of the latency and variance created by our blocking, asynchronous code review workflow. In what becomes the industry-standard collaboration workflow, every change to Facebook code requires an independent reviewer to approve it before it can go into production.

Some students juggle several projects at once so that they can continue coding while waiting for a review. Others stack up several changes, betting that the first one won’t have to be revised in ways that will ripple forward. Others just pack more and more into a change to amortize the cost of review delay.

Everyone agrees that small changes are the way to go, but the overhead per change forces programmers into a trade-off. Make the changes too small and you spend all your time waiting for code reviewers. Make the changes too big and you also spend all your time waiting for someone to review and confidently endorse a gigantic change. This phenomenon isn’t scrutinized: It’s just the price we pay for working together.

But my engineer nose twitches, as it does when a problem is about to go nonlinear. The bigger the organization, it seems to me, the bigger the hidden costs of blocking, asynchronous code review.

How low can Limbo go

It’s 2017. I’m still coaching at Facebook when two experiences pull the question of scaling collaborative workflows out of the dusty back rooms of my brain.

First, I measure that the distribution of diff sizes is the same regardless of programming language (confounding my hypothesis). The only outlier is one service where each change is rolled out as a separate deployment. (Changes are batched in the deployment of other services.) Those changes tend to be smaller. Small is good, remember? Hmm . . .

Second, I run a personal project, a Smalltalk virtual machine, where I keep changes tiny. I notice that almost all of the changes are safe. Faced with a hard behavioral change, I spend significant effort changing the structure in small, safe steps, so that the hard change becomes easy. If most changes are safe, then they can be deployed immediately, saving the heavy validation horsepower for a small percentage of changes.

From Wikipedia: The universe of the Game of Life is an infinite, two-dimensional orthogonal grid of square cells, each of which is in one of two possible states: alive or dead (or populated and unpopulated, respectively). Every cell interacts with its eight neighbors, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

Any live cell with fewer than two live neighbors dies, as if by underpopulation.
Any live cell with two or three live neighbors lives on to the next generation.
Any live cell with more than three live neighbors dies, as if by overpopulation.
Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.
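
The four rules above can be sketched in a few lines of Python. This is a minimal illustration, not anything from the project described here: the function name `step` and the choice to represent the board as a set of live `(x, y)` coordinates are assumptions made for the example.

```python
from collections import Counter


def step(live):
    """Advance one generation. `live` is a set of (x, y) coordinates of live cells."""
    # For every cell adjacent to at least one live cell, count its live neighbors.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next generation if it has exactly three live neighbors,
    # or if it is alive now and has exactly two. Every other cell is dead.
    return {
        cell
        for cell, count in neighbor_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


# A horizontal row of three cells (a "blinker") flips to a vertical row.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))  # [(1, 0), (1, 1), (1, 2)]
```

Storing only the live cells keeps the grid effectively unbounded, which matches the infinite board in the definition.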

As I program my Smalltalk project, I begin to have literal visions of software development collaboration as Conway’s Game of Life, the cellular automaton created by the mathematician John Horton Conway. In my version, each cell is a programmer. When a programmer makes a change, their cell changes color. Changes ripple outward to other developers, sometimes crashing and interfering but usually just going off the edge of the map.

Madness. Chaos warning. Changes propagating, joining, merging, crashing. How could a programmer work in this environment?

As I live with the discomfort of this model, I realize that I haven’t gone far enough. What if, instead, not just programmers but all production machines are on the game board? (And I thought I was outrageous before!)

I tell my friend Saurav Mohapatra, a comics author and software engineer, about this model. The smaller the changes, the better the whole scheme seems to work. He suggests the name “Limbo” because, as the song asks, how low can you go? That is, how small a change is too small? I don’t have an answer, but at this stage of the idea, that’s okay.

Limbo is clearly impractical: a model where all changes are tiny—and instantly deployed. But “impractical” is a compliment when applied to a new idea. Revolutionary innovation comes of making the impractical practical. (Practical work is also necessary for evolution, but it’s not really my bag.)

Chewing on this idea calls forth related and supporting ideas. (As writer Basil King puts it, “Go at it boldly, and you’ll find unexpected forces closing round and coming to your aid.”) I have grand ideas for rebuilding the programming toolchain based on abstract syntax tree transformations. I even build an editor based on it, called Prune, with fellow Facebook software engineer Thiago Hirai.

Rewriting editors and indexers and version control and build tools and deployment tools is a lot of work, though. If I want to see Limbo in my lifetime, I need to get lucky.

Test, commit, revert

It’s 2018. My longtime collaborators at Iterate, an innovation consultancy in Oslo, Norway, invite me to host a code camp that November: a week of nothing but coding, and a way for geeks to stretch their legs. As we sip coffee in our borrowed conference room on the first Monday of camp, I explain my vision for Limbo. Then I show the students (Lars Barlindhaug, Oddmund Strømme, and Ole Johannessen) a workflow that I’ve been using for years. Every time the tests pass, I create a little commit so that I can easily get back to a known good state.

Strømme, one of the few programmers I’ve ever met who is as obsessed with symmetry as I am, says, “If we commit when the tests pass, then we must revert when the tests fail.”

I hate this idea. “You mean if I make one little mistake, all of my changes would just—*poof*— disappear?” Yes, I hate this idea. It’s also cheap to try, so we do. There’s a particular shiver I get when encountering a bad idea so bad that it might be good. I feel that shiver now.

Before we start our coding project, we implement our new workflow, which we call test && commit || revert, or TCR:

python sample.py && git commit -am "working" || git reset --hard
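
For readers who want the branching spelled out, here is a rough Python sketch of the same idea. The function name `tcr` and its `run` parameter are hypothetical, introduced only so the logic can be exercised without touching a real repository. It also differs from the one-liner in one respect: with `&&` and `||`, a failed commit would fall through to the reset as well, whereas this sketch reverts only when the tests fail.

```python
import subprocess


def tcr(test_cmd, run=subprocess.run):
    """Run the tests; commit if they pass, revert if they fail.

    `test_cmd` is the test command as a list of arguments.
    `run` is injectable (an assumption of this sketch) so the flow
    can be checked with a stand-in instead of real git commands.
    Returns "commit" or "revert" to say which branch was taken.
    """
    tests_passed = run(test_cmd).returncode == 0
    if tests_passed:
        # Record a known good state. `-a` stages tracked files only,
        # so brand-new files still need an explicit `git add`.
        run(["git", "commit", "-am", "working"])
        return "commit"
    # Discard changes to tracked files; untracked files are left in place.
    run(["git", "reset", "--hard"])
    return "revert"
```

Calling `tcr(["python", "sample.py"])` inside a repository would mirror the command above.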

Then we code a sample project: a testing framework. Every time we make a mistake, as expected—*poof*—our changes disappear. At first, the disappearing code startles us. Then we notice our programming style changing.

Initially, we just make the same change again, but the computer eventually out-stubborns us. Then, if we’ve been making changes and we aren’t sure if we’ve broken anything, we just run the command line. If we disagree with the computer about whether we’ve made a mistake, we figure out how to make the change in several smaller steps.

We know that we’re on to something with this new workflow. Despite its simplicity and similarity to test-driven development, it creates intense incentives for making changes in small steps.

With a couple days’ practice, we gain confidence in our shiny new TCR skills. TCR incentivizes us to create an endless stream of tiny changes, each of which results in working software. Remind you of anything? My idea for Limbo also relies on an endless stream of tiny changes, each of which results in working software. Does this mean that we’re ready to Limbo?

Turns out (the most exciting words in engineering): Yes. We create a repository on GitHub, clone it to two machines, then execute a loop on each machine:

while true; do
    git pull --rebase
    git push
done

We start working on our sample program separately, as two pairs. In spite of deliberately avoiding any explicit coordination, and in spite of the tiny codebase, we find merge conflicts rare and cheap. Every once in a while, we’re writing some code and—*poof*—it disappears, to be replaced by whatever the other pair has changed. We only lose a few seconds of work, so it never feels like an imposition; instead, it feels more like an update that we appreciate seeing before we get any further.

The incentive to make changes in tiny steps (built into TCR) is amplified in Limbo-style collaboration. The first pair to finish and commit doesn’t risk having their code poofed. And if you don’t want your changes to disappear because of someone else’s activity, make your changes in even smaller steps.

The disaster that wasn’t

It’s 2019. Blocking, asynchronous code reviews are the dominant method for collaboratively developing software. If I draw one certain conclusion from my experiments with TCR and Limbo, it’s that blocking, asynchronous code reviews are not the only effective workflow for collaboration. While I don’t know if TCR and/or Limbo will be the future, I think that something different is coming.

A handful of bloggers and screencasters have replicated TCR in various languages and programming environments. Usage, so far, is confined to wild-eyed pioneers, but momentum is gathering. If you’re the sort of person who likes to try out new programming workflows for fun, TCR awaits you. If you like smooth, professional tool support, then you’ll need to write your own.

As of this writing, Limbo has only been tried to the degree I’ve described here. (Please let me know if you’ve tried it with more independent streams of changes!) It should have been horrible and a disaster—and it wasn’t. I get more excited by ideas that should be disasters and aren’t than by ideas that should work and do. But that’s me.

If you want to contribute to the real-time software development collaboration wave, here’s where to begin, with some of the reasons why Limbo is impractical:

Instability. Constant changes mean breaking all the time. Right?
Bandwidth. Constant changes mean too much bandwidth consumed distributing changes. Right?
Security. Constant changes mean bad actors slipping in bad changes. Right?
Latency. Constant changes mean spending forever waiting for feedback. Right?
Tools. Constant changes mean completely new toolchains. Right?
Training. Constant changes mean new design, testing, tooling, and collaboration skills. Right?
Culture. Constant changes mean new social structures of programming, and programmers aren’t social. Right?

Transform the impracticalities into practicalities and you’ll really have something special. Even make a little progress and you’ll have progress worth the cost.

That program I’m using in 2030? It has been written by 100,000 programmers around the world. Some programmers came to it as experts and contributed immediately. Others used their work on it to fuel learning. Along the way, new social and economic structures evolved to support continuing work. One thing, though, was constant—tiny changes, instantly deployed.
