The epistemology of software quality

Say you run a new team. You have carte blanche to implement any policies you want to make the people more productive and the code less buggy. What do you do?

Careers have been built selling the answer. Take up pair programming! Switch to Haskell! Use UML for everything! These techniques get their own books and conferences. But are they worth the effort? How long till they take effect? Do they even work at all?

¯\_(ツ)_/¯

These questions are important, though they aren’t unique to software engineering. How do we tell whether something will solve our problems? We could talk to experts, but experts disagree. We could rely on our own experiences, but those are limited. (Nobody has tried and compared everything.) We could survey people, but “popular” is not the same as “correct.” (Almost half of Americans consider astrology a science.) So how do any of us really know what we know?

Hopefully, as a field matures, scientific study and empirical research eventually replace folklore. We’re still in the early days of software engineering (compared to, say, mechanical engineering), but so far, few of the technical solutions we’ve studied impact software quality in a meaningful way. Static typing? One study, presented at FSE 2014, found no evidence that static typing is helpful, or that it is harmful. Code standards and linters? Another paper, shared at ICSM 2008, found these can make things worse. Code review? Okay, now that, according to a 2016 article published in Empirical Software Engineering, actually works. But we can’t stake our team’s success on just “more code reviews.”

I strongly believe that technical solutions do help. But we often frame our choices of technical tools and processes as critical decisions, ones that make or break a team. Prominent industry voices may claim that Lisp is “a secret weapon,” that Python users are “unethical,” or that you must use TDD to be “professional.” But if these technical solutions mattered so much, wouldn’t we see that reflected in the research? Instead, we see minor or tentative effects in some situations, and major effects in none. Technical solutions are not a hill worth dying on.

It’s not all bad news. Empirical evidence consistently shows that some factors do make a difference, factors that don’t just dramatically affect code quality—but us. We only have to broaden our idea of what really matters in making software. Instead of technical factors, we need to talk about human ones.

How much sleep do you get per night? When was the last time you worked more than 40 hours a week? Are you happy at your job? These are the questions that most impact software quality. Studies across disciplines consistently show that the difference between technical and human solutions is the difference between results that effectively state “We speculate there is a small impact” and “We are confident there’s a dramatic difference.” These findings make sense: Programming is an extension of our minds, and anything that compromises our minds will hurt our programming skills.

So what is this evidence? Glad you asked. Some of the most clearly documented factors are sleep, hours worked, and stress. Here are just a few of the many, many studies out there.

Sleep

There are two kinds of sleep deprivation. We often think of sleeplessness as acute sleep deprivation (ASD): being awake for 24 hours or more. Most people already know that’s bad. Novice programmers, according to a 2018 study from IEEE Transactions on Software Engineering, lose most of their skills while experiencing ASD, and we can reasonably assume senior developers (a.k.a. other human beings) aren’t immune either. ASD also affects decision-making ability as well as long-term health, according to a 2007 Neuropsychiatric Disease and Treatment article.

More subtle is chronic sleep deprivation (CSD), getting less than enough sleep several nights in a row. The 2007 article demonstrates that CSD reduces mental performance across the board. Worse, it can take several recovery nights, sleeping more than eight hours a night, to completely reverse CSD.

Of course, degraded performance on tests doesn’t necessarily translate to degraded performance at work. That’s why studies have also observed sleep-deprived people in the workplace. The 2008 book Patient Safety and Quality: An Evidence-Based Handbook for Nurses states that sleep-deprived nurses make more serious mistakes, full stop.

To add insult to injury, most sleep-deprived people don’t know they’re performing worse, according to the same 2007 Neuropsychiatric Disease and Treatment article. Individual developers can’t necessarily tell that they’re making more programming mistakes, which makes it harder to self-regulate. That means the dangers are all the more subtle, long-term, and easy to miss.

Hours worked

On a daily basis, we’ve got about eight hours of work in us. Possibly less. According to a 2017 report from the Institute of Labor Economics, call-center employees find their quality of service plummets after about four hours. There are indications that, when we work long hours, our productivity also nose-dives: For example, a 1980 Business Roundtable report found that construction crews working 50 hours a week declined to less than 80 percent productivity after 8 to 10 weeks. In other words, they only accomplished as much each 50-hour week as a well-rested, well-paced team would do in 38 hours. Productivity drops even faster when you do 60-hour weeks, with researchers estimating that, after two months of 60-hour weeks, construction crews will have cumulatively accomplished less than with two months of 40-hour weeks. And, according to a 2004 publication from the Centers for Disease Control and Prevention (CDC), all that extra work wrecks your body.
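The arithmetic behind those overtime figures is simple enough to sketch. The snippet below is only an illustration, not anything taken from the cited reports: the function names are made up for this example, and the 76 percent productivity figure is an assumed value chosen because it reproduces the 38-hour equivalence (the report itself is quoted only as "less than 80 percent").

```python
def equivalent_hours(hours_worked: float, productivity: float) -> float:
    """Work accomplished in a week, expressed in fully productive hours.

    `productivity` is a fraction of normal output per hour (1.0 = well rested).
    """
    return hours_worked * productivity


def break_even_productivity(long_week: float, normal_week: float = 40.0) -> float:
    """Average productivity below which a long week yields less than a normal one."""
    return normal_week / long_week


# Assumed value: 0.76 is illustrative, picked to match the 38-hour figure.
fifty_hour_output = equivalent_hours(50, 0.76)

# For 60-hour weeks to lose to 40-hour weeks over the same period,
# average productivity has to fall below this fraction.
sixty_hour_threshold = break_even_productivity(60)

print(f"50-hour week at 76% productivity: {fifty_hour_output:.1f} effective hours")
print(f"60-hour weeks lose to 40-hour weeks below {sixty_hour_threshold:.0%} productivity")
```

Put this way, the 60-hour claim is less surprising than it sounds: the crew only has to average under two-thirds of its normal hourly output for the extra 20 hours to be a net loss.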

Stress

Finally, consider stress. It’s harder to focus when you’re anxious or angry or distracted. It also stands to reason that you aren’t as productive or as meticulous when you’re stressed out. The National Institute for Occupational Safety and Health, a part of the CDC, has shown that stressed nurses are both significantly less productive and significantly more likely to make serious mistakes. Closer to home, Gamasutra did a 2015 study on crunch mode—which they define as extended overtime—in video game development. They found that games produced in crunch mode not only burned out their development teams but also performed worse on every other aspect measured—critical scores, overall sales, everything—compared to games whose teams cut scope in order to meet deadlines, or opted to extend the production schedule. Meanwhile, a 2014 study published in PeerJ found that happy developers just straight-up solve problems faster.

On the one hand, a few studies indicate that technical solutions like language choice and testing practice may have some impact on an engineering team’s work quality. On the other hand, volumes and volumes of studies show that sleep, workload, and stress have dramatic impacts on performance. So why do blog posts, conference presentations, and software engineering books talk so much about the former and so rarely about the latter? Why aren’t factors like sleep, hours worked, and stress the topics we think of first?

Work-life balance and wellness impact us in a subtler way than technical practices do. It’s easy to point to a bug and say, “This couldn’t have happened in Rust.” It’s a lot harder to point to a bug and say, “This wouldn’t have happened if the programmer wasn’t stressed out and sleep-deprived.” There’s no feedback loop that pushes developers away from too much stress and too little sleep.

In contrast, there are plenty of things that push us toward it. Things like scope creep. Like being understaffed. Like production fires, or last-minute changes, or an upcoming release, or bad bosses, or company culture. Because it’s easier to ask more of people than to address the sickness in the system.

And reducing stress, up front, is expensive. Subsidizing daycare or hiring an extra developer comes out of the budget. But while the long-term expense of a stressful workplace is even greater, it does not appear as a line item in the accounting. As such, companies are incentivized to harm their employees—and eventually suffer for it.

Say you run a new team. You have carte blanche to implement any policies you want to make the team more productive and the code less buggy. What do you do? You could choose a new programming language, or switch everything to microservices, or follow the hottest trend in process. Or you could do things that matter. You could pace schedules. You could ensure that no one works more than eight hours a day. You could let people off 20 minutes early to avoid rush hour. You could make it as easy as possible for parents to take the day off when their kids are sick. You could make people feel like they’re important, because people are important. No method, tool, or language matters nearly as much as our own minds.
