Scrum Tensions: Code Review

There are tensions in Scrum anywhere you have intra-sprint cycles that must resolve by the end of the sprint. In my last post I wrote about manual quality assurance and the tension that causes in a Scrum setting. Another one of these intra-sprint cycles that most Scrums teams include in their process is code review.

You could, in a way, consider code review to be another form of "QA". In both cases we're initiating a cycle-within-a-sprint with a loop of approval, rejection, re-work, and acceptance.

In the sense that Scrum teams strive to get every sprint backlog item to "done" by the end of the sprint, back-and-forth cycles kill sprints. But they're also essential. Therein lies the tension.

Most Scrum teams conduct estimation sessions prior to starting a sprint. The estimates that a team assigns to product backlog items are typically based on a gut feel of how long it will take or how complicated it will be to "complete" a backlog item. In most teams I've worked with, that estimate is implied to mean how long it will take a software engineer on the team to submit a code review (via a pull request or something similar), i.e., when the engineer thinks their part is "done". Some teams also build into the estimate a duration or complexity estimate based on the work necessary by the QA people.

What estimates don't ever capture, in my experience, is how much effort/time/complexity is required to complete the code review or QA cycles on a backlog item. How can we know ahead of time how many pieces of feedback are going to be left on a pull request? How many of those will necessitate re-work on the item before the end of the sprint? How long will the re-work take? What if the re-work is still not up to snuff? Etc., etc., etc. Each code review starts a cycle of unknown duration.

And all the while everyone who cares about the completion of the sprint is tap-tap-tapping their fingers, nervously wondering if these intra-sprint cycles are going to complete on time.

Even though people mean well, what happens in practice is that thorough code review is tacitly disincentivized. Most product owners are not engineers, and although they understand intuitively that code review, like any form of quality assurance, is necessary, it also adds delay to the immediate "done-ness" of work.

Engineers also know that code reviews are important, but there is always pressure to get to them faster. When bugs crop up later, or the codebase slowly accumulates technical debt, no one in management is going to track down the specific pull request that introduced the problem, read the list of names in the approvers list and assign blame to those individuals. A swift click on the "Approve" button satisfies everyone in the moment.

What do we do about this tension? We can't have code review without introducing an unpredictable amount of delay to each sprint backlog item. You can never fully resolve this tension, in my opinion. But there are some ways to ease the tension.

Be Your Own Reviewer

I'm going to put this one on the engineers. And, no, I'm not suggesting that an engineer who submits a pull request should be clicking "Approve" on their own pull request. What I am suggesting is that before an engineer submits a pull request, it should only be after they have examined their own code to the degree that a reviewer would.

I can't tell you how many times over my career that I've reviewed a pull request where it was clear that the submitter had not even looked at what they were submitting for review. You're not ready to submit a pull request until you've looked line-by-line at the diff you're about to submit and pre-corrected every issue you can find. We're not talking about architectural judgment calls here, we're talking about misspelling method names, pushing blank files, including huge sections of commented-out code from various abandoned local experiments.

It should be rare that a reviewer needs to point out an obvious mistake in a review. I personally feel embarrassed when a reviewer points out something that I know I could have found myself. And I feel doubly embarrassed if it's a mistake they've pointed out multiple times.

Automate, Automate, Automate

Take advantage of every opportunity to automate the tasks associated with code review. We have linters, pre-commit hooks in version control systems, IDE integrations. There are ways to automate basically any tedious, repetitive aspects of code review. If reviewers are repeatedly finding the same kinds of mistakes, it's time to automate the checks for those mistakes, so submitters can't submit them in the first place.

Back-and-forth intra-sprint cycles are what kill sprints. Code review is a cycle, but what can we do to get down to one iteration per cycle as often as possible?

The tension between any kind of quality assurance (including code review) and sprint-based methodologies is impossible to erase, but we have techniques to make it less tense.

Scrum Tensions: Manual QA

After working for many years in the software industry, almost always using a methodology resembling Scrum, there are certain tensions that I've come to believe have no good solution--at least--I've never seen them solved elegantly.

One of those tensions comes from manual quality assurance. When sprint backlog items need to be manually tested by dedicated QA people, there is simply no clean way to do it in my experience.

Here are some of the issues that I've seen repeatedly:

  • QA does not have time to test items where the development work is completed late in the sprint
  • QA people are sitting idle for the first few days of the sprint with no completed development work to test yet
  • If a bug is found, there is no time in the current sprint to fix it

Here are some attempted remedies that I've seen:

  • Let's build some slack into the sprint for the developers so they can complete their development work with X days to spare at the end to allow time for QA
  • Let's have the developers "work ahead" of the QA people, as in, the development work that we say is "done" within one sprint is actually tested in the next sprint

And here's where those remedies fall down:

  • Inevitably, there is pressure to work more items into successive sprints, and development work starts creeping over the false "deadline"
  • We end each sprint not knowing what is actually "done", and any bugs that are found have probably already been merged into whatever common branch the developers work from

QA people are in a really tough position as they are always sort of passively pressured to not "hold up" the sprint. They know that everyone wants to see a clean sprint where every item is signed off on by QA by the last day. And the double whammy is that they get heat when bugs are found in production.

This is one of those essential tensions in Scrum-based development that I've come to accept is not resolvable. I have never seen manual QA fit neatly into Scrum. 

As I see it, there are two options to break the tension:

  • Do manual QA, and abandon sprint-based methodologies (try Kanban, for example)
  • Use a sprint-based methodology, and avoid manual QA (some teams fully automate testing)
In my experience, organizations do neither of the above, and the tension continues.

Continuity of Leadership

There's a Kafkaesque situation that develops in companies that cannot retain senior employees. A team feels a sense of urgency about shipping software, but no one can tell them what to build.

Engineers are responsible for making the products that sustain the company, but there's no one around who can say what the products should do exactly.

It's like a movie where the protagonist wakes up every morning with no knowledge of what happened the day before, and tries to piece together his identity from artifacts scattered about.

Big organizational initiatives fail repeatedly because there's no one around from the beginning to the end of the initiative. People working on the initiative find that they can't even remember why the initiative was necessary. The leader who championed the initiative isn't around to take credit for its completion, so there's no one working on the initiative at any given time who cares about its completion.

Cross-team relationships barely exist because whoever is leading one team at the moment doesn't know the leaders of the other teams and has no history with them.

Leaders are continually "backfilling" for other leaders that left, being asked questions that they can't answer.

Morale stays at a low simmer. People show up to work each Monday morning knowing they won’t be productive. Another week where they won’t know if they did good work or not, because no one can define the goal.

An organization without continuity of leadership is an organization with amnesia.

Daring to Care

The most effective technical leader I ever worked with had a track record for coming onto a project and whipping it into shape. His ideas were not groundbreaking. He was not a genius engineer. He was a smart guy, but not necessarily the smartest guy in the room. He wasn't an expert politician, or a charmer. His superpower was that he simply cared more than anyone else around him about the project's success, and he would not back down when implementing improvements. When he noticed an area for improvement, he just did whatever it took to fix it. He would calmly but with absolute persistence run through his argument with whoever he needed to convince that it was the right thing to do. It didn't matter how many people he had to convince, how high up they were, how difficult it was to convince them. He just wouldn't stop if he knew he was right. No improvement was too small or too big. I remember him saying to me once that he didn't understand why people around him would acknowledge obvious problems, but not fix them themselves. What I didn't say, but was thinking silently, was "because no one else here cares as much as you do."

There's risk that comes with caring about something. If I take the initiative to fix the slow build process, then I'm now responsible for it. If I break something, everyone's going to look at me. I might have to talk to the Infrastructure team. Jeez, those guys take so long to get back to you. I'll have to open tickets. I might have to bother people I don't know well to make my task their priority. The build process works now, right? Yeah, it takes longer than it probably needs to, but it works. Do I really care enough to take all this on?

The individuals who rise to the top aren't always the smartest, the most creative, the most charming. They just give a damn. They care more than the people around them. When someone truly cares, they will find no shortage of problems to solve around them. They will find solutions. They will push through objections. They will argue with people. They will take on responsibility for things they don't strictly need to, things no one asked them to take responsibility for.

The open question is: How do you get someone to care? Why do some people care so much more about a project than those around them? Those people are worth their weight in gold.

Movin' Tickets

Recently I was re-reading Joel Spolsky's classic blog post The Joel Test: 12 Steps to Better Code. I hadn't read that post in many years. Although a lot of the advice in that post seems almost quaint now, as many of the practices it encourages are ubiquitous and taken for granted in 2024 (Joel wrote that post in 2000), there is one passage that seems just as fresh as ever to me...

...project managers had been so insistent on keeping to the “schedule” that programmers simply rushed through the coding process, writing extremely bad code, because the bug fixing phase was not a part of the formal schedule. There was no attempt to keep the bug-count down. Quite the opposite. The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote “return 12;” and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs.

One of my frustrations with Scrum, or with many teams who say they are "doing Scrum" is the obsession with the Sprint. Teams develop a short-term mindset, where all things begin and end within a two-week period.

We have some sort of "Board" where all the tickets (or PBIs, or cards, or whatever you call them) are shown in one of several different columns each representing a "status" of the ticket. A ticket starts in the far-left column on the first day of the sprint, and by the last day of the sprint, it must be in the last column. That's how we know we had a "good sprint".

Obviously the tickets on the board are just an abstraction representing work. But it's easy to get obsessed with this abstraction. Instead of making our users' lives easier and adding valuable features to the product they use, we're just moving tickets across a virtual board, sprint after sprint.

I always think it's fascinating to hear the language people use to talk about a team's work. In standup, people will say they plan to "have that ticket moved over" today. The team's manager might talk about how "good the board looks" today. In a retrospective meeting at the end of a sprint, the team might talk positively about how quickly tickets were "moving across the board" during that sprint.

The people on the team actually doing the work know they're doing well when they've moved a ticket from one column to a column to the right of that column. This is what they optimize for: efficient ticket-moving.

The necessary work of software engineering that doesn't have a ticket on the board feels downward pressure. A thorough code review for one ticket takes ticket-moving time away from the reviewer. If there are issues to be corrected, then the ticket being reviewed is stalled in its own rightward journey.

QA people on the team are in a difficult position of doing their quality assurance on tickets that are just to the left of the ticket's final destination--the place we all want it to be.

The whole team is incentivized to make sure all the tickets on the board are in the right-most column on the final day of the sprint. As in Joel's anecdote above, bugs found later merely become new tickets to move from left-to-right in a future sprint. Long-term concerns like sound architecture don't have a place on the board. 

When a team judges its effectiveness based on the movement of virtual tickets from one status to another, it can lose sight of the big picture. Who is ultimately benefiting from these ticket movements? Why are we moving them exactly? Where do they come from?

I think it's important that teams talk about their work, at least occasionally, without mentioning tickets. What are we accomplishing at a higher level? What are our users saying about our work? How is the business that pays our salaries benefiting from our work?

Surely we're not just movin' tickets.

Vision

Vision is so important in software development. Without the engineers understanding the overall vision, they can't resolve ambiguity in their daily work without consulting someone who holds the vision.

If the engineers don't understand why they're doing any of these things, then they can't fill in the gaps logically. They can't suggest improvements, improvise, or have confidence that they're moving the organization closer to the vision. Teammates talk past each other. One person has more of the vision than another, but doesn't know that. Misunderstandings are common. The track being laid from each end doesn't meet up in the middle.

Every little bit of vision transmission compounds in value. The decisions we make today form the foundation for work that comes later. A misunderstanding in vision today requires re-work tomorrow, a week from now, a month from now.

One of the things that can get left behind in the just-in-time fashion of Agile sprints is that the team can get lost in the weeds. We have to remember that we're building toward a significant milestone of some kind for the business, not just a random sequence of tasks.

It's difficult when backlog items are being entered by one person or a small group of people separate from the engineers and QAs who will be actually building and testing the stuff. They get queued up and drip-fed every two weeks to the broader team. But often there's no shared context transmitted to the whole team about what broad goal we're doing all these tasks for.

Sprint goals are tricky to set. But if a team can't ever seem to define a clear sprint goal, that's almost certainly a sign that the team is lacking a vision.

People like to make fun of heavyweight methodologies like SAFe, but doing a Program Increment Planning event--although tedious--sure does get everyone on the same page with a shared vision for the next few months of work.

Most of us are not working on the Manhattan Project--the end goal should not be obscured from the individuals making the parts. In fact, the end goal should be so clear to everyone that any person on the team could describe it clearly in their own words.

A Standup Free of Should, Probably, and Hopefully

It's interesting to listen to the word choices that people use in the daily standup.

"I should be done with that today."

"I'll probably be done with that today."

"Hopefully I'll be done today."

"I'll try to wrap that up today."

I like to take a mental note of the "shoulds", "probablies", and "hopefullies", and see if the next day that work was truly finished.

There are serial offenders on every team. If a should one morning comes back with another should the next morning, you're really worried.

Why do people feel the need to give these hopeful yet indefinite pseudo-pronouncements about their progress? Is it natural optimism, a people-pleaser temperament, willful deceit?

More importantly, what is the temperature in the room where people feel more inclined to give optimistic projections over more realistic ones? Hopeful wishes over definitive statements?

It could be...

  • The person does not have a good understanding of the goal of their assigned work, so they don't have a good idea of what it will look like to be "done" with it.
  • They're operating within an environment where people routinely make weak promises and exaggerations of progress, so that seems like a normal thing to do.
  • They're operating in a chaotic environment, where it's hard to predict how much focused time they'll get on any given task on a given day.
  • They know they're not being given enough time or resources to complete work in a politically acceptable timeframe, but it's also not politically acceptable to just say that, so their best option is a hopeful statement about the timeframe in which they intend to deliver it.

But, hey, sometimes people are just inexperienced in the kind of task they've been assigned, and so they can't see the road ahead of them and the milestones along the way. That will absolutely make it hard for them to predict when they'll reach the finish line. Fair enough!

But...I think the "should", "probably", and "hopefully" indicate something else than inexperience. They indicate a foresight about the task ahead and a suspicion that they don't feel comfortable stating.

What is making it hard for people to tell the truth in this environment? We need optimists in software, but too many unchallenged "shoulds", "probablies", and "hopefullies" in the room is a sign of trouble.

How Many Spikes Is Too Many?

Spike is fun to say. Spike! For the unfamiliar, the spike is a concept from Agile methodologies that means a time-boxed backlog item where the end result is learning, rather than delivered software.

Mike Cohn from Mountain Goat Software offers this example:

As an example of a spike, suppose a team is trying to decide between competing design approaches. The product owner may decide to use a spike to invest another 40 (or 4 or 400) hours into the investigation. Or the development team may be making a build vs. buy decision involving a new component. Their Scrum Master might suggest that a good first step toward making that decision would be a spike into the different options available for purchase, their features, and their costs.

Because spikes are time-boxed, the investment is fixed. After the predetermined number of hours, a decision is made. But that decision may be to invest more hours in gaining more knowledge.

I've worked on teams where the process was spike-heavy. We'd commonly have backlog items within most sprints that were dedicated to learning about a topic that we knew would be important for future work. We had features we wanted to get into the software, but we didn't have a good idea of how we were going to accomplish that work on a technical level. For teams that put a big emphasis on accurate estimation and minimal to no carry-over of items at the end of sprints, they want to know that a technical foundation for work is understood before its implementation is "promised" within a particular sprint.

I've also worked on a team where spikes were basically not part of the process at all. Backlog items were oriented around features the product owner wanted in the software, but they wouldn't allot an item into a sprint without a "technical approach" filled out on the item first. The "technical approaches" were usually written ahead of time by team leads or architects that did not have "on the board" responsibilities within sprints and would work on these things ahead of the rest of the team, as time allowed. Sometimes senior engineers would also work on technical approaches for future sprints if they finished their assigned items for a sprint with time to spare.

One of the downsides of a spike-heavy process is that you're reducing the amount of "business value" delivered at the end of a sprint. If your entire sprint was consumed by spikes, I don't think the business would be very happy. On the flipside, learning has to happen somewhere. Whether you call it a spike, refinement, technical approaches, or anything else, the learning must happen.

I wouldn't necessarily say that the spike is an anti-pattern, but if it feels they're being leaned on too heavily, it might be time to stop and ask why they're necessary. Is it because we're shifting decision making to the engineers about requirements that a business analyst or product owner should be making? Are we not dedicating enough time to refinement? Is the product owner spread too thin? Is the development team stacked with junior engineers or lacking in engineers that are experienced with the technology at hand?

Learning has to happen somewhere—that's the nature of software engineering. But an over-reliance on spikes for decision making can indicate deeper organizational issues.