Sprints Balance Consistent Delivery With Optimal Value

Value, value, getting value to users. Work on the most valuable thing, always. You could probably summarize the Agile philosophy as "deliver value to end users faster." What's the next most valuable thing we could work on? Get it done, and get it out to production.

With a sprint cadence, we commit to a strategy for two weeks, we work as a team through the items in our sprint backlog, and we retrospect at the end to talk about how we did. There's always a chance that business circumstances change mid-sprint, shifting the team's idea about what is most valuable to work on in the moment. But since sprints balance consistent delivery with optimal value, we keep our heads down and finish the chunk of work we agreed to deliver at the start of the sprint.

This requires leaders and engineers to get used to telling people outside the team, "That's a great idea! We'll put it in the product backlog. We can start working on it less than two weeks from today!"

As arbitrary as the sprint cadence can seem, I see the sprint as a delivery protector--a built-in reason to say "not right now." The items we're working on today may only be 80% of optimal value, but we're going to finish them, dammit. In no more than two weeks we can shift our strategy 180 degrees if necessary, but not today.

Sprints Need a Cooldown

Teams vary in the amount of handwringing their leaders do about sprints that aren't perfectly rightsized. What if we bit off too much work at the start of the sprint and can't finish it all by the end? What if we overestimated the work and didn't fit in as many tickets as we could have?

I've worked with teams where having work incomplete on the last day of the sprint was no big deal, it happened regularly, and no one was too concerned about it. And I've worked with teams where the team committing to a certain scope of work was sacred, and tickets carrying over was a grave error that we fretted over and vowed not to repeat.

The teams that prefer the overstuffed sprint are trying to maximize velocity and minimize team members' downtime. Teams that treat the sprint timebox as sacred are trying to maximize predictability, often striving for a kind of "standardization" across teams. Some teams will even wonder why they're bothering with these stupid sprint things and opt instead for a methodology like Kanban, where people just grab the next work item when they finish the previous one, and toss out the timebox entirely.

I have always believed strongly that whether you're using Scrum or Kanban, working in discrete timeboxes or not, team members need a regular cooldown. Just as an athlete cannot sprint indefinitely and must rest, I believe the same is true of software engineering teams.

As you might imagine, this puts me in the camp of "fill the sprint conservatively so that we don't go over." This, of course, means that many sprints will end without the theoretical maximum amount of trackable work items packed into them. And to that I say: Good. That's the point.

I believe that implicit in the social contract of teams is this: If you want people to sprint, then they must be allowed to cool down. That's the deal. Continuous sprinting means burnout. 

Quoting a previous post of mine, here are a few things engineers need to do that are not represented by tickets in a sprint backlog, and are perfect for filling the cooldown period at the end of a sprint, at their discretion:

  • Training
  • Preparing for a presentation
  • Proof of concept / demo of an intriguing tool, framework, technology
  • Updating documentation that's been bugging you
  • Spikes into performance improvements
  • Cleaning out your inbox
  • For companies that do some sort of periodic "goal setting" for each employee, let people work on goals from their list

If we set a strong sprint goal, bite off a chunk of work that the team is confident of completing, and get it all done, without rushing, a day or two before the end of our sprint, that's a good thing. The team feels a sense of accomplishment, the business gets a valuable increment of working software that fulfills a real need, and the individuals get some time to cool down and shift their attention to some low-intensity stuff that nevertheless needs doing before the next sprint starts again.

For me, the cooldown at the end of the sprint is what sustainable pace is all about. We go hard toward an important goal, and if our planning and teamwork are on point, then we know we'll have that down-shift at the end.

Don't Let an LLM Write Your Unit Tests

Writing unit tests is one of those tasks that software engineers often see as tedious. In fact, I've heard from fellow engineers that unit test generation is one of their favorite uses of LLMs. Let the AI write them! I've always been deeply uncomfortable with this use of LLMs, even though I'm all for reducing the tedious aspects of work in general.

Unit tests are not boilerplate code. If you've read a lot of bad unit tests, though, you may have gotten that impression. If your only goal in writing unit tests is “Coverage number goes up!” then by all means let an LLM write your unit tests.

But I would argue that writing great unit tests means thinking carefully about what is valuable about the system under test. An LLM is great at generating reams of code in an instant, but it will never know anything about the value of the product your codebase represents.

If our goal instead is “Let’s ensure that our product retains its value as we add more code to our codebase,” then a human who understands the value of the product should write the tests.

Unit tests have yet another critical function, which is to provide runnable documentation about the system for other engineers to read. Want to know how some part of the codebase is supposed to work? Go read the unit tests; they represent the valuable features of the software that an engineer wanted to point out.
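
To make the contrast concrete, here's a minimal sketch in Python (the apply_discount function and its discount rule are hypothetical, invented purely for illustration). Both tests cover the same lines; only one documents why the behavior matters.

    # Hypothetical system under test: 5% off per loyalty year, capped at 20%.
    def apply_discount(subtotal: float, loyalty_years: int) -> float:
        discount = min(0.05 * loyalty_years, 0.20)
        return round(subtotal * (1 - discount), 2)

    # Coverage-driven: exercises the code and pins a magic number, but explains nothing.
    def test_apply_discount():
        assert apply_discount(100.0, 2) == 90.0

    # Value-driven: the name and assertions document the rule the business
    # actually cares about--the discount must never exceed 20%.
    def test_discount_is_capped_at_twenty_percent():
        assert apply_discount(100.0, 10) == 80.0  # 50% uncapped, held to the 20% cap
        assert apply_discount(100.0, 4) == 80.0   # exactly at the cap

An LLM will happily pin down whatever the code currently does; only a human knows which behaviors are load-bearing.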

Code is written once, but read many times. And test coverage is just a number. If you want tests that matter, write them yourself.

Engineering-led Research Tickets

One of my optimistic beliefs about software engineers is that we're full of good ideas about how things around us could be improved. We're smart people, problem solvers. We read tech news, we're friends with engineers at other companies, we bring diverse career experiences to each job--in other words, we're at least vaguely aware of better ways of doing things that are happening in other places.

If we want to go beyond just talking about better ways of doing things, we need to empower engineers to capture their ideas for improvement in the backlog right next to ideas from the product management side. It's critical, of course, that items in the backlog generated by engineers are refined, estimated, and actually included in sprints on a regular basis.

And we want to go beyond the all-too-easy step of adding a one-line ticket at the end of our backlog that says “do cool thing X”. Since no one has the foggiest idea how we’ll actually do that cool thing, the backlog item just sits there, and no one ever puts it into a sprint because it’s too nebulous and scary and we don’t know how long it will take or whether it’s even possible in a sprint’s worth of time.

Make that “do cool thing X” a research ticket. Time-box the effort. We can put this into a future sprint because at the end of the time box (fitting within one sprint) we will have a documented outcome that gets us closer to the goal than we were before, and we’ll have a much better idea of the actual steps to get there with "real" tickets representing the work.

The output of a research ticket can be a comment saying we shouldn’t do this, and here’s why. Or the output could be a series of new tickets the engineer creates, one for each step of the process toward accomplishing the ultimate goal...or a nice intermediate goal.

Even if we end up looking at the list of steps for follow-on work and think, “Geez! That’s a lot of damn work to get the outcome we want!” and we decide it’s not worth the effort, that's still a great outcome.


Who Writes the Backlog?

The Product Owner has the final say in backlog prioritization. But who writes the backlog? Is it always the product owner?

My experience with Agile has been that, in the real world, the backlog contains two types of items:

  1. The "user stories" that discuss functionality from the perspective of an end user
  2. Technical stories--including things like technical debt, maintenance, package upgrades, etc.--things that don't represent the user goals a product owner would be focusing on.

Both types of items are important and need to be worked on. 

You can get into trouble in a couple ways:

  1. Every item in the backlog is forced into a "user story" template ("As a...", "I want to...", "So that...").
  2. The product owner is writing backlog items that are of the "technical story" variety since they think they need to write every story in the backlog.

If you have a product owner who has consistent time to write tickets in Jira (or whatever issue tracker you have), you're pretty lucky. My experience with actually-existing-Agile is that the Product Owner is typically someone in middle management who is juggling multiple jobs, "product owner" of this product being one more, and certainly does not have the time to dedicate to backlog management.

Occasionally you get an overactive product owner who tries to capture every task the development team is working on as backlog items, including purely technical tasks that are better written up by engineers or engineering managers/leads.

My opinion is that anyone on the team (engineers, QA, managers, product owner) should be empowered to write backlog items. When the nature of the backlog item is technical, let the technical people write it. When the backlog item is a "user story", the product owner should write the item, or at least delegate the requirements to whoever writes it (probably not an engineer).

Backlogs are just another tool in the tool chest of software engineering. As long as the people doing the work are clear on what the goal of the item is, and the acceptance criteria for its completion, then anyone can write them in whatever format they want.

Scrum Tensions: Code Review

There are tensions in Scrum anywhere you have intra-sprint cycles that must resolve by the end of the sprint. In my last post I wrote about manual quality assurance and the tension it causes in a Scrum setting. Another of these intra-sprint cycles that most Scrum teams include in their process is code review.

You could, in a way, consider code review to be another form of "QA". In both cases we're initiating a cycle-within-a-sprint with a loop of approval, rejection, re-work, and acceptance.

In the sense that Scrum teams strive to get every sprint backlog item to "done" by the end of the sprint, back-and-forth cycles kill sprints. But they're also essential. Therein lies the tension.

Most Scrum teams conduct estimation sessions prior to starting a sprint. The estimates that a team assigns to product backlog items are typically based on a gut feel of how long it will take or how complicated it will be to "complete" a backlog item. In most teams I've worked with, that estimate is implied to mean how long it will take a software engineer on the team to submit a code review (via a pull request or something similar), i.e., when the engineer thinks their part is "done". Some teams also build into the estimate the duration or complexity of the work required from the QA people.

What estimates don't ever capture, in my experience, is how much effort/time/complexity is required to complete the code review or QA cycles on a backlog item. How can we know ahead of time how many pieces of feedback are going to be left on a pull request? How many of those will necessitate re-work on the item before the end of the sprint? How long will the re-work take? What if the re-work is still not up to snuff? Etc., etc., etc. Each code review starts a cycle of unknown duration.

And all the while everyone who cares about the completion of the sprint is tap-tap-tapping their fingers, nervously wondering if these intra-sprint cycles are going to complete on time.

Even though people mean well, what happens in practice is that thorough code review is tacitly disincentivized. Most product owners are not engineers, and although they understand intuitively that code review, like any form of quality assurance, is necessary, it also adds delay to the immediate "done-ness" of work.

Engineers also know that code reviews are important, but there is always pressure to get through them faster. When bugs crop up later, or the codebase slowly accumulates technical debt, no one in management is going to track down the specific pull request that introduced the problem, read the approvers list, and assign blame to those individuals. A swift click on the "Approve" button satisfies everyone in the moment.

What do we do about this tension? We can't have code review without introducing an unpredictable amount of delay to each sprint backlog item. You can never fully resolve this tension, in my opinion. But there are some ways to ease the tension.

Be Your Own Reviewer

I'm going to put this one on the engineers. And, no, I'm not suggesting that an engineer who submits a pull request should be clicking "Approve" on their own pull request. What I am suggesting is that before an engineer submits a pull request, it should only be after they have examined their own code to the degree that a reviewer would.

I can't tell you how many times over my career I've reviewed a pull request where it was clear that the submitter had not even looked at what they were submitting for review. You're not ready to submit a pull request until you've looked line-by-line at the diff you're about to submit and pre-corrected every issue you can find. We're not talking about architectural judgment calls here; we're talking about misspelled method names, blank files pushed by mistake, huge sections of commented-out code from various abandoned local experiments.

It should be rare that a reviewer needs to point out an obvious mistake in a review. I personally feel embarrassed when a reviewer points out something that I know I could have found myself. And I feel doubly embarrassed if it's a mistake they've pointed out multiple times.

Automate, Automate, Automate

Take advantage of every opportunity to automate the tasks associated with code review. We have linters, pre-commit hooks in version control systems, IDE integrations. There are ways to automate basically any tedious, repetitive aspects of code review. If reviewers are repeatedly finding the same kinds of mistakes, it's time to automate the checks for those mistakes, so submitters can't submit them in the first place.
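
As one minimal sketch of the idea (the specific check, and the use of a raw Git hook rather than a framework like pre-commit, are just assumptions for illustration), a small script can reject a recurring nitpick before a reviewer ever sees it:

    #!/usr/bin/env python3
    # Hypothetical Git pre-commit hook: save as .git/hooks/pre-commit (and make
    # it executable), then adapt the checks to whatever reviewers keep finding.
    import subprocess
    import sys

    # Files staged for this commit (Added/Copied/Modified only, so deletions are skipped).
    staged = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout.split()

    problems = []
    for path in staged:
        if not path.endswith(".py"):
            continue
        with open(path, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                if line.rstrip("\n") != line.rstrip():
                    problems.append(f"{path}:{lineno}: trailing whitespace")

    if problems:
        print("Commit blocked by pre-commit check:")
        print("\n".join(problems))
        sys.exit(1)  # a nonzero exit status aborts the commit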

Back-and-forth intra-sprint cycles are what kill sprints. Code review is a cycle, but what can we do to get down to one iteration per cycle as often as possible?

The tension between any kind of quality assurance (including code review) and sprint-based methodologies is impossible to erase, but we have techniques to make it less tense.

Scrum Tensions: Manual QA

After working for many years in the software industry, almost always using a methodology resembling Scrum, there are certain tensions that I've come to believe have no good solution--at least, I've never seen them solved elegantly.

One of those tensions comes from manual quality assurance. When sprint backlog items need to be manually tested by dedicated QA people, there is simply no clean way to do it in my experience.

Here are some of the issues that I've seen repeatedly:

  • QA does not have time to test items where the development work is completed late in the sprint
  • QA people are sitting idle for the first few days of the sprint with no completed development work to test yet
  • If a bug is found, there is no time in the current sprint to fix it

Here are some attempted remedies that I've seen:

  • Let's build some slack into the sprint for the developers so they can complete their development work with X days to spare at the end to allow time for QA
  • Let's have the developers "work ahead" of the QA people, as in, the development work that we say is "done" within one sprint is actually tested in the next sprint

And here's where those remedies fall down:

  • Inevitably, there is pressure to work more items into successive sprints, and development work starts creeping over the false "deadline"
  • We end each sprint not knowing what is actually "done", and any bugs that are found have probably already been merged into whatever common branch the developers work from

QA people are in a really tough position as they are always sort of passively pressured to not "hold up" the sprint. They know that everyone wants to see a clean sprint where every item is signed off on by QA by the last day. And the double whammy is that they get heat when bugs are found in production.

This is one of those essential tensions in Scrum-based development that I've come to accept is not resolvable. I have never seen manual QA fit neatly into Scrum. 

As I see it, there are two options to break the tension:

  • Do manual QA, and abandon sprint-based methodologies (try Kanban, for example)
  • Use a sprint-based methodology, and avoid manual QA (some teams fully automate testing)

In my experience, organizations do neither of the above, and the tension continues.

Movin' Tickets

Recently I was re-reading Joel Spolsky's classic blog post The Joel Test: 12 Steps to Better Code. I hadn't read that post in many years. Although a lot of the advice in that post seems almost quaint now, as many of the practices it encourages are ubiquitous and taken for granted in 2024 (Joel wrote that post in 2000), there is one passage that seems just as fresh as ever to me...

...project managers had been so insistent on keeping to the “schedule” that programmers simply rushed through the coding process, writing extremely bad code, because the bug fixing phase was not a part of the formal schedule. There was no attempt to keep the bug-count down. Quite the opposite. The story goes that one programmer, who had to write the code to calculate the height of a line of text, simply wrote “return 12;” and waited for the bug report to come in about how his function is not always correct. The schedule was merely a checklist of features waiting to be turned into bugs.

One of my frustrations with Scrum--or with many teams who say they are "doing Scrum"--is the obsession with the Sprint. Teams develop a short-term mindset, where all things begin and end within a two-week period.

We have some sort of "Board" where all the tickets (or PBIs, or cards, or whatever you call them) are shown in one of several different columns each representing a "status" of the ticket. A ticket starts in the far-left column on the first day of the sprint, and by the last day of the sprint, it must be in the last column. That's how we know we had a "good sprint".

Obviously the tickets on the board are just an abstraction representing work. But it's easy to get obsessed with this abstraction. Instead of making our users' lives easier and adding valuable features to the product they use, we're just moving tickets across a virtual board, sprint after sprint.

I always think it's fascinating to hear the language people use to talk about a team's work. In standup, people will say they plan to "have that ticket moved over" today. The team's manager might talk about how "good the board looks" today. In a retrospective meeting at the end of a sprint, the team might talk positively about how quickly tickets were "moving across the board" during that sprint.

The people on the team actually doing the work know they're doing well when they've moved a ticket from one column to the column on its right. This is what they optimize for: efficient ticket-moving.

The necessary work of software engineering that doesn't have a ticket on the board comes under downward pressure. A thorough code review of one ticket takes ticket-moving time away from the reviewer. If there are issues to be corrected, then the ticket being reviewed is stalled in its own rightward journey.

QA people on the team are in a difficult position of doing their quality assurance on tickets that are just to the left of the ticket's final destination--the place we all want it to be.

The whole team is incentivized to make sure all the tickets on the board are in the right-most column on the final day of the sprint. As in Joel's anecdote above, bugs found later merely become new tickets to move from left-to-right in a future sprint. Long-term concerns like sound architecture don't have a place on the board. 

When a team judges its effectiveness based on the movement of virtual tickets from one status to another, it can lose sight of the big picture. Who is ultimately benefiting from these ticket movements? Why are we moving them exactly? Where do they come from?

I think it's important that teams talk about their work, at least occasionally, without mentioning tickets. What are we accomplishing at a higher level? What are our users saying about our work? How is the business that pays our salaries benefiting from our work?

Surely we're not just movin' tickets.

Vision

Vision is so important in software development. Without the engineers understanding the overall vision, they can't resolve ambiguity in their daily work without consulting someone who holds the vision.

If the engineers don't understand why they're doing any of these things, then they can't fill in the gaps logically. They can't suggest improvements, improvise, or have confidence that they're moving the organization closer to the vision. Teammates talk past each other. One person has more of the vision than another, but doesn't know that. Misunderstandings are common. The track being laid from each end doesn't meet up in the middle.

Every little bit of vision transmission compounds in value. The decisions we make today form the foundation for work that comes later. A misunderstanding in vision today requires re-work tomorrow, a week from now, a month from now.

One of the hazards of the just-in-time fashion of Agile sprints is that the team can get lost in the weeds. We have to remember that we're building toward a significant milestone of some kind for the business, not just working through a random sequence of tasks.

It's difficult when backlog items are being entered by one person, or a small group of people, separate from the engineers and QAs who will actually be building and testing the stuff. The items get queued up and drip-fed to the broader team every two weeks. But often there's no shared context transmitted to the whole team about what broad goal all these tasks are serving.

Sprint goals are tricky to set. But if a team can't ever seem to define a clear sprint goal, that's almost certainly a sign that the team is lacking a vision.

People like to make fun of heavyweight methodologies like SAFe, but doing a Program Increment Planning event--although tedious--sure does get everyone on the same page with a shared vision for the next few months of work.

Most of us are not working on the Manhattan Project--the end goal should not be obscured from the individuals making the parts. In fact, the end goal should be so clear to everyone that any person on the team could describe it clearly in their own words.

A Standup Free of Should, Probably, and Hopefully

It's interesting to listen to the word choices that people use in the daily standup.

"I should be done with that today."

"I'll probably be done with that today."

"Hopefully I'll be done today."

"I'll try to wrap that up today."

I like to take a mental note of the "shoulds", "probablies", and "hopefullies", and see if the next day that work was truly finished.

There are serial offenders on every team. If a should one morning comes back with another should the next morning, that's when you really start to worry.

Why do people feel the need to give these hopeful yet indefinite pseudo-pronouncements about their progress? Is it natural optimism, a people-pleaser temperament, willful deceit?

More importantly, what is the temperature in the room where people feel more inclined to give optimistic projections over more realistic ones? Hopeful wishes over definitive statements?

It could be...

  • The person does not have a good understanding of the goal of their assigned work, so they don't have a good idea of what it will look like to be "done" with it.
  • They're operating within an environment where people routinely make weak promises and exaggerations of progress, so that seems like a normal thing to do.
  • They're operating in a chaotic environment, where it's hard to predict how much focused time they'll get on any given task on a given day.
  • They know they're not being given enough time or resources to complete work in a politically acceptable timeframe, but it's also not politically acceptable to just say that, so their best option is a hopeful statement about the timeframe in which they intend to deliver it.

But, hey, sometimes people are just inexperienced in the kind of task they've been assigned, and so they can't see the road ahead of them and the milestones along the way. That will absolutely make it hard for them to predict when they'll reach the finish line. Fair enough!

But...I think the "should", "probably", and "hopefully" indicate something other than inexperience. They indicate foresight about the task ahead, and a suspicion the speaker doesn't feel comfortable stating.

What is making it hard for people to tell the truth in this environment? We need optimists in software, but too many unchallenged "shoulds", "probablies", and "hopefullies" in the room is a sign of trouble.

How Many Spikes Is Too Many?

Spike is fun to say. Spike! For the unfamiliar, the spike is a concept from Agile methodologies: a time-boxed backlog item where the end result is learning, rather than delivered software.

Mike Cohn from Mountain Goat Software offers this example:

As an example of a spike, suppose a team is trying to decide between competing design approaches. The product owner may decide to use a spike to invest another 40 (or 4 or 400) hours into the investigation. Or the development team may be making a build vs. buy decision involving a new component. Their Scrum Master might suggest that a good first step toward making that decision would be a spike into the different options available for purchase, their features, and their costs.

Because spikes are time-boxed, the investment is fixed. After the predetermined number of hours, a decision is made. But that decision may be to invest more hours in gaining more knowledge.

I've worked on teams where the process was spike-heavy. We'd commonly have backlog items within most sprints that were dedicated to learning about a topic that we knew would be important for future work. We had features we wanted to get into the software, but we didn't have a good idea of how we were going to accomplish that work on a technical level. Teams that put a big emphasis on accurate estimation and minimal to no carry-over of items at the end of sprints want to know that a technical foundation for work is understood before its implementation is "promised" within a particular sprint.

I've also worked on a team where spikes were basically not part of the process at all. Backlog items were oriented around features the product owner wanted in the software, but the team wouldn't put an item into a sprint without a "technical approach" filled out on it first. The "technical approaches" were usually written ahead of time by team leads or architects who did not have "on the board" responsibilities within sprints and would work on these things ahead of the rest of the team, as time allowed. Sometimes senior engineers would also work on technical approaches for future sprints if they finished their assigned items for a sprint with time to spare.

One of the downsides of a spike-heavy process is that you're reducing the amount of "business value" delivered at the end of a sprint. If your entire sprint was consumed by spikes, I don't think the business would be very happy. On the flipside, learning has to happen somewhere. Whether you call it a spike, refinement, technical approaches, or anything else, the learning must happen.

I wouldn't necessarily say that the spike is an anti-pattern, but if it feels like they're being leaned on too heavily, it might be time to stop and ask why they're necessary. Is it because we're shifting requirements decisions to the engineers that a business analyst or product owner should be making? Are we not dedicating enough time to refinement? Is the product owner spread too thin? Is the development team stacked with junior engineers, or lacking engineers experienced with the technology at hand?

Learning has to happen somewhere—that's the nature of software engineering. But an over-reliance on spikes for decision making can indicate deeper organizational issues.

Escaping the Bikeshed

In 2017 I wrote a post called Don't Trap Your Clients in the Bikeshed. That post was about avoiding the trap of seeking client feedback on trivial decisions before it's necessary.

Sometimes in software development a group of stakeholders will become very suddenly interested in a particular aspect of the software, and intense debate will ensue, with a flurry of changes discussed in an area that everyone has an opinion about. On these occasions, as an engineer, your clients have put you in the bikeshed.

What do you do as an engineer when you find yourself in the bikeshed?

Slow Down

The pressure to immediately address every opinion can be daunting. Try not to get caught up in the whirlwind. Let the group debate while you maintain a healthy distance.

Pause for Documentation

Gently remind people that in order to get features into software, they need to write down clearly what they want. They have a responsibility to be clear about what they're asking for. That's the bargain.

Invoke the Process

QAs need to know what to test. A larger audience needs to know what changes are making it into the software. Customers need to know what's on the horizon. Releases need to be organized. Changes in one part of the codebase impact other areas.

Your engineering team has an established process for making and releasing your software. In the fervor to change the bikeshed, interested parties get excited to see their opinions realized. How long does it take to paint the thing blue? It's clearly not blue right now, and I want it to be blue.

Remind people why the process exists, and the drawbacks of making changes to a software system that don't follow the same process as other changes.

Expose the Bones

Sometimes it's really an issue of transparency. Non-engineers get frustrated that they don't understand why a feature is behaving the way it is. It can help to "expose the bones" as it were. Make a report that anyone can see that shows key metrics (how many requests, how long is it taking, who's using it, etc.). It can help to surface the internals of a feature. Make a page that shows diagnostic information about the input to a feature, how the feature calculated a result, and what the "raw" result is.
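
Here's a minimal sketch of what such a page could return (Flask, the route name, and every number shown are assumptions invented for illustration, not any real system):

    # Hypothetical diagnostics endpoint: surfaces a feature's key metrics and an
    # example calculation so non-engineers can see the internals for themselves.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/diagnostics/widget-feature")
    def widget_feature_diagnostics():
        # In a real system these values would come from logs or a metrics store;
        # they're hardcoded here to keep the sketch self-contained.
        return jsonify({
            "requests_last_24h": 1824,
            "avg_response_ms": 143,
            "top_users": ["acme-corp", "globex", "initech"],
            "example_calculation": {
                "input": {"subtotal": 100.0, "discount_rate": 0.2},
                "raw_result": 80.0,
            },
        })

    if __name__ == "__main__":
        app.run(port=5000)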

Remove Engineering From the Loop

Ultimately the goal is to remove engineering from the debate. If there's an intense debate about the color of the bikeshed, engineering can build a configuration option into the software where non-engineers can change the color at will. Try to distill the points of contention down to self-service features that don't require engineers to make code changes to the system.
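
In code terms, the goal is something like this tiny sketch (the file name and setting are hypothetical): the contested value lives in configuration that non-engineers can edit, so changing it requires no engineer and no release.

    # Hypothetical self-service setting: non-engineers change settings.json
    # directly (or via an admin UI), and the software picks up the new value.
    import json

    def get_bikeshed_color(config_path: str = "settings.json") -> str:
        with open(config_path) as f:
            return json.load(f).get("bikeshed_color", "green")  # fallback default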

Tell Us Why We’re Doing This

I’m constantly shocked that business people don’t make much of an effort to communicate to engineers the impact of their work. It’s par for the course that the engineers don’t know how many users their product has, how many clients they have, how much revenue the product makes, etc. It’s so common, in fact, that I sometimes wonder if it’s intentional.

There is an Agile concept known as the information radiator. Some companies will put dedicated big screen TVs throughout their offices that show key metrics at a glance. If you have a remote-first culture, a television isn't going to do much good, but a web-based dashboard placed somewhere the whole team goes every day (like your issue tracker) is a great substitute. 

These ideas are more about passive awareness, but I think the next step up is active discussion. If you're doing some kind of regular all-hands meetings, like a retrospective, that's a perfect time to pull up that dashboard as a team and discuss it together.

How do we know we did a good job over the last sprint? Yeah, we marked all of our tickets as "Done", but so what? How do we know that our customers are happy with our work? Company leaders always want to know how to get their employees more "engaged" in their work. Show your team how their day-to-day tasks impact real end users. Working a backlog of items sprint after sprint is so far removed from the impact of the work. Why are we doing these things?

How many users did we add over the last two weeks? How many new customers came on board? How many usages of [NEW FEATURE X] did we have? 

I feel like I'll be banging this drum for the rest of my career: Engineers are not robots. We want to know the high level goals of our work and how our work impacts real people.

Please, tell us why we're doing this.

Clean Pull Requests

Pull requests can be a pain to review, but there are things submitters can do to make them easier to manage. I have some opinions about what makes a pull request "clean".

Describe What You Did

First off, pull requests typically have some kind of description field that can, by default, be left blank. The description is super important! Use it to describe the theme of the PR. Obviously, it's ideal to have it linked to a ticket in your issue tracker.

I'm a fan of opportunistic refactoring, which I'd describe as cleaning up a bit when you're in the codebase doing something unrelated. This can get gnarly, though: when it comes time to submit a PR, it intermingles different contexts in a way that is confusing for reviewers. You can certainly split the work across two or more PRs, but that can be a pain if you've already pushed a bunch of intermingled commits to one branch. In those cases, I think it's fine to keep everything in one PR, but it's incumbent upon the submitter to explain where the ticket ends and the "cleaning up" begins.

Every Line Matters

There's nothing worse than going to review someone's PR and feeling like someone dumped a "pile of changes" on you with little or no explanation of why they were made. I might be particularly fussy, but I won't submit my own PRs for review until I've looked at every single line of code I changed and ensured that I could explain why I made the change if asked. It's not that I expect to be asked about every line--in fact, that would be kind of annoying--but in looking, I often discover something I didn't mean to change, or remember an idea for a quick refactoring, like a variable rename to make the code clearer, that I forgot to do.

One of my pet peeves as a reviewer is when a PR comes in that touches several files, but some of the files contain only a trivial whitespace change--like someone had been working in a file and then backed their changes out manually, leaving maybe just an extra line break behind. It's not so much that the PR's change list now includes an extra file I have to look at; it's that it makes me nervous the submitter didn't pay close attention to what they had changed in the codebase before submitting the PR. It makes me wonder what other changes they included that they didn't intend to introduce into the production codebase.

It's so easy to leave in fragments of various code experiments from when you were figuring out an approach, and then not realize that those fragments are about to escape your feature branch and enter the official history of your production codebase.

The Clean Pull Request™

If I could sum up what I want from submitters in a Clean Pull Request™ it would be:

  1. Make sure to examine each line changed in each file and ensure it's something you want to include in the production codebase.
  2. Provide context for all changes included, whether that is via a linked issue/ticket that explains the task at hand, or via a written description on the PR itself that provides the context.

Keep it clean, and we'll get through code reviews faster with fewer mistakes.

Lunch & Learns

A lot of companies have a regular practice of "Lunch & Learns"—a series where engineers within the company give a presentation to their coworkers on some technical topic in which they have an interest.

I myself have given presentations at several companies where they did this, and have sat through many presentations by other engineers. Over time, I've developed a taste for what works and what doesn't.

Set the Stage

Start with the basics: Who are you? What do you work on in the company? Why is this topic important to you?

I've watched too many of these where someone starts the presentation by screen-sharing what seems like a random code file in Visual Studio (or their code editor of choice), and just starts talking in minute detail about lines of code in the file. No context at all for: what am I looking at, what is this, why is this significant, what is this application, what does this application do, why would I care, what am I supposed to do with this information, etc.

Go Beyond Generic Content

I've sat through too many "Introduction to X" presentations. They always have generic names like "React.js" or "Test-Driven Development". Now, there's nothing wrong with introductory content, but here's the thing: the audience knows how to use YouTube. If they want a generic introduction on some technical topic, there are undoubtedly free videos available that are more authoritative and higher quality.

If the content of the lunch & learn is completely generic in a way that it could be given to any group at any company without context, then there's really no good reason to be gathering a captive audience at one particular company to watch it.

Take advantage of the unique experience you have with the topic as it pertains to the projects and domain of the internal audience. Make the generic specific.

Make It Practical

What value is your audience going to get out of this presentation? Here are some lunch & learns I like to see:

  • This Is How We Used X to Do Y in Application Z
  • How to Replace X in Your Project with Y
  • Postmortems / Lessons Learned from Project X

One of the best tech leads I've ever worked with once gave me some advice about technical presentations that I've never forgotten. He encouraged me to think about "presentation to Production". It's one thing to give an interesting presentation, or pick an interesting topic, but the next level is to think about "How are we going to put this technique/language/framework/practice into Production?"

Include the Thing

I literally wrote a blog post a couple years ago called Include the Thing about effective written communication for engineers. And it's advice that is totally applicable to presentations.

Look, no one wants to watch a presentation where someone is showing a PowerPoint deck and simply reading the slides back to you. But I think some people, especially engineers, go too far in the other direction and eschew any kind of written outline material for a lunch & learn.

It's great to "show the code", but it's next level to leave a written artifact (PowerPoint deck, Wiki page, etc.) with links to the specific internal code repositories (GitLab, Bitbucket, whatever) and individual files that you were looking at in your local code editor during your screen-share.

For lunch & learns at my current company, we do the trifecta: 1) wiki page including… 2) embedded PowerPoint deck and… 3) embedded screencast video of the presentation. Anyone coming along later who couldn't attend the lunch & learn live, or anyone who wants to refer back to the content, has all the context they need.

Let Unsustainable Things Fail

I read a post by Max Countryman recently titled Let It Fail (Hacker News discussion), where he talks about the importance of failure in software engineering organizations. He recounts an experience where his boss gave him the surprising advice to let a legacy system that was not getting enough attention fail visibly, rather than paper over the cracks.

I've written before on this blog about how heroics cover up systemic issues. Organizations become dependent on heroics. Heroics beget more heroics. Heroics are normalized. Eventually the heroes burn out.

That one long-serving, loyal employee who had become a silo of critical knowledge up and quit. That manual, error-prone process we had been using to deploy our software for way too long finally caused an outage. We've been rushing code changes right before big releases, and one finally bit us hard.

It's important to talk openly about unsustainable engineering practices and the failure they portend. But issues of technical debt are hard to weigh against new feature development and other business priorities. The value of under-the-covers work is hard to see. Sometimes you need to pull the covers away.

RIP Fred Brooks

Fred Brooks passed away this month. I first wrote about him on this blog in April 2007, right after I had finished reading his seminal book on software engineering, The Mythical Man-Month. I was about 2 years into my career as a software engineer at that point. How time flies!

I thought it might be interesting to pull out my dog-eared copy of the book and reflect on a few of my favorite concepts and how they've held up over my career, 15 years on.

The Mythical Man-Month

First off is the titular concept from the book, in which Brooks's Law is coined.

Adding manpower to a late software project makes it later.

I will say that in my career, I've only seen one case where bodies were hurled at a late project to hasten its completion. Interestingly, that was also the latest project I've ever seen in my career.

Thankfully, my experience has been that Brooks's Law has permeated amongst the managers I've worked with—if not in name, they at least intuitively understood that they couldn't just add more people to a project to get it done faster.

The Second-System Effect

I've seen this play out several times in my career. The idea that we'll replace the hacky system we developed before we knew better with one where we take our time and "get it right" this time.

As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used "next time." Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system.

This second is the most dangerous system a man ever designs.

The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one.

It's easy to forget that just because we have a lot of ideas for improvements, it doesn't mean all of them are worth doing.

No Silver Bullet

I would dare say that most software engineers who have heard of Fred Brooks found their way to him via his essay No Silver Bullet.

There is no single development, in either technology or management technique, which by itself promises even one order of magnitude improvement in productivity, in reliability, in simplicity.

Not only are there no silver bullets now in view, the very nature of software makes it unlikely that there will be any.

Personally, the message of No Silver Bullet is so ingrained in me as a software engineer that I don't consciously think about where it came from very often, although it colors my worldview constantly.

Every time I see a headline on Hacker News about Framework X vs. Framework Y, or Language A vs. Language B, I understand intuitively how little they can possibly impact the success of a software project.

When I wrote my post Stack Chasers vs. Evergreens in 2018, I don't think I was even aware that I was channeling No Silver Bullet, which I had read over a decade earlier. It was in my bones by that point that technological solutions for improving the art of software engineering are inherently, severely limited.

Brooks's treatment of essential complexity vs. accidental complexity had such an impact on me that I reference it when approaching challenges in my own life that have nothing to do with software engineering or my career at all.

The Mythical Man-Month is holding up amazingly well in 2022. I can't think of any other nearly 50-year-old book written about computer software that is still so relevant and insightful.

Thank you, Fred! R.I.P. And get a copy of The Mythical Man-Month if you haven't already read it. 

Swamped to Sustainable

I've written a lot about the concept of sustainability on this blog, and it's something I'm constantly thinking about with regard to software development.

That's why I enjoyed this recent post by Greg Kogan about being "swamped" all the time:

I used to think being swamped was a good sign. I’m doing stuff! I’m making progress! I’m important! I have an excuse to make others wait! …

Now, I’m impressed by people who are not swamped. They prioritize ruthlessly to separate what’s most important from everything else, think deeply about those most-important things, execute them well to make a big impact, do that consistently, and get others around them to do the same. Damn, that’s impressive!

Being swamped isn’t a badge of honor, it’s something to work on.

There were good comments in the discussion of the post on Hacker News, including a link to a manifesto of sorts I wasn't previously familiar with called Sustainable Development.

Sustainable Development is a set of principles for software teams to use to operate in healthy and productive ways, and for the long term.

Software teams practicing Sustainable Development follow guidelines that benefit them in three areas:

Physical: They work in ways that allow them to maintain good physical health.

Emotional: They work in an environment that supports their emotional health.

Cognitive: They work in ways that encourage creativity and support the intellectual nature of software engineering.

I could see this being used as a framing device for discussions in a sprint retrospective.

Development teams wanting to adopt Sustainable Development simply write down the practices they wish to embrace, and then commit to following them. Each practice should benefit the team in at least one of the three areas of sustainability (physical, emotional, or cognitive).

It's often hard to get people to speak productively in retros, and I feel like a good place to start is asking the group: What did we do in this last sprint that doesn't seem sustainable? What can we do to increase the sustainability of our sprints? The suggested sample practices could give the group some ideas on action items to work toward greater sustainability.

I hope every team can agree that swamped is not sustainable.

New People Write the Best Documentation

Have you ever heard that the best way to learn a subject is to teach it?

Several times, when I've joined a new team and started ramping up, I've found that the documentation is out of date and skips over important information.

We know there's always that awkward transitional phase when a new engineer joins a team that's existed for a while, because they're not immediately ready to take on "real work", even if they're an experienced engineer in general. 

This is a perfect opportunity to improve the documentation! In fact, I think a great first assignment for an engineer new to a team is to update the onboarding documentation. They'll see exactly what's missing as they try to follow along, in a way that veterans on the team never will.

When experienced folks are tasked with writing documentation, they don't need the documentation, so they will tend to hand-wave and make leaps in explaining things because they're not truly starting from scratch like a new person is.

Documentation written with a fresh perspective is so helpful, and of course, living documents are best. Establish a virtuous cycle where each new person leaves the docs a little better than they found them.

Let developers do "off the board" work

When sprints are well planned and sustainably paced, it's natural to have developers who have finished all of their committed backlog items with time to spare. On occasions like this, I've seen teams where the Scrum Master—or some other authority figure—reflexively looks for additional items in the backlog to "pull into" the sprint to keep the developers busy. I think this is a mistake, and I'll explain why.

For Agile teams that are organizing their work in sprints, it's traditional to have some sort of visual board or central place that anyone on the team or anyone interested in the team's work can go to see which items are being worked on and the progress of the items. Some people get uncomfortable with the idea of work happening off the board, where it's not officially recognized as work that counts. They want to see each developer with at least one assigned item on the board that is in progress right up until the end of the sprint. That's how we know that we're getting maximum velocity and value from the team, right?

But I think it's important to normalize the idea that people need time to work on things that are "off the board" for a couple of reasons:

  1. People always have work to get done that is not directly related to a team goal.
  2. Continuous sprinting is not sustainable.

Here are a few legitimate things to work on that don't belong on a sprint board:

  • Training
  • Preparing for a presentation
  • Proof of concept / demo of an intriguing tool, framework, technology
  • Updating documentation that's been bugging you
  • Spikes into performance improvements
  • Cleaning out your inbox
  • For companies that do some sort of periodic "goal setting" for each employee, let people work on goals from their list
  • In general…things that don't involve QA

I think that last point is important, because people will often talk about downtime being useful for addressing technical debt. But in these end-of-sprint downtime scenarios, that only works as long as you're not adding to the regression-testing burden of any QA folks who are still testing development work completed during the current sprint. A big technical-debt-reducing refactor is not something we want to check in casually, without QA time reserved for making sure it hasn't blown anything up (unless your codebase has truly superb automated regression tests in place).

It is incumbent on the developers to identify their own priorities that they can pull from when they have extra time. Don't ask for more backlog items if you have work in mind that's off the board. For Scrum Masters, the go-to move should not be to pull in more backlog items without asking if the developer has any off-the-board work that can keep them busy.

Apart from recognizing the reality that developers have work to do that isn't directly related to a product backlog, it's important to reflect on the idea of sustainable pace and what "sprinting" really means anyway.

What does the word sprint mean outside the software world? It's a short, high intensity run. In the real world, you don't finish a sprint and then instantly start sprinting again. You have to catch your breath and take a pause before you can sprint again. Otherwise you'll collapse. It's important that software teams do this too—take a breather before the next sprint begins.

Smooth, predictable iterations necessitate a sustainable pace and regular buffers. It's okay if we have time each sprint that is not maximally stuffed with points.