4 years ago

How We Handle Bug Fixes and Rework

Today I want to tell you more about our *relationship* with bugs, fixes and smaller work. The way a software development company prioritizes its backlog depends on a number of things, and I will give an example of how we've explored various solutions to address changing priorities. There is no single "How to prioritize a backlog" recipe, and I want to caution everyone against blindly following a practice that worked for someone else. In each particular case, everything depends on the context in which the company finds itself at a given moment: how many developers and customers it has, and what its strategic priorities are. In short, smart pragmatism, not a ready-made "how-to", is the only universal tool for tackling a specific business challenge.

The Product Backlog: New Things and Fixes

Speaking from our experience, a product backlog usually includes the following two groups of work items:

-new features (or fundamental re-work of existing features), which require at least 3-4 months of work for 5-6 developers and QAs

-fixes and reworks to existing product features (based on feedback from leads/customers, and from the team)

In Targetprocess' 10-year history, there was a time when one product owner prioritized the backlog both for new features and for fixes and reworks. When we were a relatively young company, the product owner's main focus was on new features, and the smaller fixes and re-works for existing functionality used to be swept under the rug. The limit of items allowed in the Planned state on our Kanban board was 20, and the clean-ups were triggered by the following logic: "Hey, we've got a lot more than 20 items in the Planned state! Hmm... we've actually got 50 bugs there! All of them are small bug fixes and enhancements, so let's just pull them from the backlog and fix them, each of us taking whatever we want". So developers would pull the bugs and fix them, and a new build would be out. This practice worked well, but there was one link missing.

The Imbalance in Controlling a Backlog

With time, we gained more customers and more functionality in the product, which meant more areas where fixes and re-works begged to be done. Some people in the company who interface with customers and absorb their requests (or complaints) started feeling the pressure, and they wanted to get fixes done that the product owner didn't consider that important. It's not only about the pressure, of course. Customers' needs must be taken into account, because they want the product to work better for them. And from where he was sitting, the product owner couldn't get a clear idea of how crucial each small request was. The disconnect between "request points of entry" and "control over backlog" became too obvious at one point. We still had one product owner controlling the backlog, and he prioritized it according to his vision. He did have an idea of which things were most requested by customers, of course, but new functionality would still get a higher priority. The ultimate clean-up days, where developers would pick and fix bugs at random without knowing what customers wanted, were no longer of help.

We then understood that we needed a new approach to handling those smaller bugs and reworks. The backlog now had to be prioritized not only by what the product owner had in mind for the new features, but also by the priorities that were defined essentially by our customers and leads, in their exchanges with the product specialists and with our support team. Besides, the new functionality also needed some re-work, and we had split opinions on what was more important and what should be done next. Something had to change in terms of backlog ownership. The person (or people) who would control the backlog needed to be up to date with the priorities as identified by multiple contributors: product specialists, the support team and the product owner (or, later, the product board).

Who Will Do the Bug Fixes and Smaller Re-work? The Emergency Team

We also realized that we needed a dedicated task force for those re-works, not just clean-up days. That's when we formed an Emergency Team. At first we used the rotation principle for it: each of our Feature Teams would act as the emergency team for 2 weeks. Then, at one point, it became clear that a simple 2-week rotation was not a good idea. It takes time for a team to fully settle into the context of its work; a newly formed team idles at first, just like an engine, and then gains speed. The rotation meant that the team had to be switched out at exactly the moment it reached that speed. A mere time-boxed rotation ignored this important nuance, so we decided to have a permanent emergency team. The permanent emergency team, in our understanding, was supposed to work as a point of entry for new developers, so they could get to know the codebase better. However, we didn't take into account that newcomers needed to check on some things with the "old" developers. That hindered the work, and we eventually switched back to the rotation principle, this time with a period of ~1 month. One month seems to be a sensible time for the emergency team to gain and sustain optimal productivity momentum.

The backlog for this Emergency Team was at first prioritized by product owners who would rotate weekly. One week that would be one of the product specialists, then someone from the support team, then a product owner or one of the UX designers. Then we realized that a single rotating product owner was not working well: this person was not able to keep all the priorities in mind. Now the backlog is formed from 3 queues: Support, Product Specialists and Product Owner, and this work is prioritized at a meeting with the emergency team's lead developer, the product owner, product specialists and support. Roughly, each of the sources has a 25% share in the Emergency Team backlog. Currently this team is doing small things to improve the product on-boarding experience for potential customers, and they did the print cards functionality in the recent 3.2.4 release. For comparison, feature teams are working on large features such as timelines or lists.
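For readers who like to see a rule written down, here is a rough Python sketch of how a backlog could be assembled from several queues with a fixed share per source. It is only an illustration, not how we actually do it: in practice the prioritization happens in a meeting, not in a script. The function name `build_backlog`, the item names and the capacity of 8 are all made up, and the rule for the leftover slots (fill them round-robin from whichever queues still have items) is purely an assumption.

```python
from collections import deque


def build_backlog(queues, capacity, share=0.25):
    """Pick up to `capacity` items, giving each queue roughly `share` of the slots.

    `queues` maps a source name to its items, already ordered by priority.
    Leftover slots are filled round-robin (an assumption made for this sketch).
    """
    per_queue = int(capacity * share)  # guaranteed slots per source
    remaining = {name: deque(items) for name, items in queues.items()}
    backlog = []

    # First pass: every source gets its guaranteed share.
    for name, queue in remaining.items():
        for _ in range(min(per_queue, len(queue))):
            backlog.append((name, queue.popleft()))

    # Second pass: fill any leftover slots round-robin across the sources.
    while len(backlog) < capacity and any(remaining.values()):
        for name, queue in remaining.items():
            if queue and len(backlog) < capacity:
                backlog.append((name, queue.popleft()))

    return backlog


if __name__ == "__main__":
    # Hypothetical items; the three source names follow the queues in the post.
    example = {
        "support": ["s1", "s2", "s3"],
        "product_specialists": ["p1", "p2", "p3"],
        "product_owner": ["o1", "o2", "o3"],
    }
    for source, item in build_backlog(example, capacity=8):
        print(source, item)
```

The point of the guaranteed first pass is that no single source can crowd out the others, which is exactly the problem we had when one person prioritized everything.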