Diagnosing and Fixing Software Development Performance
Diagnosing the cause of poor performance from your engineering team is difficult and can be costly for the organization if done incorrectly. Most everyone will agree that a high performing team is more desirable than a low performing team. However, there is rarely agreement as to why teams are not performing well and how to help them improve performance. For example, your CFO may believe the team does not have good project management and that more project management will improve the team’s performance. Alternatively, the CEO may believe engineers are not working hard enough because they arrive to the office late. The CMO may believe the team is simply bad and everyone needs to be replaced.
Often times, your CTO may not even know the root causes of poor performance or even recognize there is a performance problem until peers begin to complain. However, there are steps an organization can take to uncover the root cause of poor performance quickly, present those findings to stakeholders for greater understanding, and take steps that will properly remove the impediments to higher performance. Those steps may include some of the solutions suggested by others, but without a complete understanding of the problem, performance will not improve and incorrect remedies will often make the situation worse. In other words, adding more project management does not always solve a problem with on time delivery, but it will add more cost and overhead. Requiring engineers to start each day at 8 AM sharp may give the appearance that work is getting done, but it may not directly improve velocity. Firing good engineers who face legitimate challenges to their performance may do irreversible harm to the organization. For instance, it may appear arbitrary to others and create more fear in the department resulting in unwanted attrition. Taking improper action will make things worse rather than improve the situation.
How can you know what action to take to fix an engineering performance problem? The first step in that process is to correctly define and agree upon what good performance looks like. Good performance is comprised of two factors: velocity and value.
Velocity is defined as the speed at which the team works and value is defined as achievement of business goals. Velocity is measured in story points which represent the amount of work completed. Value is measured in business terms such as revenue, customer satisfaction or conversion. High performing engineering teams work quickly and their work has a measurable impact on business goals. High performing teams put as much focus on delivering a timely release as they do on delivering the right release to achieve a business goal.
Once you have agreement on the definition of good engineering performance, rate each of your engineering teams against the two criteria: velocity and value. You may use a chart like the one below:
Once each team has been rated, write down a narrative that justifies the rating. Here are a few examples:
Bottom Left: Velocity and Value are Low
“My requests always seem to take a long time. Even the most simple of requests takes forever. And, when the team finally gets around to completing the request, often times there are problems in production once the release is completed. These problems have negatively impacted customers’ confidence in us so not only are engineers not delivering value – they are eroding it!”
Upper/Middle Left: Velocity is Good and Value is Low
“The team does get stuff done. Of course I’d like them to go faster, but generally speaking they are able to get things done in a reasonable amount of time. However, I can’t say if they are delivering value – when we release something we are not tracking any business metrics so I have no way of knowing!”
Upper Right: Velocity is High and Value is High
“The team is really good. They are tracking their velocity in story points and have goals to improve velocity. They are already up 10% over last year. Also, they instrument all their releases to measure business value. They are actively working with product management to understand what value needs to be delivered and hypothesize with the stakeholders as to what features will be best to deliver the intended business goal. This team is a pleasure to work with.”
Unknown Velocity and Unknown Value
“I don’t know how to rate this team. I don’t know their velocity; its always changing and seems meaningless. I think the team does deliver business value, but they are not measuring it so I cannot say if it is low or high.”
With narratives in hand it’s time to begin digging for more data to support or invalidate the ratings.
Diagnosing Velocity Problems
Engineering velocity is a function of time spent developing. Therefore, the first question to answer is “what is the maximum amount of time my team is able to spend on engineering work under ideal conditions?”
This is a calculated value. For example, start with a 40 hour work week. Next, assuming your teams are following an Agile software development process, for each engineering role subtract out the time needed each week for meetings and other non-development work. For individual contributors working in an Agile process that number is about 5 hours per week (for stand up, review, planning and retro). For managers the number may be larger. For each role on the team sum up the hours. This is your ideal maximum.
Next, with the ideal maximum in hand, compare that to the actual achievement. If your teams are not logging hours against their engineering tasks, they will need to do this in order to complete this exercise. Evaluate the gap between the ideal maximum and the actual. For example, if the ideal number is 280 hours and the team is logging 200 hours, then the gap is 80 hours. You need to determine where that 80 hours is going and why. Here are some potential problems to consider:
- Teams are spending extra time in planning meetings to refine requirements and evaluating effort.
- Team members are being interrupted by customer incidents which they are required to support.
- The team must support the weekly release process in addition to their other engineering tasks.
- Miscellaneous meeting are being called by stakeholders including project status meetings and updates.
As you dig into this gap it will become clear what needs to be fixed. The results will probably surprise you. For example, one client was faced with a software quality problem. Determined to improve their software quality, the client added more quality engineers, built more unit tests, and built more automated system tests. While there is nothing inherently wrong with this, it did not address the root cause of their poor quality: Rushing. Engineers were spending about 3-4 hours per day on their engineering tasks. Context switching, interruptions and unnecessary meetings eroded quality engineering time each day. As a result, engineers rushing to complete their work tasks made novice mistakes. Improving engineering performance required a plan for reducing engineering interruptions, unnecessary meetings, and enabling engineers to spend more uninterrupted time on their development tasks.
At another client, the frequency of production support incidents were impacting team velocity. Engineers were being pulled away from their daily engineering tasks to work on problems in production. This had gone on so long that while nobody liked it, they accepted it as normal. It’s not normal! Digging into the issue, the root cause was uncovered: The process for managing production incidents was ineffective. Every incident was urgent and nearly every incident disrupted the engineering team. To improve this, a triage process was introduced whereby each incident was classified and either assigned an urgent status (which would create an interruption for the team) or something lower which was then placed on the product backlog (no interruption for the team). We also learned the old process (every incident was urgent) was in part a response to another velocity problem; stakeholders believed that unless something was considered urgent it would never get fixed by the engineering team. By having an incident triage process, a procedure for when something would get fixed based on its urgency, the engineering team and the stakeholders solved this problem.
At AKF, we are experts at helping engineering teams improve efficiency, performance, fixing velocity problems, and improving value. In many cases, the prescription for the team is not obvious. Our consultants help company leaders uncover the root causes of their performance problems, establish vision and execute prescriptions that result in meaningful change. Let us help you with your performance problems so your teams can perform at their best!