What is Quality Engineering?

Quality Engineering encompasses all the practices for improving product quality that you incorporate into your end-to-end (E2E) product delivery process. The approach to optimizing product quality has changed dramatically in recent years: quality is no longer treated primarily as an “assurance” activity that happens after coding/engineering is complete, but as an activity embedded throughout the product engineering lifecycle. The idea is that there are multiple benefits in shifting from Quality Assurance at the end of the development cycle to Quality being engineered into every phase of the process, from ideation to customer delivery.

The practice of embedding quality throughout the product engineering lifecycle is commonly referred to as “shifting left,” i.e. embedding testing and other quality practices more completely and deliberately in the early phases of the lifecycle. Although “shifting left” is almost synonymous with starting quality practices earlier in the cycle, it generally also refers to E2E embedded quality approaches. That is, it encompasses many other innovations in approaches and tooling for achieving high product quality efficiently.

Part of this shift to continuous quality is due to the ongoing evolution of software development from waterfall/stage gate type processes to a model of continuous incremental (agile) delivery. This evolution has been enabled and accelerated by innovations like containerization, the ‘explosion’ of public cloud PaaS services, distributed (often called “microservices based”) architectures (decomposing the architecture allows smaller components to be tested independently) and improvements in continuous code integration and delivery automation tools. All of these innovations are powerful, but they will not optimally speed up delivery of new capabilities to customers unless quality is also embedded, automated and incremental, since a separate quality cycle after engineering is ‘complete’ would limit speed even with other good agile practices in place.

The E2E quality philosophy is roughly that quality goes from a point in time activity to a continuous activity from the development of the product improvement concept (ideally in the format of a hypothesis to be tested) through delivering to the end customer. Anytime you take a responsibility that was assigned to one team as their primary job and make it everyone’s job, you are of course at risk that it is no one’s job and quality deteriorates. In fact, as more companies make this shift, there is often NO one person or team in the organization that has “Quality” or “Quality Assurance” in their title.

In order to successfully make this shift to incremental and continuous quality, there are a number of aspects to consider and muscles to build. Some of the key dimensions to consider or put in place include:

  • A common definition of the types of testing to be done at each phase of the agile cycle
  • The ability to track and measure defects found by phase
  • Automation frameworks to accelerate test development
  • Measurement of code coverage by automated tests
  • Clear definition of service interaction patterns so that services know how to behave if another service is unavailable
  • Disciplined engineering practices including: code review and test driven development 
  • Clear and enforced practices regarding customer impacting issues
  • Incident management process including post-mortems (~70% of incidents are caused by change in production)
  • Triage of defects found in later phases of development, especially in production, given the cost and impact
  • Vocal management support of achieving continuous quality practices, especially in established organizations that are moving from a more traditional QA model
  • Retraining of existing quality staff towards automation and enablement rather than hands on testing (especially manual testing)
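
As one illustration of tracking and measuring defects by phase, here is a minimal Python sketch. The phase names, the record fields (`id`, `found_in`) and the sample records are assumptions made for the example, not a prescribed schema; in practice this data would come from your defect tracker.

```python
from collections import Counter

# Hypothetical lifecycle phases, ordered from earliest to latest.
PHASES = ["design", "development", "integration", "staging", "production"]


def defects_by_phase(defects):
    """Count defects by the phase in which each one was found."""
    counts = Counter(d["found_in"] for d in defects)
    return {phase: counts.get(phase, 0) for phase in PHASES}


def production_escape_rate(defects):
    """Share of all recorded defects that were first found in production."""
    if not defects:
        return 0.0
    escaped = sum(1 for d in defects if d["found_in"] == "production")
    return escaped / len(defects)


# Illustrative records only, not real data.
sample = [
    {"id": 1, "found_in": "development"},
    {"id": 2, "found_in": "development"},
    {"id": 3, "found_in": "integration"},
    {"id": 4, "found_in": "production"},
]

print(defects_by_phase(sample))
print(production_escape_rate(sample))
```

Watching the production share trend downward over time is one simple signal that defects are being caught earlier in the cycle.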

Most organizations are unfortunately not able to start from scratch and rebuild their application, and therefore must shift their quality practices while the plane is still in flight. A common approach is to establish a small central group of quality experts that can help bootstrap, train and enable teams across a larger engineering organization. The objective of this small central group is to “teach the stream-aligned engineering teams to fish” by helping them establish processes, tools, frameworks and metrics that support E2E quality. In addition, existing QA engineers must be evaluated for their aptitude and motivation to learn automation (and for a potential path to becoming full-fledged application developers). During the transitional period, anyone who continues to focus solely on testing activities in a scrum team should remain embedded in that scrum team. As is common in change initiatives, not everyone will successfully make the change and that is ok, but as a business you must manage the risks, such as near-term attrition of QA people before new approaches are fully established.

Why is Quality Engineering important?

Customers care about quality! 

Ok, now that we have that out of the way, there is a great deal of variability in the cost of obtaining a certain quality level, and that cost can be significantly reduced by shifting to an embedded E2E quality mindset and process. The objective of this philosophy is to create fewer bugs in total and to find defects earlier in the life cycle. Research regarding the cost to fix a defect by phase of development varies but is consistent in that:

  • The cost to fix a bug becomes increasingly large as the idea moves through product development from concept, through design and engineering to being live for customers.
  • The rule of 10 is a good guide for thinking about the cost of quality as implementation progresses. Roughly this rule states that at each phase of implementation the cost to fix a defect is 10x higher than the previous phase. 
  • Lastly the cost to fix a bug at the ‘end of the line’ (in production) is orders of magnitude higher than in the early phases of idea development and can range from 30 to 100x or more expensive to fix after the code is live compared to at the beginning of the ideation phase.
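
To make the rule of 10 concrete, the short Python sketch below computes the relative cost to fix a defect at each phase. The four phase names follow the progression described above; treating them as exactly four discrete steps and the function name `relative_fix_cost` are assumptions made for illustration, and the multiplier is a rule of thumb rather than a measured value.

```python
# Phases taken from the progression described above; treating them as
# four discrete steps is an assumption for illustration.
PHASES = ["concept", "design", "engineering", "production"]


def relative_fix_cost(phase, multiplier=10, phases=PHASES):
    """Cost to fix a defect in `phase`, relative to fixing it in the first phase."""
    return multiplier ** phases.index(phase)


for phase in PHASES:
    print(f"{phase}: {relative_fix_cost(phase)}x")
```

Because published estimates vary, the multiplier is a parameter: a smaller value models the more conservative figures, while the default of 10 reproduces the rule of thumb.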

In addition to cost, antiquated quality cycles that run after development completes slow down time to market, and speed to market is imperative for all companies that wish to remain competitive.

Key points for CEOs and Board members

  1. If you have not yet made this shift to continuous quality, discuss it with your CTO and start the journey! It often makes the most sense to embark on this when you are modernizing your architecture, as the degree to which you can achieve utopian end-to-end quality depends on the architecture of your system.
  2. As previously outlined, establishing a small Enabling function to help support and guide the shift to continuous quality practices will help make the transition successful.
  3. You WILL have defects, well – as long as you intend to stay in business…stop asking “why do we have bugs?” or “why do we have so many bugs?” It is valid to ask “why did we have the SAME bug?” a second time but earlier questions are not useful in driving improvement.
  4. If your customer quality is poor – stop blaming QA. I have often heard “our QA team is lousy, they didn’t catch XYZ critical bug.” While this may be true, remember QA is a last minute diaper for $hit code. If ongoing quality is poor, all the more reason to start in earnest on this shift to E2E quality and work on improving your underlying architecture and code instead of blaming QA for the dirty diaper they have been handed. 
  5. Following on the theme of not defaulting to blaming QA, the architectural underpinnings of the application are CRITICAL to making good quality possible in an efficient way. For example, a long-lived, large and regularly changed monolithic code base commonly becomes harder over time to understand, change and troubleshoot. In this situation, high quality is usually harder to achieve given the level of complexity and dependencies, and the cost of quality is generally higher since testing needs to be relied on more heavily at the end of the implementation cycle.
  6. Encourage your team to measure quality (defects introduced) THROUGHOUT the lifecycle AND report on it. This reinforces the importance of finding issues earlier in the cycle and lets you see, transparently and objectively, how successfully you are achieving this outcome. It’s also useful to monitor the actual hands-on-keyboard time it takes to fix defects, as this time increases with code complexity/fragility.
  7. Quality goals should be shared between Engineering and Product. Another common issue in achieving quality goals is misalignment, where Product is purely or heavily incented on pushing out new features but does not have joint goals with Engineering regarding customer quality (e.g. incidents, production defects, uptime etc.).
  8. Vocal senior management support of “the what and why” of making a shift to continuous quality is critical. This is true of any significant cross-organizational change, and its importance cannot be overstated. Shifting from traditional ‘end of the line’ quality is as much a people and cultural shift as it is a matter of process, tools and automation. Especially in the early phases of the transition, over-communication on the objectives and progress is critical to success.

Common CEO questions about product quality

What is the right number of defects in production? And upon release? 

The right quality level depends to a large degree on what your product does. If you are sending rockets to Mars, you need higher quality and fewer bugs than a typical web application. But remember, as long as you are in business, you will have defects. Some research exists regarding bugs per lines of code in pre-production as well as production. For example, Microsoft previously achieved ~0.5 defects per 1000 lines of code in production and the space shuttle program achieved 0 defects in 500K lines of code. It is important to consider the law of diminishing returns here as well; for example, “a 5% improvement in quality may come at a 50% higher cost per feature.” So balancing quality level and cost against customer outcomes is critical to managing cost of quality and time to market.

Instead of having a hard target for defects in production, it is more useful to set objectives that align to customer outcomes, such as error rates, recurrence of the same or similar bugs, uptime, change failure rate (CFR) and mean time to restore (MTTR) after a customer impacting incident. It is still useful to measure and manage production defects, but the prioritization and engineering capacity dedicated to this work should be guided by the key customer outcomes.
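
Two of these outcome metrics, CFR and MTTR, are simple to compute once deployments and incidents are recorded. The Python sketch below shows one possible calculation; the record fields (`caused_incident`, `started_at`, `restored_at`) and the sample data are hypothetical and stand in for whatever your deployment and incident tooling actually captures.

```python
from datetime import datetime, timedelta


def change_failure_rate(deployments):
    """Fraction of deployments that caused a customer impacting incident."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["caused_incident"])
    return failed / len(deployments)


def mean_time_to_restore(incidents):
    """Average time between the start of an incident and service restoration."""
    if not incidents:
        return timedelta(0)
    total = sum(
        (i["restored_at"] - i["started_at"] for i in incidents),
        timedelta(0),
    )
    return total / len(incidents)


# Illustrative records only, not real data.
deployments = [
    {"id": "d1", "caused_incident": False},
    {"id": "d2", "caused_incident": True},
    {"id": "d3", "caused_incident": False},
    {"id": "d4", "caused_incident": False},
]
incidents = [
    {"started_at": datetime(2024, 1, 5, 10, 0), "restored_at": datetime(2024, 1, 5, 10, 30)},
    {"started_at": datetime(2024, 1, 9, 14, 0), "restored_at": datetime(2024, 1, 9, 15, 30)},
]

print(change_failure_rate(deployments))
print(mean_time_to_restore(incidents))
```

Reporting both numbers over the same period keeps speed and stability in view together, rather than optimizing one at the expense of the other.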

The Engineering team just wants to pump out new features and the QA team keeps them in check. If we get rid of QA, don’t we have the fox guarding the henhouse?

A major argument for a separate QA team used to be ‘checks and balances.’ This is an antiquated mindset that breeds affective conflict between teams, aka BAD conflict that results in us vs them, poor collaboration, finger pointing and tension. The risk of engineering optimizing for speed at the expense of quality can be addressed with the previously described metrics, which balance speed and quality.

What Quality Engineering is not

Quality Engineering is NO longer an end of the line activity!

Hopefully you have gleaned that quality is moving fast from an end of cycle quality assurance focus to an E2E, embedded activity. Although this can be a bit scary as you wean yourself off the idea of quality as a security blanket, there are many very concrete tools, approaches and metrics that exist to help make this transition successful.

Quality Engineering is NOT just an Engineering activity  

Quality is impacted by other practices (e.g. site reliability engineering capabilities and other operational processes), but beyond actual code quality, the most critical factor in achieving quality is Product Management’s capability to define what is to be built. I have seen cases where 50%+ of production defects were issues with the product definition and/or with establishing a common understanding of what to build. There are many good practices to address this, and having teams that are collocated or can easily collaborate in real time goes a long way toward improving both the requirements definition and the common understanding of said definition.

If you have not started your journey towards E2E quality practices, start today! Please contact us and we can help.