Don't Let the Tail Wag the Dog
Over the years, we have often heard clients cite a use case they want to avoid, either in product design sessions or as the justification for architectural choices already made in existing products. These use cases are frequently given more credence than objective data supports. They become boogeyman legends: edge cases that can drive poor architectural choices.
One of our clients was debating the benefit of multiple live sites with customers pinned to the nearest site to minimize latency. The availability benefits of multiple live sites are irrefutable, but the customer experience benefit of lower latency was questioned. This client had millions of clients of its own spread across the country. The notion of pinning a client to a “home” site nearest them raised the question, “What happens when the client travels across the country?” The answer is to keep directing them to that same home site. The traveling client will experience more latency for the duration of the trip. The proportion of clients who split their time evenly between the two coasts is vanishingly small, so keep it simple. Have a workaround for clients who permanently move to a location served by a different site. Client data resides in more than one location for disaster recovery (DR) purposes anyway, right?
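The home-site rule above can be sketched in a few lines. This is a minimal illustration, not the client's implementation: the site names, the Customer type, and the route and rehome functions are all hypothetical, and the sketch assumes customer data is already replicated to the other site for DR.

```python
from dataclasses import dataclass

# Hypothetical site names; the article does not name the actual sites.
SITES = ("east", "west")


@dataclass
class Customer:
    customer_id: str
    home_site: str  # assigned once, e.g. at registration, to the nearest site


def route(customer: Customer, current_location_site: str) -> str:
    """Return the site that should serve this request.

    The customer's current location is deliberately ignored: a traveling
    customer is still served by the home site and accepts extra latency.
    """
    return customer.home_site


def rehome(customer: Customer, new_site: str) -> Customer:
    """Workaround for a permanent move: reassign the home site.

    Assumes the customer's data is already replicated to new_site for DR.
    """
    if new_site not in SITES:
        raise ValueError(f"unknown site: {new_site}")
    return Customer(customer.customer_id, new_site)
```

The routing function stays trivial on purpose. The rare permanent move is handled by an explicit, occasional operation rather than by logic on every request.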
This client also had hundreds of service specialists who would at times access client accounts and take actions on their behalf, and these service specialists were located near the west coast. Objections were raised about the latency a west coast service specialist would encounter when acting on behalf of an east coast client whose data was hosted near the east coast. Millions of clients. Hundreds of service specialists. The math is not hard. The needs of the many outweigh the needs of the few.
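To make the math concrete, here is a back-of-the-envelope sketch. The figures are illustrative assumptions only, chosen to match "millions" and "hundreds"; they are not the client's actual numbers.

```python
# Illustrative assumptions only; none of these numbers come from the client.
CUSTOMERS = 2_000_000   # "millions of clients"
SPECIALISTS = 300       # "hundreds of service specialists"


def specialist_share(customers: int, specialists: int) -> float:
    """Fraction of the total user population made up of service specialists."""
    return specialists / (customers + specialists)


share = specialist_share(CUSTOMERS, SPECIALISTS)
print(f"Service specialists are {share:.4%} of all users")
```

Head count is only a proxy, since a specialist may generate far more requests than a typical client. The objective data to ask for is the share of total requests that originate from specialists acting on accounts hosted at a distant site.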
A different client had a concern about data consistency upon new user registration for their service. To ensure a new customer could transact immediately, the team decided to deploy a single authentication server. This precluded the possibility of a transaction following registration hitting an authentication server that had not yet received the registration data. Intentionally deploying a single point of failure (SPOF) should have raised immediate objections, but it did not. The team also deployed a passive backup server that required manual intervention to take over.
The new user process flow was later revealed to account for less than 3% of overall transactions. When the SPOF authentication server failed, the other 97% of transactions suffered an impactful outage along with the 3% from new users. Designing a workaround for new users while employing a write master with multiple load-balanced, read-only slaves would have provided far better availability. The needs of the many outweigh the needs of the few.
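One possible shape for such a workaround is to send reads for recently registered users to the write master and everyone else to the read-only pool. The sketch below is an assumption-laden illustration, not what the client built: the AuthRouter class, the grace period, and the in-memory registration map are all hypothetical, and a real deployment would likely track recent registrations in the session or a shared store.

```python
import itertools
import time


class AuthRouter:
    """Route authentication reads to replicas, except for brand-new users."""

    def __init__(self, master, replicas, replication_grace_seconds=5.0,
                 clock=time.monotonic):
        self.master = master
        self._replicas = itertools.cycle(replicas)  # simple round-robin
        self.grace = replication_grace_seconds      # assumed replication lag bound
        self.clock = clock
        self._registered_at = {}  # user_id -> time of registration

    def register(self, user_id):
        """Registrations are writes, so they always go to the master."""
        self._registered_at[user_id] = self.clock()
        return self.master

    def server_for_read(self, user_id):
        """Pick the server that should authenticate this user."""
        registered = self._registered_at.get(user_id)
        if registered is not None and self.clock() - registered < self.grace:
            # New user: replicas may not have the registration yet.
            return self.master
        return next(self._replicas)


router = AuthRouter("master", ["replica-1", "replica-2"])
router.register("new-user")
print(router.server_for_read("new-user"))       # served by the master
print(router.server_for_read("existing-user"))  # served by a replica
```

With this arrangement a master failure still blocks new registrations, but existing users can keep authenticating against the replicas.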
It is important to remain open-minded during early design sessions. It is also important to follow architectural principles in the face of such use cases. How can one balance these potentially conflicting concepts?
• Ask questions best answered with objective data.
• Strive for simplicity; shave with Occam’s Razor.
• Validate whether the edge case is a deal breaker for the product owner.
• Propose a workaround that addresses the edge case while optimizing the architecture for the majority use case and sound principles.
Catering to the needs of the business while adhering to architectural standards is a delicate balancing act, and compromises will be made. Everyone looks at the technologist when a product encounters a failure. Know when to hold the line on sound architectural principles that safeguard product availability and user experience. The product owner must understand and acknowledge the architectural risks resulting from product design decisions. The technologist must communicate these risks to the product owner along with objective data and options. A failure to communicate effectively can lead to the tail wagging the dog. Do not let that happen.
With 12 years of product architecture and strategy experience, AKF Partners is uniquely positioned to be your technology partner.