AKF Partners


Why Airline On-Time Reliability Is Not Likely To Improve

As a frequent traveler who can exceed 250K air miles per year, I experience flight delays almost every week. Waiting for a delayed flight recently gave me time to think about how airlines experience the multiplicative effect of failure just like our systems do. If you're not familiar with this phenomenon, think back to the holiday lights your parents put up. If any one bulb burned out, the whole string went dark: all 50 lights were wired in series, so either all of them worked or none of them did. Our systems are often designed the same way, requiring many servers, databases, network devices, etc. to work for every single request. If any device or system in that request-response chain is unavailable, the entire process fails. The problem is that if five devices each have 99.99% availability, the total availability of the overall system is that availability raised to the power of the number of devices, i.e. 99.99%^5 = 99.95%. This has become even more of a problem with the advent of microservices (see our paper on how to fix this problem, http://reports.akfpartners.com/2016/04/20/the-multiplicative-effect-of-failure-of-microservices/). Now, let's get back to airlines.
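
To make the multiplicative effect concrete, here is a minimal Python sketch (illustrative numbers only, matching the example above) that computes the availability of components wired in series:

    # The chain only works if every component works, so availability
    # in series is the product of the individual availabilities.
    def chain_availability(availabilities):
        result = 1.0
        for a in availabilities:
            result *= a
        return result

    # Five devices, each 99.99% available, as in the example above.
    print(chain_availability([0.9999] * 5))  # ~0.9995, i.e. 99.95%

The same math applies to the flight schedule below: if each of five back-to-back segments were independently 90% on-time (an assumed number), the last flight of the day would leave on time only about 59% of the time (0.9^5 ≈ 0.59).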

Airlines spend a lot on airplanes (the average Boeing 737-900 costs over $100M, http://www.boeing.com/company/about-bca/index.page#/prices), and they want to make use of those assets by keeping them in the air, not on the ground. To accomplish this, airlines schedule airplanes on back-to-back trips. For example, a recent United flight departing SFO for SAN was being serviced by Airbus A320 #4246. This aircraft had the following schedule for the day.

Depart           Arrive
SFO  8:05am      SAN  9:44am
SAN 10:35am      SFO 12:15pm
SFO  1:20pm      SEA  3:28pm
SEA  4:18pm      SFO  6:29pm
SFO  7:30pm      SAN  9:05pm

If I'm on the last flight of the day (SFO->SAN), each previous flight has to be "available" (on-time) for me to have any chance of my request (flight) being serviced properly (on-time). As you can undoubtedly see, this aircraft schedule (SFO->SAN->SFO->SEA->SFO->SAN) is very similar to a request-response chain for a website (browser->router->load balancer->firewall->web server->app server->DB). The availability of the entire system is the product of the availability of each component. Is it any wonder airlines have such dismal on-time performance? See the chart below for August 2016, with El Al coming in at a dismal 36.81%, my preferred airline, United, at 77.88%, and the very best, KLM, at only 91.07%.

[Chart: August 2016 on-time performance by airline] (Source: http://www.flightstats.com/company/monthly-performance-reports/airlines/)

So, how do you fix this? In our systems, we scale by splitting along either common attributes such as customers (the Z-axis) or different attributes such as services (the Y-axis). To maintain high availability, we ensure that we have fault isolation zones around the splits. Think of this as taking the string of 50 lights, breaking it into 5 sets of 10, and ensuring that no set affects another. This allows us to scale up by adding sets of 10 lights while maintaining or even improving availability.
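
As a rough sketch of the light-string arithmetic (the per-bulb reliability here is made up), compare one long series string against independent, fault-isolated sets:

    # Assume each bulb works with probability 0.999 (a made-up number).
    p_bulb = 0.999

    # In series, all 50 bulbs must work for any light to shine.
    p_all_50_in_series = p_bulb ** 50    # ~0.951

    # Fault-isolated, a set of 10 depends only on its own bulbs.
    p_one_set_of_10 = p_bulb ** 10       # ~0.990

    print(p_all_50_in_series, p_one_set_of_10)

Any given set of 10 is far more likely to be lit than the single 50-bulb string, and a failure in one set no longer darkens the other four.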

Airlines could improve customer-perceived on-time reliability by essentially splitting their flights by customers (Z-axis). Instead of one large airplane such as a Boeing 737-900 with 167 passengers, they could fly two smaller planes such as the Embraer ERJ170-200, each with only 88 passengers. When one plane is delayed, only half the passengers are affected. Airlines are not likely to do this because larger airplanes generally have a lower cost per seat than smaller ones. The other way to increase on-time reliability would be to fault isolate the flight segments. Instead of stacking flights back-to-back with less than one hour between arrival and the next scheduled departure, increase this window enough to make up for minor delays. Alternatively, at hub airports, introduce a swap aircraft into the schedule to break the dependency chain. Both options are unlikely to appeal to airlines because they, too, would increase the cost per seat.

The bottom line is that there are clearly ways airlines could increase their on-time performance, but they all increase costs. We advise our clients that they need the amount of performance and availability that improves customer acquisition or retention, but no more. Building more scalability or availability into the product than is needed takes resources away from features that serve the customer and the business better. Airlines are in a similar situation: they need the amount of on-time reliability that doesn't drive customers to competitors, but no more than that. Otherwise they would have to either raise ticket prices or reduce their profits. Given the current dismal numbers, I just might be willing to pay more for better on-time reliability.



Making Agile Predictable

One of the biggest complaints that we hear from businesses implementing agile processes is that they can’t predict when things will get delivered. Many people believe, incorrectly, that waterfall is a much more predictable development methodology. Let’s review a few statistics that demonstrate the “predictability” of the waterfall methodology.

A study of over $37 billion (USD) worth of US Defense Department projects concluded that 46% of the systems so egregiously failed to meet the real needs (although they met the specifications) that they were never successfully used, and another 20% required extensive rework (Larman, 1995).

In another study, of 6,700 projects, four out of five key factors contributing to project failure were found to be associated with or aggravated by the waterfall method, including an inability to deal with changing requirements and problems with late integration.

Waterfall is not a bad or failed methodology; it's just a tool, like any other, with limitations. We as the users of that tool misused it and then blamed the tool. We falsely believed that we could fix the project scope (specifications), cost (team size), and schedule (delivery date) all at once. No methodology allows for that. As shown in the diagram below, waterfall is meant to fix scope and cost. When we also try to constrain the schedule, we're destined to fail.

[Figure: waterfall fixes scope and cost, leaving the schedule to vary]

Agile, when used properly, fixes the cost (team size) and the schedule (two-week sprints), allowing the scope to vary. This is where most organizations struggle: attempting to predict delivery of features when the scope of stories is allowed to vary. As a side note, if you think you can fix the scope and the schedule and vary the cost (team size), read Brooks' 1975 book The Mythical Man-Month.

This is where the magical measurement of velocity comes into play. The velocity of a team is simply the number of units of work completed during a sprint. The primary purpose of this measurement is to provide the team with feedback on its estimates. As you can see in the graph below, it usually takes a few sprints to get into a controlled state where velocity is predictable; it then tends to rise slightly over time as the team becomes more experienced.

[Figure: velocity over successive sprints, stabilizing and then rising slightly]

Using velocity we can now predict when features will be delivered. We simply project out the best and worst velocities, which lets us demonstrate with high certainty a best and worst delivery date for the set of stories that make up a feature.

[Figure: best- and worst-case velocity projections bounding a delivery date]
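
As a minimal sketch of that projection (the velocity history and backlog size here are invented), divide the remaining work by the best and worst observed velocities:

    import math

    # Invented history: story points completed in each past sprint.
    velocities = [18, 22, 25, 24, 27, 26]
    remaining_points = 120                # stories left in the feature

    best, worst = max(velocities), min(velocities)
    fewest_sprints = math.ceil(remaining_points / best)   # 5 sprints
    most_sprints = math.ceil(remaining_points / worst)    # 7 sprints

    # With two-week sprints, quote a best and worst delivery date.
    print(f"delivery in {2 * fewest_sprints} to {2 * most_sprints} weeks")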

Velocity helps us answer two types of questions. The first is the fixed-scope question, "when will we have X feature?", to which the answer is "between A and B dates." The second is the fixed-time question, "what will be delivered by the June release?", to which the answer is "all of this, some of that, and none of those." What we can't answer are questions that fix both time and scope.

[Figure: the fixed-scope and fixed-time questions velocity can answer]

It's important to remember that agile is not a software development methodology but rather a business process. This means that all parts of your organization must buy in to and participate in agile product development. Your sales team must get out of the mindset of committing to new product features with fixed time and scope when talking to existing or potential customers. When implemented correctly, agile provides faster time to market and higher levels of innovation than waterfall, which brings greater value to your customers. The tradeoff on the sales side is giving up the long-term product commitments of the past (which were more often than not missed anyway).

By utilizing velocity, keeping the team consistent, and phrasing the questions properly, agile can be a very predictable methodology. The key is understanding the constraints of the methodology and working within them instead of ignoring them and blaming the methodology.



Data Driven Product Development

When we bring up the idea of data driven product development, almost every product manager or engineering director will quickly say that of course they do this. Probing further, we usually hear about A/B testing, but of course they only do it when they aren't sure what the customer wants. To all of this we say "hogwash!" Testing one landing page against another is a good idea, but it's a far cry from true data driven product development. The bottom line is that if your agile teams don't have business metrics (e.g. increase the conversion rate by 5% or decrease the shopping cart abandonment rate by 2%) that they measure after every release to see whether they are getting closer, then you are not data driven. Most likely you are building products based on the HiPPO: the Highest Paid Person's Opinion. In case that person isn't always around, there's this service.

Let's break this down. Traditional agile processes have product owners seeking input from multiple sources (customers, stakeholders, the marketplace). All these diverse inputs are considered and eventually combined into a product vision that helps prioritize the backlog. In the most advanced agile processes, the backlog is ranked based on effort and expected benefit. This is almost never done. And when it is done, the effort and benefit are almost never validated afterwards to see how accurate the estimates were. We even have an agile measurement for validating the effort estimate: velocity. Most agile teams don't bother with the velocity measurement and have no hope of measuring the expected benefit.

Would we approach anything else like this? Would we find it acceptable if other disciplines worked this way? What if doctors practiced medicine with input from patients and pharmaceutical companies, but without monitoring how their actions impacted the patients' health? We would probably think they got off to a good start: understanding the symptoms from the patient and the possible benefits of certain drugs from the pharmaceutical representatives. But they failed to take into account the most important part of the process: how their treatment actually performed against the ultimate goal, the improved health of the patient.

We argue that you're wasting time and effort doing anything that isn't measured against a success factor. Why code a single line if you don't have a hypothesis that this feature, enhancement, or story will benefit a customer and in turn provide greater revenue for the business? Without this feedback mechanism, your effort would be better spent taking out the trash, watering the plants, and cleaning up the office. Here's what we know about most people's gut instinct with regard to products: it stinks. Read Lund's very first Usability Maxim: "Know thy user, and YOU are not thy user." What does that mean? It means that you know more about your product, industry, or service than any user ever will. Your perspective is skewed because you know too much. Real users don't know where everything is on the page; they have to search for it. Real users lie when you ask them what they want. Real users have bosses, spouses, and children distracting them while they use your product. As Henry Ford supposedly said, "If I had asked customers what they wanted, they would have said a faster horse." You can't trust yourself and you can't trust users. Trust the data. Form a hypothesis: "If I add a checkout button here, shoppers will convert more." Then add the button and watch the data.
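
A minimal sketch of "watch the data" (the visitor and conversion counts are invented): a simple two-proportion z-test comparing conversion before and after the change.

    from math import sqrt

    # Invented counts: conversions / visitors before and after the button.
    before_conv, before_n = 800, 20000    # 4.0% conversion
    after_conv, after_n = 900, 20000      # 4.5% conversion

    p1, p2 = before_conv / before_n, after_conv / after_n
    pooled = (before_conv + after_conv) / (before_n + after_n)
    se = sqrt(pooled * (1 - pooled) * (1 / before_n + 1 / after_n))
    z = (p2 - p1) / se                    # ~2.5 with these numbers

    # |z| > 1.96 is significant at the conventional 95% level.
    print("keep the button" if z > 1.96 else "hypothesis not supported")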

This is not A/B or even multivariate testing; it is much more than that. It's changing the way we develop products and services. Remember, agile is a business process, not just a development methodology. Nothing gets shipped without an expected benefit that is measured. If you didn't achieve the expected business benefit, you either try again next sprint or you deprecate the code and start over. Why leave code in a product that isn't achieving business results? It only serves to bloat the code base and make maintenance harder. Data driven product development means we don't celebrate shipping code. That's supposed to happen every two weeks, or even daily if we're practicing continuous delivery. You haven't achieved anything by pushing code to production. When the customer interacts with that code and a business result is achieved, then we celebrate.

Data driven product development is how you explain the value of what the product development team is delivering. It's also how you ensure your agile teams are empowered to make a meaningful impact on the business. When the team is measured against real business KPIs that directly drive revenue, it raises the bar on product development. Don't get caught in the trap of measuring success by feature releases or sprint completions. The real measure of success is when the business metric shows the predicted improvement and the agile team becomes better informed about what customers really want and what makes the business better.



Definition of MVP

We often use the term minimum viable product, or MVP, but do we all agree on what it means? In the Scrum spirit of the Definition of Done, I believe a Definition of MVP is worth stating explicitly within your tech team. A quick search revealed these three similar yet different definitions:

  • A minimum viable product (MVP) is a development technique in which a new product or website is developed with sufficient features to satisfy early adopters. The final, complete set of features is only designed and developed after considering feedback from the product’s initial users. Source: Techopedia
  • In product development, the minimum viable product (MVP) is the product with the highest return on investment versus risk…A minimum viable product has just those core features that allow the product to be deployed, and no more. Source: Wikipedia
  • When Eric Ries used the term for the first time he described it as: A Minimum Viable Product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort. Source: Leanstack

I personally like a combination of these definitions. I would choose something along the lines of:

A minimum viable product (MVP) has sufficient features to solve a problem that exists for customers and provides the greatest return on investment while reducing the risk of what we don't know about our customers.

Just like no two teams implement Agile the same way, we don't all have to agree on the definition of MVP, but all your team members should agree. Otherwise, what is an MVP to one person is a full-featured product to another. Take a few minutes to discuss with your cross-functional agile team and come to a decision on your Definition of MVP.



Building Scalable Organizations

Embedded below is a presentation on building scalable organizations. The presentation walks through our model of innovation, detailing how factors such as cognitive conflict, affective conflict, and a sense of empowerment impact a company's ability to innovate. The slides then depict how, in most functionally organized companies, conflict arises when trying to deliver a service such as a SaaS product offering. The remainder of the presentation focuses on how to fix this, with some real-world examples of companies that have reorganized to improve innovation.




Explaining the ALS Ice-Bucket Challenge

By now you've undoubtedly seen all of the ALS ice-bucket challenge videos that have been popping up in our social media feeds. I, like most people, enjoy watching my friends dump buckets of ice on their heads, but what I like even more about this fund-raising campaign is that it is a brilliant example of viral growth. A quick look at the power of viral growth easily shows why the ALS Association reported raising $15.6 million as a result of the challenge, nine times what it normally raises in the same time frame. Project ALS and ALS TDI, two other ALS charities, reported donations up between 10 and 50 times normal just since the beginning of August.

For viral growth we start with a simple formula for our viral coefficient (Cv), which is essentially how effective our campaign is at spreading from one person to another. In the equation we have Fan Out, how many people one person asks to join, and Conversion Rate, the fraction of those asked who actually join. Most people making ALS ice-bucket challenge videos call out or invite 3 other people. I suspect not everyone converts, which as I understand the rules is even better for the charity, because you are supposed to donate $10 if you do the challenge and $100 if you don't. However, the Conversion Rate appears to be very high; my guess is as high as 75%. This would yield a viral coefficient of 3 x 0.75 = 2.25: for every person who produces a video and donates, 2.25 people on average also produce a video, donate, and nominate 3 others.

    Cv = Fan Out * Conversion Rate

To see the cumulative effect of this spread we turn to one more formula, Cumulative Donors. This is simply the Viral Coefficient multiplied by the Retention Rate, raised to the power of the frequency times the cycle. Retention matters a lot if you are a social media site that needs users coming back day after day, but for donations, participating in the challenge once is sufficient, so we can use 100% for that variable. One of the cleverest ideas of the ice-bucket challenge is calling people out and giving them only 24 hours to accept, which keeps the fan-out cycle very short. In our equation the frequency of usage is 1, since people only make one video, and the cycle is 1 day (24 hours).

    Cumulative Donors = (Cv * Retention Rate)^(Frequency * Cycle)

If we start with one person on day 1, it takes just a few hours past the 20th day for the number of cumulative donors to exceed the world's population of approximately 7 billion people (see the graph below). Pretty nice for a fund-raising campaign! Of course, all the interesting growth occurs in the last few days. You might even notice that this week your social media feed will be inundated with new videos.

[Graph: cumulative donors by day, exceeding 7 billion roughly three weeks in]
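
As a minimal sketch of that compounding, using the assumptions above (note the crossover day is quite sensitive to the coefficient: with Cv = 2.25 this simple model crosses 7 billion around day 29, while with near-100% conversion, Cv close to 3, it crosses around day 22, in line with the graph):

    fan_out = 3          # each donor challenges three people
    conversion = 0.75    # assumed share who accept (a guess, as above)
    retention = 1.0      # one video per person is enough for a donation
    cv = fan_out * conversion             # viral coefficient = 2.25

    new_donors, cumulative, day = 1.0, 1.0, 1
    while cumulative < 7e9:               # roughly the world's population
        new_donors *= cv * retention      # each cycle is 24 hours
        cumulative += new_donors
        day += 1
    print(day)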

What the graph above does not show is that adoption of anything, from technologies to fund raisers, really follows an 'S' curve: once adoption approaches its maximum, it flattens out. There are only so many users, customers, or donors who will ever use your product or donate to your cause. But it's certainly a fun ride getting there.

If you find this interesting you’d probably enjoy our latest book, The Power of Customer Misbehavior.



Keep Asking Why

We've written about After Action Reviews and Postmortems before, but I thought it would be worth providing an updated example of the importance of getting to true root causes. Yes, that's plural, "causes": there are almost always multiple root causes, not just one.

I recently watched a postmortem that followed AKF's recommended T-I-A process of creating a Timeline, identifying the Issues, and assigning Action items to owners. However, the list of issues included items such as "deployment script failed to stop when a node was down." While this is certainly a contributing issue, it is not a root cause: if you fix the script but not the process that allowed the script to fail in the first place, you're only solving a very narrow problem. If you dig deeper (keep asking "why"), you'll soon find truer root causes: perhaps there is no process for code reviewing scripts, no process for testing scripts, or no build-vs-buy decision for scripting tools, so you're having to home-grow all of your tooling. These root causes, once solved, will fix a much broader set of problems than a single broken script.

To be a true learning organization you need to dig deep. Get to true root cause by continuing to ask "why." You know you've reached the correct depth when solving the issue will fix many failure modes, not just the one that caused the incident.



Talent Search: Tech Lead, Princeton Review

The Princeton Review technology team is looking for a highly skilled lead engineer to help start up an Agile development team in Natick, Massachusetts. This person will be responsible for leading a multi-function team with direct reports to design the next generation of the website and online products. This is a great opportunity to help shape and grow the engineering team. The candidate should have 2 to 5 years' experience in a lead development role; experience with Web/Java framework technologies such as Spring, Struts, Hibernate, Tomcat, CXF, JSP, Tiles, HTML, JavaScript, and JavaScript MVC frameworks is desired.

Interested candidates should send their resume directly to Tom Gernon. Click here for more details on the position.



Recent Interviews

Below is a list of recent interviews with AKF Partners' Mike Fisher & Marty Abbott on a variety of topics including scalability, healthcare.gov, and customer misbehavior:

Make sure you follow Marty and Mike, and AKF's Facebook pages, to keep up to date with the latest.

@martyabbott

@MikeFisher_AKF

Power of Customer Misbehavior

AKF Partners




The Phoenix Project – Book Review

After several friends recommended the book The Phoenix Project, I at last downloaded it to my Kindle and read it on one of my transcontinental flights. I really enjoyed it and thought it was both entertaining and enlightening, which is difficult to accomplish. However, I don't believe it goes far enough in describing how the most modern and efficient technology shops run. More on that later; first, a recap.

The story line is that a fictional company, Parts Unlimited, is failing to keep up with the competition and is at risk of being split up and sold off. The root cause of the company's issues is IT. We join the story after two IT executives are fired and the hero, Bill, is asked to take over as VP of IT Operations. He is introduced to his sensei, Erik, a tech / business / process genius, who leads Bill down the path to enlightenment by comparing IT Operations to a manufacturing plant. Through this journey the reader is introduced to concepts such as kanban, work-in-process (WIP), work centers, the Theory of Constraints, Lean Startup, and the overarching concept of DevOps. In fact, the book ends with most of the main characters at a party where Bill reflects, "Maybe everyone attending this party is a form of DevOps, but I suspect it's something much more than that. It's Product Management, Development, IT Operations, and even Information Security all working together and supporting one another."

I'm a fan of DevOps, but to me the integration of tech ops and development is one small step for technology driven companies. To be truly nimble and effective, teams need to incorporate development, tech ops, product management, quality assurance, and any other function, such as marketing or analytics, needed to make the team autonomous. We've written about this concept several times: The Agile Org, The Agile Org Solution, and The Agile Exec. We call these teams PODs, and most importantly they need to be aligned to swim lanes in the architecture. By doing this, you essentially create mini-startups: teams that can make decisions themselves, deploy independently, and own their service from idea to production support.

I really enjoyed The Phoenix Project and I too highly recommend that you read it, especially if you are not familiar with the concepts mentioned above. As the authors are obviously talented writers and technologists, I would love to see them write another book that takes Parts Unlimited to the next level by creating autonomous teams. They hint at this concept when they state:

“You’ve helped me see that IT is not merely a department. Instead, it’s pervasive, like electricity. It’s a skill, like being able to read or do math. Here at Parts Unlimited, we don’t have a centralized reading or math department—we expect everyone we hire to have some mastery of it. Understanding what technology can and can’t do has become a core competency that every part of this business must have. If any of my business managers are leading a team or a project without that skill, they will fail.”

Hopefully this is foreshadowing of another book to come.

