May 10, 2018 | Posted By: Pete Ferguson
Three Reasons Your Software Engineers May Not Be Successful
At AKF Partners, we have the unique opportunity to see trends among startups and well-established companies through the dozens of technical due diligence reviews and more in-depth technology assessments we regularly perform, in addition to filling interim leadership roles within organizations. Because we often talk with a variety of folks – from the CEO and investors to business leadership and technical talent – we get a unique top-to-bottom perspective of an organization.
Three common observations
- People mostly identify with their job title, not the service they perform.
- Software Engineers can be siloed in their own code vs. contributing to the greater outcome.
- The CEO’s vision often differs from the frontline perception of things as they really are.
Job Titles Vs. Services
The programmer who identifies herself as “a search engineer” is likely not going to be as engaged as her counterpart who describes herself as someone who “helps improve our search platform for our customers.”
Shifting focus from a job title to a desired outcome is a best practice at top organizations. We like to describe this as separating nouns and verbs: “I am a software engineer” focuses on the noun, software engineer, without an action, whereas “I simplify search” focuses on the verb of the desired outcome – simplify. It may seem minor or trivial, but this shift can have a meaningful impact on how team members understand their contribution to your overall organization.
Removing this barrier to the customer puts team members on the front line of accountability to customer needs – and hopefully also to the vision and purpose of the company at large. In our experience with successful companies, instilling a customer-experience, outcome-based approach often requires reworking product teams. Creating a diverse product team (containing members of the Architecture, Product, QA, and Service teams, for example) that owns the outcomes of what it produces promotes:
- Creating products customers love
If you have driven a Ford vehicle with the first version of Sync (Bluetooth connectivity and onscreen menus), then you are well aware of the frustration of scrolling through three layers of menus to select “Bluetooth audio” ([Menu] -> [OK] -> [OK] -> [Down Arrow] -> [OK] -> [Down Arrow] -> [OK]) each time you get into your car. The novelty of wireless streaming was a key differentiator when Sync was first introduced – but it is now table stakes in the auto industry – and it quickly wears off when you must navigate a confusing UI, likely designed by product engineers each focused on a specific task but devoid of a vision for a great user experience. What was missing was someone with the vision and job description: “I design wireless streaming to be seamless and awesome – like a button that says ‘Bluetooth Audio!’”
Hire for – and encourage – people who believe and practice “my real job is to make things simple for our customers.”
Avoiding Siloed Approach
Creating great products requires engineers to look outside of their current project-specific tasks and focus on creating great customer experiences. Moving from reactively responding to customer-reported problems to proactively identifying issues with service delivery in real time goes well beyond just writing software. It moves to creating solutions.
Long gone are the “fire and forget” days of writing software, burning it to a CD, and pushing off tech debt until the next version. To Millennials, this Waterfall approach is foreign, but unfortunately we still see this mentality ingrained in many company cultures.
Today it is all about services. A release is one of many in a very long evolution of continual improvement and progression. There isn’t a Facebook V1 followed by V2 … there is a continual rolling out of upgrades and bug fixes, done in the background with minimal to no downtime. Engineers can’t afford to lag in their approach to continual evolution, addressing tech debt, and contributing to internal libraries for the greater good.
Ensure your technical team understands, and is very closely connected to, the evolving customer experience and has skin in the game. Among your customers, there is likely very little patience for “wait until our next release.” They expect immediate resolution or they will start shopping the competition.
Translating the Vision of the CEO to the Front Lines
During our more in-depth technology review engagements, we interview many people from different layers of management and different functions within the organization. This gives us a unique opportunity to see how the vision of the CEO migrates down through layers of management to the front-line programmers who are responsible for translating the vision into reality.
Usually – although not always – the larger the company, the larger the divide between what is being promised to investors/Wall Street and what is understood as the company vision by those who are actually doing the work. Best practices at larger companies include regular all-hands meetings where the CEO and other leaders share their vision and are held accountable to deliverables, along with leadership checks that the vision is conveyed in product roadmaps and daily stand-up meetings. When incentive plans focus directly on how well a team and individual understand and produce products that accomplish the company vision, communication gaps close considerably.
Creating and sustaining successful teams requires a diverse mix of individuals with a service mindset. This is why we stress that product teams need to be inclusive of multiple functions. Architecture, Product, Service, QA, Customer Service, Sales, and others need to be included in stand-up meetings and take ownership in the outcome of the product.
The Dev Team shouldn’t be the garbage disposal for what Sales has promised in the most recent contract or what other teams have ideated without giving much thought to how it will actually be implemented.
When your team understands the vision of the company – and how customers are interacting with the services of your company – they are in a much better position to translate it into reality.
As a CTO or CIO, it is your responsibility to ensure what is promised to Wall Street, private investors, and customers is translated correctly into the services you ultimately create, improve, and publish.
As we look at new start-ups facing explosive 100-200% year-over-year growth, our question is always “how will the current laser-focused vision and culture scale?” Standardization, good Agile practices, understanding technical debt, and creating a scalable onboarding and mentoring process all contribute to the best answers to this question.
When your development teams are each appropriately sized and include good representation of functional groups, and when each team member identifies with verbs vs. nouns (“I improve search” vs. “I’m a software engineer”) and understands how their efforts tie into company success, your opportunities for success, scalability, and adaptability are maximized.
Experiencing growing or scaling pains? AKF is here to help! We are an industry expert in technology scalability, due diligence, and helping to fill leadership gaps with interim CIO/CTO and other positions in addition to helping you in your search for technical leaders. Put our 200+ years of combined experience to work for you today!
May 6, 2018 | Posted By: Dave Berardi
Enabling TTM With Contributor Model Teams
We often speak about the benefits of aligning agile teams with the system’s architecture. As Conway’s Law describes, product/solution architectures and organizations cannot be developed in isolation. (See https://akfpartners.com/growth-blog/conways-law) Agile autonomous teams are able to act more efficiently, with faster time to market (TTM). Ideally, each team should be able to behave like a startup with the skills and tools needed to iterate until they reach the desired outcome.
Many of our clients are under pressure to both achieve effective TTM and reduce the risk of redundant services that produce the same results. During due diligence, we sometimes discover redundant services that individual teams have developed within their own silo for a TTM benefit. Rather than competing for priority and waiting for a shared services team to deliver code, a team will build its own flavor of a common service to get to market faster.
Instead, we recommend that a shared services team own common services. In this type of team alignment, the team owns a shared service or feature on which other autonomous teams depend. For example, many teams within a product may require email delivery as a feature. Asking each team to develop and operate its own email capability would be wasteful, with engineers designing redundant functionality that leads to cost inefficiencies and unneeded complexity. Rather than wasting time on duplicative services, we recommend that organizations create a team focused on email that other teams can use.
Teams make requests, in the form of stories for product enhancements, that are deposited in the shared services team’s backlog (email, in this case). To mitigate the risk of having each of these requesting teams wait for the shared services team to fulfill requests, we suggest treating the shared service as an open source project – or, as some call it, the contributor model.
Open sourcing the solution (at least internally) doesn’t mean opening up the email code base to all engineers and letting them have at it. It does mean establishing mechanisms to help control quality and design for the business. An open source project often has its own repo and typically only allows trusted engineers, called Committers, to commit. Committers follow Contribution Standards defined by the project-owning team. In our email example, the team should designate trusted and experienced engineers from other Agile teams who can code and commit to the email repo. Engineers on the email team can then focus on making sure new functionality aligns with the architectural and design principles that have been established. Code reviews are conducted before code is accepted. Allowing outside contribution helps mitigate the potential bottleneck such a team could create.
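As a concrete illustration of such a control, below is a minimal sketch (in Python, with invented user names, reviewer groups, and policy values – not a prescribed implementation) of the merge gate a contributor model implies: changes are accepted only from designated committers, and only with a review from the owning team.

```python
# Hypothetical merge gate for a shared-service repo under a contributor model.
# User names, reviewer groups, and the policy itself are invented for illustration.

DESIGNATED_COMMITTERS = {"asha", "miguel", "priya"}        # trusted engineers from other Agile teams
OWNING_TEAM_REVIEWERS = {"email-lead", "email-architect"}  # the shared email team

def may_merge(author: str, approvals: set) -> bool:
    """Accept a change only from a designated committer, and only after
    at least one review by the owning (email) team."""
    return author in DESIGNATED_COMMITTERS and bool(approvals & OWNING_TEAM_REVIEWERS)

assert may_merge("asha", {"email-architect"})
assert not may_merge("someone-else", {"email-architect"})  # not a designated committer
assert not may_merge("asha", {"miguel"})                   # no owning-team review
```

In practice this policy is usually enforced by repository tooling (protected branches and required reviews) rather than hand-rolled code, but the rule itself is this simple.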
Now that the development of email has been spread out across contributors on different teams, who really owns it?
Remember, ownership by many is ownership by none. In our example, the email team ultimately owns the service and code base. As other developers commit new code to the repo, the email team should conduct code, design, and architectural reviews, and should ultimately own deployments and operations. They should also confirm that contributions align with the strategic direction of the email mission. Whatever mechanisms are put in place, teams that adopt a contributor model should be a gas pedal, not a brake, for TTM.
If your organization needs help with building an Agile organization that can innovate and achieve competitive TTM, we would love to partner with you. Contact us for a free consultation.
May 4, 2018 | Posted By: Marty Abbott
Many of our clients use the term “Non-Functional Requirements” to group into a basket those portions of their solution that don’t easily fit into method or function-based execution of market needs. Examples of non-functional requirements often include things like “Availability”, “Scalability”, “Response Time”, and “Data Sovereignty” (as codified within requirements such as the GDPR). Very often, these “NFRs” are relegated to second-class citizens within the development lifecycle and, if lucky, end up as a technical debt item to be worked on later. More often than not, they are simply forgotten until major disaster strikes. This happens so often that we at AKF joke that “NFR” really stands for “No F-ing Resources (available to do the job)”.
While I believe that this relegation to second-class status is a violation of fiduciary responsibility, I completely understand how we’ve collectively gotten away with it for so long. For most of the history of our products, we’ve produced solutions that customers were responsible for running. We built the code and shipped it, and customers installed it and ran it on their systems.
Fortunately (for most of us), the world has changed to the SaaS model. As subscribers, we no longer bear the risk of running our own systems. Implementation is easier and faster, and the costs of running solutions are lower.
Unfortunately (for most of us), these changes also mean we are now wholly accountable for NFRs. We now mostly produce services, and we are accountable for their complete outcomes – including how our solutions run.
Most NFRs Are Table Stakes and Must-Haves
In the world of delivering services, most NFR capabilities are must-haves. SaaS companies provide a utility, and the customer expectation is that the utility will be available whenever the customer needs it. You simply do not have the option of punting attributes like availability or regulatory compliance to a later date.
The Absolute Value of Each NFR Your Product/Service Needs Varies
While we believe most NFRs are necessary and non-negotiable for playing in the SaaS space, the amount of each that you need varies with the portion of the market you are addressing within the Technology Adoption Lifecycle.
As you progress from left to right into a market, the NFR expectations of the adopters of your solution increase. Innovators care more about the differentiating capability you offer than they do about the availability of your solution. That said, they still need to be able to use your product, and they will stop using it or churn if it doesn’t meet their availability, response time, data sovereignty, and privacy needs. NFRs are still necessary – they just matter less to innovators than to later adopters.
At the other extreme, Late Majority and Laggard adopters care greatly about NFRs. Whereas Innovators may be willing to grudgingly live with 99.8% availability, the Late Majority will settle for nothing less than 99.95% or better.
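To make that difference concrete, here is the simple arithmetic behind those two numbers (a quick Python sketch):

```python
# Allowed downtime per year at each availability level.
HOURS_PER_YEAR = 24 * 365  # 8,760

for availability in (0.998, 0.9995):
    downtime_hours = HOURS_PER_YEAR * (1 - availability)
    print(f"{availability:.2%} available -> ~{downtime_hours:.1f} hours of downtime per year")

# 99.80% available -> ~17.5 hours of downtime per year
# 99.95% available -> ~4.4 hours of downtime per year
```

Roughly 13 hours of additional permissible downtime per year separates the expectations of the two groups.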
It’s Time to Eliminate the Phrase Non-Functional Requirement
We believe that even the name “Non-Functional Requirement” implies that necessary capabilities like availability and data sovereignty can somehow take a back seat to other activities within the solutions we create. At the very least, the term fails to denote the necessity (in some cases, the legal necessity) of these attributes.
We prefer names like “Table Stakes” or “Must Have Requirements” to NFRs.
It’s Also Time to Eliminate the Primary Cause
While we sometimes find that teams simply haven’t changed their mindset to properly understand that Table Stakes aren’t optional investments, we more often find a more insidious cause: Moral Hazards. Moral hazards exist when one person makes a decision for which another must bear the cost: Person A decides to smoke, but Person B bears the risk of cancer instead of Person A.
Commonly, we see product managers with ownership over product decisions but no accountability for Table Stakes like availability, response time, cost effectiveness, and security. The problem is that, as we’ve described, Table Stakes form the foundation of the Maslow’s hierarchy of needs for online products. Engineering teams and product teams should jointly own all the attributes of the products they co-create. Doing so will help fix the flawed notion that Table Stakes can be deferred.
AKF Partners helps clients build highly available, scalable, fast response time solutions that meet the needs of the portion of the Technology Adoption Lifecycle they are addressing.
May 3, 2018 | Posted By: Marty Abbott
Scientists, Engineers, and Technicians
What is, or perhaps should be, the difference between a Computer Scientist, Data Scientist, Software Engineer, and Programmer? What should your expectations be of each? How should they work together?
To answer these questions, we’ll look at the differences between scientists, engineers, and technicians in more mature disciplines, apply them to our domain, and offer suggestions as to expectations.
Science and Scientists
The primary purpose of science is “to know”. Knowing, or the creation of knowledge, is enabled through discovery and the practice of the scientific method. Scientists seek to know “why” something is and “how” that something works. Once this understanding of “why and how” is generally accepted, it is ideally codified within theories that are continually tested for validity over time.
To be successful, scientists must practice both induction and deduction. Induction seeks to find relationships for the purposes of forming hypotheses. Deduction seeks to test those hypotheses for validity.
In the physical world (or in physical, non-biological, and non-behavioral sciences), scientists are most often physicists and chemists. They create the knowledge, relationships, and theories upon which aerospace, chemical, civil, electrical, mechanical, and nuclear engineers rely.
For our domain, scientists are mathematicians, computer scientists and – more recently – data scientists. Each seeks to find relationships and approaches that engineers and technicians can apply for business purposes. The scientist “finds”; the engineer “applies”. Perhaps the most interesting new field here is data science. Whereas most science is focused on broad discovery, data science is useful within a business because it is meant to find relationships between various independent variables and desirable outcomes (dependent variables). These may serve general business insights, or something specific, like understanding which items (or SKUs) we should display to different individuals to influence purchase decisions.
True scientists tend to have doctorates, the doctoral degree being the “certification” that one knows how to properly perform research. There are of course many examples of scientists without doctoral degrees – but these days that is rare in any field other than data science (where the stakes are typically lower).
Engineering and Engineers
The primary purpose of engineering is to “create” or to “do”. Engineers start with an understanding (created by scientists) of “why” things work and “how” they work (scientific theories – often incorrectly called “laws” by engineers) and apply it to create complex solutions or products. Mechanical engineers rely on classical physics; electrical engineers rely on modern physics. Understanding both the “why” and the “how” is important for creating highly reliable solutions, especially in new or unique situations. For instance, it is important for electrical engineers to understand field generation in micro-circuitry and how those fields will affect the operation of the device in question. Civil and mechanical engineers need to understand the notion of harmonic resonance in order to avoid disasters like the Tacoma Narrows bridge failure.
The domain of “software engineering” is much more confusing. Unlike traditional engineering domains, software engineering is ill-defined and suffers from overuse of the title in practice. If such a domain truly existed, it should follow the models of other engineering disciplines. As such, we should expect software engineers to have a deep understanding of the “whys and hows” derived from the appropriate sciences: computer science and mathematics. For instance, computer scientists would identify and derive interesting algorithms and suggest applications, whereas engineers would determine how and when to apply those algorithms for maximum effect. Computer scientists identify unique scenarios (e.g. the dining philosophers problem) that broadly define a class of problems (mutual exclusion in concurrency) and suggest approaches to solve them. Engineers put those approaches into practice with the constraints of the system they are developing in mind.
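To make that division of labor concrete, here is a minimal sketch (Python, our own illustration rather than anything canonical) of an engineer applying the scientist’s prescription for the dining philosophers problem – the resource-hierarchy fix, in which locks are always acquired in a single global order so no deadlock cycle can form:

```python
# Dining philosophers with the resource-hierarchy fix: each philosopher
# acquires the lower-numbered fork first, so a circular wait (deadlock)
# can never form.
import threading

N = 5
forks = [threading.Lock() for _ in range(N)]

def dine(philosopher: int) -> None:
    left, right = philosopher, (philosopher + 1) % N
    first, second = sorted((left, right))  # global lock ordering
    with forks[first]:
        with forks[second]:
            print(f"philosopher {philosopher} eats")

threads = [threading.Thread(target=dine, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```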
Following this logic, we should also expect our engineers to understand how the systems upon which they apply their trade (computers) truly function – the relationship between processors, memory, storage, etc. Further, they should understand how an operating system works, how interpreters and compilers work, and how networks work. They should be able to apply all these things to come up with simple designs that limit the blast radius of failure, ensure low latency response, and are cost effective to produce and maintain. They should understand the depth of components necessary to deliver a service to a customer – not just the solitary component upon which they work (the code itself).
Very often, to be a successful engineer, one must have at least a bachelor’s degree in an engineering domain. The degree at least indicates that one has proven to some certifying body an understanding of the math, the theories, and the application of those theories to the domain in question. Interestingly, as a further example of the difference between engineering and science, some countries issue engineering degrees under the title “Bachelor of Applied Science”.
There are of course many famous examples of engineers without degrees – for instance, the Wright Brothers. The true test isn’t the degree itself, but whether someone understands the depth and breadth of math and science necessary to apply science in highly complex situations. Degrees are simply one way of ensuring an individual has at least once proven an understanding of the domain.
Practical Applications and Technicians
Not everything we produce requires an engineer’s deep understanding of both why and how something works. Sometimes, the application of a high-level understanding of “how” is sufficient. As such, in many domains technicians augment engineers. These technicians are responsible for creating solutions out of reusable building blocks and a basic understanding of how things work. Electricians, for instance, are technicians within the domain of electrical engineering. Plumbing is a very specific application of civil engineering. HVAC technicians apply fluid mechanics from mechanical engineering.
These trades take a very technical skill, implementation specific tradecraft, and a set of heuristics to design, implement, and troubleshoot systems that are created time and time again. Electricians, for instance, design the power infrastructure of homes and offices – potentially reviewed by an electrical engineer. They then implement the design (wire the building) and are also responsible for troubleshooting any flaws. The same is true for HVAC technicians and plumbers.
Programmers are the technicians for software engineers (the engineering domain) and computer and data scientists (the science domain). Not everything we develop needs a “true engineer”. Very often, as is the case with wiring a house, the solution is straightforward and can be accomplished with the toolset one gains from several weeks of training. The key difference between an engineer and a programmer is again the depth and breadth of knowledge.
Technicians are trained either through apprenticeships or through trade schools; electricians can come from either path. Broadly speaking, it makes sense to use technicians rather than engineers for at least two reasons:
- Some things don’t require an engineer, and the cost of an engineer makes no sense for these applications.
- The supply of engineers is very low relative to demand within the US across all domains. During the Great Recession, engineers were one of the only disciplines at or near full employment – a clear indication of the supply/demand imbalance.
Implications and Takeaways
Several best practices follow the preceding definitions and roles:
- Don’t mix science and engineering teams or goals: Science, and its approach to answering questions using the scientific method, is a different animal requiring different skills and different expectations than engineering. It is difficult to apply time constraints to the process of scientific discovery; sometimes answers exist and are easy to find, sometimes they are hard to find, and sometimes they simply do not exist. If you want effective analytics efforts, choose people trained to approach your Big Data needs appropriately, and put them in an organization dedicated to “science” activities (discovery). They may need to be paired with programmers or engineers to accomplish their work – but they should not be confused with a typical product team. “Ah-ha!” moments are something you must allocate time against – but not something for which you can expect an answer in a defined time interval.
- Find the right ratio of engineers to programmers: Most companies don’t need all their technical folks to be engineers. Frankly, we don’t “mint” enough engineers anyway – roughly 80K to 100K per year since 1945 with roughly 18K of those being Computer Science graduates who go on to become “engineers” in practice. Augment your teams with folks who have attended trade schools or specialized boot camps to learn how to program.
- Ensure you hire capable engineers: You should both pay more for and expect more from the folks on your team who are doing engineering related tasks. Do not allow them to approach solutions with just a “software” focus; expect them to understand how everything works together to ensure that you build the most reliable services possible.
May 2, 2018 | Posted By: AKF
Our post on the AKF Scale Cube made reference to a concept that we call “Fault Isolation” and sometimes “Swim lanes” or “Swim-laned Architectures”. We sometimes also call “swim lanes” fault isolation zones or fault isolated architecture.
Fault Isolation Defined
A “swim lane” or fault isolation zone is a failure domain: a group of services within a boundary such that any failure within that boundary is contained and does not propagate to or affect services outside of said boundary. Think of this as the “blast radius” of a failure, answering the question “What gets impacted should any service fail?” The benefit of fault isolation is twofold:
1) Fault Detection: Given a granular enough approach, the component of availability associated with the time to identify the failure is significantly reduced. This is because all effort to find the root cause or failed component is isolated to the section of the product or platform associated with the failure domain. Once something breaks, because the failure is limited in scope, it can be more rapidly identified and fixed. Recovery time objectives (RTO) are subsequently decreased which increases overall availability.
2) Fault Isolation: As stated previously, the failure does not propagate or cause a deterioration of other services within the platform. The “blast radius” of a failure is contained. As such, and depending upon approach, only a portion of users or a portion of functionality of the product is affected. This is akin to circuit breakers in your house - the breaker exists to limit the fault zone for any load that exceeds a limit imposed by the breaker. Failure propagation is contained by the breaker popping and other devices are not affected.
Architecting Fault Isolation
A fault isolated architecture is one in which each failure domain is completely isolated. We use the term “swim lanes” to depict the separations. Ideally, there are no synchronous calls between swim lanes or failure domains made pursuant to a user request. User-initiated synchronous calls between failure domains are absolutely forbidden in this type of architecture: any such call, even with appropriate timeout and detection mechanisms, is very likely to cause a cascading series of failures across other domains. Strictly speaking, you do not have a failure domain if that domain makes a synchronous call to any service in another domain or outside the domain, or if it receives synchronous calls from other domains or services. Again, “synchronous” here means a call-and-wait-for-response made pursuant to a user request.
It is acceptable, though not advisable, to have asynchronous calls between domains and to have non-user-initiated synchronous calls between domains (as in the case of a batch job collecting data for reporting in another failure domain). If such communication is necessary, it is very important to include failure detection and timeouts, even with asynchronous calls, to ensure that retries do not cause port overloads on any services. Here is an interesting blog post about runaway scripts and their impact on Apache, PHP, and MySQL.
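As a sketch of that guidance (the names, timeout, and retry budget below are our own assumptions, not a prescribed implementation): wrap the cross-domain call in a timeout for failure detection, cap retries with backoff, and degrade gracefully rather than letting retries pile up against a struggling service.

```python
# Failure detection (timeout) plus a bounded retry budget for a
# non-user-initiated cross-domain call. All names are hypothetical.
import concurrent.futures
import time

def call_other_domain(payload):
    """Stand-in for a call into another failure domain (e.g. a reporting batch pull)."""
    time.sleep(0.1)
    return {"ok": True, "payload": payload}

def guarded_call(payload, timeout_s=0.5, max_retries=2):
    with concurrent.futures.ThreadPoolExecutor(max_workers=1 + max_retries) as pool:
        for attempt in range(1 + max_retries):
            future = pool.submit(call_other_domain, payload)
            try:
                return future.result(timeout=timeout_s)  # failure detection
            except concurrent.futures.TimeoutError:
                time.sleep(0.2 * (2 ** attempt))         # backoff between retries
    return None  # retry budget exhausted: degrade gracefully, don't cascade

print(guarded_call({"job": "nightly-report"}))
```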
As previously indicated, a swim lane should have all of its services located within the failure domain. For instance, if database reads and writes are necessary, the database with all appropriate information for that swim lane should exist within the same failure domain as all of the application and web servers necessary to perform the function or functions of the swim lane. Furthermore, that database should not serve requests from other swim lanes. Our rule is one production database on one host.
The figure below demonstrates the components of software and infrastructure that are typically fault isolated:
Rarely are shared higher level network components isolated (e.g. border systems and core routers).
Sometimes, if practical, firewalls and load balancers are isolated. This is especially the case in very high demand situations where a single pair of devices simply wouldn’t meet the demand.
The remaining components are always isolated, with web servers, top-of-rack switches (in non-IaaS implementations), compute (app servers), and storage all properly isolated.
Applying Fault Isolation with AKF’s Scale Cube
As we have indicated with our Scale Cube in the past, there are many ways in which to think about swim laned architectures. Swim lanes can be isolated along the axes of the Scale Cube as shown below with AKF’s circuit breaker analogy to fault isolation.
Fault isolation on the X-axis means replicating everything for high availability – performing the replication asynchronously and in an eventually consistent (rather than strongly consistent) fashion. For example, when a data center fails, the fault is isolated to the one failed data center or availability zone, while replicas elsewhere continue to serve. This is common with traditional disaster recovery approaches, though we do not often advise it, as there are better and more cost-effective solutions for recovering from disaster.
Fault isolation on the Y-axis can be thought of in terms of a separation of services – e.g. “login” and “shopping cart” (two separate swim lanes) – with each having the web and app servers as well as all data stores located within the swim lane and answering only to systems within that swim lane. Each portion of a page is delivered from a separate service, reducing the blast radius of a potential fault to its swim lane.
While purposely not legible (fuzzy), the example above shows different components of a fictional business account from a fictional bank. Components of the page are separated, with one component showing a summary, another displaying more detailed information, and still others showing dynamic or static links – each derived from properly isolated services.
Another approach is to separate your customer base, your order numbers, or your product catalog. Assuming an indiscriminate function performs this separation (like a modulus of id), such a split would be a Z-axis swim lane along customer, order number, or product id lines; a sketch of this is shown below. More beneficially, if we are interested in the fastest possible response times to customers, we may split along geographic boundaries. We may have data centers (or IaaS regions) serving the West and East Coasts of the US, the “fly-over states” of the US, and regions serving the EU, Canada, Asia, etc. Besides contributing to faster perceived customer response times, these implementations can also help ensure we are compliant with data sovereignty laws unique to different countries or even states within the US.
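A minimal sketch of the indiscriminate (modulus) version of such a split, with invented pod names:

```python
# Z-axis swim lanes: route each customer to a pod by a modulus of customer_id.
# Pod names are invented; a failure in one pod affects only ~1/N of customers.
PODS = ["pod-0", "pod-1", "pod-2", "pod-3"]

def pod_for(customer_id: int) -> str:
    return PODS[customer_id % len(PODS)]

print(pod_for(1234))  # pod-2: only this pod's customers feel a pod-2 failure
```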
Combining the concepts of service and database separation into several fault-isolative failure domains creates a scalable and highly available platform. AKF has helped many companies achieve high availability through fault isolation. Contact us to see how we can help you achieve the same fault tolerance.
AKF Partners helps companies create highly available, fault isolated solutions. Send us a note - we’d love to help you!
May 2, 2018 | Posted By: Marty Abbott
The rate of involuntary turnover for the senior executive running technology and engineering in a company is unfortunately high, with most senior technology executives lasting less than four years in a job. Our experience over the last 12 years, and with nearly 500 companies, indicates that nearly 50% of senior technology executives are at-risk with their peers around the CEO’s management team table.
The reasons why CEOs and their teams are concerned about their senior technology executive vary, but the causes of concern cluster around five primary themes:
- Lack of Business Acumen – Doing the Wrong Things
- Failure to Lead and Inspire
- Failure to Manage and Execute
- Trapped in Today – Failure to Plan for Tomorrow
- Lack of Technical Knowledge - Doing Things the Wrong Way
Before digging into each of these, I believe it is informative to first investigate the sources (backgrounds) of most senior technology executives.
CTO and CIO Backgrounds
Within our dataset, there are 2 primary backgrounds from which most senior technology executives come:
1. Raised through the Technical Ranks
These executives have spent a majority of their working career in some element of product engineering or IT. They are very often promoted based on technical acumen, and a perception of being able to “get things done”. Often, they are technically gifted folks with technical or engineering undergraduate degrees, and sometimes they have a great deal of project management experience.
2. Co-Opted from the Business
These executives have spent a majority of their working career in a function other than technology. They may have been raised through marketing, sales or finance and have demonstrated an affinity for technology and some high-level understanding of its application. They rarely have a deep technical understanding and have done very little hands-on work within technology or engineering.
We see both backgrounds struggling with the chief technology role. Sometimes they fail for similar reasons, but often for very different reasons within our five primary causes.
Lack of Business Acumen – Doing the Wrong Things
This is by far the most common reason for failure among chief technologists “raised through the technical ranks”. On the flip side, in our experience it is rarely the reason executives “co-opted from the business” find themselves in trouble.
The technologist’s peers often complain that they do not trust the executive in question, and they often cite a failure to present opportunities, fixes, and projects in “business terms”. Put frankly, a lack of understanding of business in general, and an inability (or lack of desire) to justify actions in meaningful business terms, causes a lack of trust with the CIO/CTO’s peers.
Further, CTOs sometimes focus overly on what is “cool” rather than on the things that create significant business and stakeholder value. Business peers complain that money and headcount are being spent without a justifiable return. An oft-heard quote is “We don’t get why there are so many people working on things that don’t drive revenue.”
The fix for this is easy. Executives raised through the technology ranks should seek education and training in the fundamentals and language of business. A great way to do this is for the tech exec to get an MBA or attend an abbreviated MBA-like program such as those offered by Harvard Business School or Stanford. Many business courses focused on cost justification and the business financial statements are also available from online programs.
Failure to Lead and Inspire
This failure is most common for executives “raised through the technical ranks”. Comments from the CTO’s peers and CEO indicate that the executive struggles with creating a strategy that clearly supports the needs of the business. Further, individual contributors within the organization appear disconnected from the mission of the business and disengaged at work. Often, employees within such an organization will complain that the CIO/CTO requires that all major decisions go through her. Such employees and organizations often test low on work engagement and overall morale.
The cause of this failure is often, again, the result of promoting a person solely on her technical capabilities. While there are many great technology leaders, leadership and technical acumen have very little in common. Some folks can be trained to appreciate the value of, and need for, a compelling vision and to be inspirational – but alas, some cannot.
The fix is to attempt training, but where the executive does not show the willingness or capability, a replacement may be necessary. Sometimes we find that mentoring from a former or current successful CIO/CTO helps. Coaching from a professional coach or leadership professional may also be helpful.
Failure to Manage and Execute
This failure is common for both those raised through the technical ranks, and those co-opted from the business. At the very heart of this failure is a perceived lack of execution on the part of the executive – specifically in getting things done. Often the cause is a lack of communication. Sometimes the cause is poor project management capabilities within the organization or a lack of management acumen within the organization.
For executives co-opted from the business, we find that they sometimes struggle in communicating effectively with the technical members of their team and as such don’t have a clear understanding of status. They may also struggle with the right questions to ask to either probe for current status or to help keep their teams on track.
Where the problem is a lack of technical understanding by non-technical executives running tech functions, the fix is to augment them with strong technical people who also speak the language of business (similar to the fix for the lack of business acumen). For the other failures, the fix is to build appropriate project management and oversight into product delivery. This may mean adding skills to the team, or it may mean infusing an element of valuing execution into the technology organization.
Trapped in Today – Failure to Plan for Tomorrow
This type of failure is common to executives of both backgrounds. In fact, the problem is common to leaders in all positions requiring a mix of operational focus and forward-looking strategy development. The needs of running the day-to-day business often bias the executive toward what is necessary to make the day, month, or quarter at the expense of what needs to happen for the future. As such, fast-moving secular trends (past examples include IaaS, Agile, NoSQL, and mobile adoption) get ignored today and become crises tomorrow.
There are many potential fixes for this, including ensuring that executives budget their time to create “space” for future planning. Organizationally, in larger companies we may add someone to focus on either operational needs or the needs of tomorrow. Additionally, we can bring in outside help and perspectives, either in the form of new hires or consultants who have experience with emerging trends and technologies.
Lack of Technical Knowledge
This failure is owned almost entirely by folks with a non-technology background who find themselves running technical organizations. It manifests itself in multiple ways, including not understanding what technologists are saying, not knowing the right questions to ask, and not fully understanding the ramifications of various technical decisions. While a horrible position to be in, it is no more or less disastrous than lacking business acumen. In either scenario, we have an imperfect matching of an engine and the transmission necessary to gain benefit from that engine.
Unfortunately, this one is the hardest of all problems to fix. Whereas it is comparatively easy to learn to speak the language of business, the breadth and depth of understanding necessary to properly run a technology organization is not easily acquired. One need not be the best engineer to be successful, but one should certainly understand “how and why things work” to be able to properly evaluate risk and ask the right questions.
The fix is the mirror image of the fix for business acumen. No technology executive should be without deep technical understanding AND a clear understanding of business fundamentals.
Perhaps the most important point to learn from our experience in this area is to ensure that, regardless of your CTO/CIO’s background, he or she has both the technical and business skills necessary to do the job. Identify weaknesses through interviews and daily interactions, and help the executive shore up those weaknesses through skill augmentation within the organization or through education, mentoring, and coaching.
AKF Partners provides mentoring and coaching for both CTOs and CEOs to help ensure a successful partnership between these two key creators of company value. If your CEO or CTO has departed, we also provide interim leadership services.
April 27, 2018 | Posted By: Dave Swenson
Agile Software Development is a widely adopted methodology, and for good reason. When implemented properly, Agile can bring tremendous efficiencies, enabling your teams to move at their own pace, bringing your engineers closer to your customers, and delivering customer value more quickly with less risk. Yet many companies fall short of realizing the full potential of Agile, treating it merely as a project management paradigm by picking and choosing a few Agile structural elements, such as standups or retrospectives, without actually changing the manner in which product delivery occurs. Managers in an Agile culture often forget that they are indeed still managers who need to measure and drive improvements across teams.
All too often, Agile is treated solely as an SDLC (Software Development Lifecycle), focused only upon the manner in which software is developed versus a PDLC (Product Development Lifecycle) that leads to incremental product discovery and spans the entire company, not just the Engineering department.
Here are the five most common Agile failures that we see with our clients:
- Technology Executives Abdicate Responsibility for their Team’s Effectiveness
Management in an Agile organization is certainly different than in, say, a Waterfall-driven one. More autonomy is given to Agile teams. Leadership within each team typically comes without a ‘Manager’ title. Often, this shift from a top-down, autocratic, “do it this way” approach to a grass-roots, bottom-up one swings well beyond desired autonomy toward anarchy, where teams have been given full freedom to pick their technologies, architecture, and even outcomes with no guardrails or constraints in place. See our Autonomy and Anarchy article for more on this.
Executives often become focused solely on the removal of barriers the team calls out, rather than leading teams towards desired outcomes. They forget that their primary role in the company isn’t to keep their teams happy and content, but instead to ensure their teams are effectively achieving desired business-related outcomes.
The Agile technology executive is still responsible for her teams’ effectiveness in reaching specified outcomes (e.g.: achieve 2% lift in metric Y). She can allow a team to determine how best to reach the outcome, within shared standards (e.g.: unit tests must be created, code reviews are required). She can encourage teams to experiment with new technologies on a limited basis, then apply those learnings and best practices across all teams. She must be able to compare the productivity and efficiency of one team with another, ensuring all teams are reaching their full potential.
- No Metrics Are Used
The age-old saying “If you can’t measure it, you can’t improve it” still applies in an Agile organization. Yet Agile teams frequently drop this basic tenet, perhaps believing that teams are self-aware and critical enough to know where improvements are required. Unfortunately, even the most transparent and self-aware individuals are biased, fall back on subjective characterizations (“The team is really working hard”), and need the grounding that quantifiable metrics provide. We are continually surprised at how many companies aren’t even measuring velocity – not necessarily to compare one team with another, but to compare a team’s sprint output with its prior ones. Other metrics still applicable in an Agile world include quality, estimation accuracy, predictability, percent of time spent coding, and the ratio of enhancements vs. maintenance vs. tech debt paydown.
These metrics, their definitions and the means of measuring them should be standardized across the organization, with regular focus on results vs. desired goals. They should be designed to reveal structural hazards that are impeding team performance as well as best practices that should be adopted by all teams.
- Your Velocity is a Lie
Is your definition of velocity an honest one? Does it truly measure outcomes, or only effort? Are you consistent with your definition of ‘done’? Take a good look at how your teams define and measure velocity. Is velocity only counted for truly ‘ready to release’ stories? If QA hasn’t been completed within a sprint, are the associated velocity points still counted, or are they deferred?
Velocity should not be a measurement of how hard your teams are working, but instead an indicator of whether outcomes (again, e.g.: achieve 2% lift in metric Y) are likely to be realized - take credit for completion only when in the hands of customers.
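A tiny sketch of the difference (story data invented for illustration): velocity counted on effort expended versus velocity counted only on work actually in customers’ hands.

```python
# "Claimed" velocity counts all points worked; "honest" velocity counts only
# stories released to customers.
stories = [
    {"points": 5, "status": "released"},
    {"points": 3, "status": "qa"},           # QA incomplete: defer these points
    {"points": 8, "status": "released"},
    {"points": 2, "status": "in_progress"},
]

claimed = sum(s["points"] for s in stories)                              # 18
honest = sum(s["points"] for s in stories if s["status"] == "released")  # 13
print(f"claimed velocity: {claimed}, honest velocity: {honest}")
```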
- Failure to Leverage Agile for Product Discovery
From the Agile Manifesto: “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.” Many companies work hard to get an Agile structure and its artifacts in place but ignore the biggest benefit Agile can bring: iterative and continuous product discovery. Don’t break a six-month waterfall project plan into two-week sprints with standups and velocity measurements and declare Agile victory.
Work to create and deliver MVPs to your customers that allow you to test expected value and customer satisfaction without huge investment.
- Treating Agile as an SDLC vs. a PDLC
As explained in our article PDLC or SDLC, the SDLC (Software Development Lifecycle) lives within the PDLC (Product Development Lifecycle). Again, Agile should not be treated as a project management methodology, nor merely as a means of developing software. It should focus on your product, and on the customer success your product provides. This means that Agile should permeate well beyond your developers to include product and business personnel.
Business owners or their delegates (product owners) must be involved at every step of the PDLC process. POs need to be embedded within each Agile team, ideally colocated alongside team members. To provide product focus, POs should first bring the team the targeted customer problem to be solved, rather than dictating only a solution, and then work together with the team to implement the most effective solution to that problem.
AKF Partners helps companies transition to Agile as well as fine-tune their existing Agile processes. We can readily assess your PDLC, organization structure, metrics and personnel to provide a roadmap for you to reach the full value and benefits Agile can provide. Contact us to discuss how we can help.
April 25, 2018 | Posted By: Robin McGlothin
The Scale Cube is a model for defining microservices and scaling technology products. AKF Partners invented the Scale Cube in 2007, publishing it online in our blog (original article here) and subsequently in our first book, The Art of Scalability, and our second book, Scalability Rules.
The Scale Cube (sometimes known as the “AKF Scale Cube” or “AKF Cube”) comprises 3 axes:
• X-Axis: Horizontal Duplication and Cloning of services and data
• Y-Axis: Functional Decomposition and Segmentation - Microservices (or micro-services)
• Z-Axis: Service and Data Partitioning along Customer Boundaries - Shards/Pods
These axes and their meanings are depicted below in Figure 1.
The Scale Cube helps teams keep critical dimensions of system scale in mind when solutions are designed and when existing systems are being improved.
Figure 2, below, displays how the cube may be deployed in a modern architecture: decomposing services (sometimes called a microservices architecture), cloning services and data sources, and segmenting similar objects like customers into “pods”.
Scaling with the X Axis of the Scale Cube
The most commonly used approach to scaling a solution is to run multiple identical copies of the application behind a load balancer, known as X-axis scaling. This is a great way to improve the capacity and availability of an application.
With X-axis scaling, each server runs an identical copy of the service (if disaggregated) or monolith. One benefit of the X-axis is that it is typically intellectually easy to implement and scales well from a transaction perspective. Impediments to implementing the X-axis include heavy session-related information, which is often difficult to distribute or requires persistence to servers – both of which can cause availability and scalability problems. A comparative drawback of the X-axis is that, while intellectually easy to implement, data sets have to be replicated in their entirety, which increases operational costs. Further, caching tends to degrade at many levels as the size of data increases with transaction volumes. Finally, the X-axis doesn’t engender higher levels of organizational scale.
Figure 3 explains the pros and cons of X axis scalability, and walks through a traditional 3 tier architecture to explain how it is implemented.
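A minimal sketch of the X-axis principle follows (server names invented; in production the dispatcher is a dedicated load balancer, not application code):

```python
# X-axis scaling: identical clones behind a round-robin dispatcher.
import itertools

replicas = itertools.cycle(["app-1", "app-2", "app-3"])  # identical copies

def handle(request_id: int) -> str:
    server = next(replicas)  # any clone can serve any request (if stateless)
    return f"request {request_id} -> {server}"

for i in range(5):
    print(handle(i))
# Heavy session state or sticky sessions break this clean distribution,
# which is the X-axis impediment described above.
```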
Scaling with the Y Axis of the Scale Cube
Y-axis scaling (think service-oriented architecture, microservices, or functional decomposition of a monolith) focuses on separating services and data along noun or verb boundaries. These splits are “dissimilar” from each other. Examples in commerce solutions include splitting search from browse, checkout from add-to-cart, and login from account status. In implementing splits, Y-axis scaling splits a monolithic application into a set of services. Each service implements a set of related functionality such as order management, customer management, or inventory. Further, each service should have its own, non-shared data to ensure high availability and fault isolation. Y-axis scaling shares with all the axes of the cube the benefit of increased transaction scalability.
Further, because the Y-axis allows segmentation of teams and ownership of code and data, organizational scalability is increased. Cache hit ratios should increase as data and services are appropriately segmented and similarly sized memory spaces can be allocated to smaller data sets accessed by comparatively fewer transactions. Operational costs are often reduced, as systems can be sized down to commodity servers or smaller IaaS instances.
Figure 4 explains the pros and cons of Y axis scalability and shows a fault-isolated example of services each of which has its own data store for the purposes of fault-isolation.
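A sketch of the Y-axis idea (endpoints and data store names invented): each verb/noun split is its own service, owning its own non-shared data store.

```python
# Y-axis scaling: dissimilar functions split into separate services, each
# with its own non-shared data store. All names are hypothetical.
SERVICES = {
    "search":   {"endpoint": "search.internal",   "datastore": "search-db"},
    "checkout": {"endpoint": "checkout.internal", "datastore": "checkout-db"},
    "login":    {"endpoint": "login.internal",    "datastore": "login-db"},
}

def route(function: str) -> str:
    svc = SERVICES[function]
    return f"{function} -> {svc['endpoint']} (data in {svc['datastore']})"

print(route("checkout"))  # a checkout fault cannot take down search or login
```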
Scaling with the Z Axis of the Scale Cube
Whereas the Y-axis addresses the splitting of dissimilar things (often along noun or verb boundaries), the Z-axis addresses segmentation of “similar” things. Examples include splitting customers along an unbiased modulus of customer_id, or along a somewhat biased (but beneficial for response time) geographic boundary. Product catalogs may be split by SKU, and content may be split by content_id. Z-axis scaling, like all of the axes, improves the solution’s transactional scalability and, if fault isolated, its availability. Because the software deployed to servers is essentially the same in each Z-axis shard (but the data is distinct), there is no increase in organizational scalability. Cache hit rates often go up with smaller data sets, and operational costs generally go down as commodity servers or smaller IaaS instances can be used.
Figure 5 explains the pros and cons of Z-axis scalability and displays a fault-isolated pod structure with 2 unique customer pods in the US and 2 within the EU. Note that an additional benefit of Z-axis scale is the ability to segment pods to be consistent with local privacy laws such as the EU’s GDPR.
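And a sketch of the geographically biased variant depicted in Figure 5 (region names and pod assignments invented), which keeps EU customer data within EU pods:

```python
# Z-axis split along geographic boundaries: EU customers are pinned to EU
# pods, supporting data sovereignty rules such as the GDPR.
PODS_BY_REGION = {
    "us-east": "pod-us-1",
    "us-west": "pod-us-2",
    "eu":      "pod-eu-1",  # EU customer data stays in the EU pod
}

def pod_for_region(region: str) -> str:
    return PODS_BY_REGION[region]

print(pod_for_region("eu"))  # pod-eu-1
```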
Like Goldilocks and the three bears, the goal of decomposition is not to have services that are too small or too large, but to ensure that the system is “just right” along the dimensions of scale, cost, availability, time to market, and response times.
AKF Partners has helped hundreds of companies, big and small, employ the AKF Scale Cube to scale their technology product solutions. We developed the cube in 2007 to help clients scale their products and have been using it since to help some of the greatest online brands of all time thrive and succeed. For those interested in “time travel”, here are the original 2 posts on the cube from 2007: Application Cube, Database Cube
April 24, 2018 | Posted By: Marty Abbott
We can tell a lot about a company within the first hour or so of any discussion. Consider the following statement fragments:
“We have a lot of smart people here…”
“We have some of the hardest working engineers…”
Contrast these with the following statement fragments:
“We measure ourselves against very specific business KPIs…”
“We win or lose daily based on how effectively we meet our customer expectations…”
There is a meaningful difference in the impact these two groups of statements have on a company’s culture, and in how that culture enables or stands in the way of success. The first group of statements is associated with independent variables (inputs) that will rarely, in isolation, result in desired outcomes (dependent variables). When used in daily discourse, they reinforce the notion that something other than outcomes is what the company prides itself on. Our experience is that these statements create an environment of hubris that often runs perpendicular to – and at best in no way reinforces – the achievement of results. Put another way, when we hear statements like these, we expect to find many operational problems.
The second group of statements is focused on meaningful and measurable outcomes. The companies we have worked with that frequently communicate in these terms are among the most successful we’ve seen. Even when these companies struggle, their effort and focus are solidly behind the things that matter – the things that create value for the broadest swath of stakeholders possible.
The point here is that how we focus communication inside our companies has an important impact on the outcomes we achieve.
Success is often a result of several independent variables aligning to achieve a desired outcome. These may include, as Jim Collins points out, being in the right place at the right time – sometimes called “luck”. Further, there is rarely a single guaranteed path to success; multiple paths may result in varying levels of the desired outcome. Great companies and great leaders realize this, and rather than focusing a culture on independent variables they focus teams on outcomes. By focusing on outcomes, leaders are free to attempt multiple approaches and to tweak a variety of independent variables to find the most expedient path to success. We created the AKF Equation (we sometimes refer to it as the AKF Law) to help focus our clients on outcomes first:
Two very important corollaries follow from this equation or “law”:
Examples of Why Results, and Not Paths Matter
Intelligence Does Not Equal Success
As an example of why the dependent variable of results – and not an independent variable like intelligence – matters most, consider Duckworth and Seligman’s research. Duckworth, Seligman, and associates conducted a review of GPA performance in adolescents. They expected to find that intelligence was the best predictor of GPA. Instead, they found that self-discipline was a better predictor of the best GPAs.
Lewis Terman, a mid-20th-century Stanford psychology professor, hypothesized that IQ was highly correlated with success in his famous “Termite” study of 1,500 students with an average IQ of 151. Follow-on analysis and study indicated that while these students were successful, they were half as successful as a group of other students with lower IQs.
Chris Langan, the self-proclaimed “world’s most intelligent man” with an IQ of 195, can’t seem to keep a job, according to Malcolm Gladwell. He has been a cowboy, a stripper, and a day laborer, and has competed on various game shows.
While we’d all like to have people of average or better intelligence on our teams, the above clearly indicates that it’s more important to focus on outcomes than on an independent variable like intelligence.
Drive Does Not Equal Success
While most successful Silicon Valley companies started with employees who worked around the clock, the Valley is also littered with the corpses of companies that worked their employees to the bone. Just ask former employees of the failed social networking company Friendster. Hard work alone does not guarantee success. In fact, most “overnight” successes appear to take about 10,000 hours of practice just to get good, and 10 years of serious work to be truly successful (according to both Ramit Sethi and Malcolm Gladwell).
Hard work, if applied in the right direction and to the right activities, should of course help achieve results and success. But again, it’s the outcome (results and success) that matters.
Wisdom Does Not Equal Success
Touting the age and experience of your management team? Think again. There’s plenty of evidence that when it comes to innovative new approaches, we start to “lose our touch” around the age of 40. The largest-market-cap technology companies of our age were founded by “youngsters” – Bezos, the oldest of them, was 30. Einstein posited all of his most significant theories well before the age of 40 – most of them in the “miracle year” of 1905, when he was 26. The elder of the two Wright brothers was 39.
While there are no doubt examples of successful innovation coming after the age of 40, and while some of the best managers and leaders we know are over 40, wisdom alone is no guarantee of success.
Only Success = Success
Most of the truly successful and fastest-growing companies we know focus on a handful of dependent variables that clearly identify results, progress, and ultimately success. Their daily discourse is carefully formulated around the evaluation of these success criteria. Even these companies’ interactions with outside firms focus, time and time again, on data-driven indications of success – not on independent variables such as intelligence, work ethic, or wisdom (managerial experience).
These companies identify key performance indicators for everything they do, believing that anything worth doing should have a measurable value-creation performance indicator associated with it. They maniacally track the trends of these performance indicators, identifying significant deviations (positive or negative) in order to learn from and repeat the causes of positive trends and avoid those that produce negative ones. Very often these companies employ agile planning and focus processes similar to the OKR process.
The most successful companies rarely engage in discussions around spurious relationships such as intelligence to business success, management experience to business success, or effort to business success. They recognize that while some of these things are likely valuable in the right proportions, they are rarely (read: never) the primary cause of success.
AKF Partners helps companies develop highly available, scalable, cost effective and fast time-to-market products. We know that to be successful in these endeavors, companies must have the right people, in the right roles, with the right behaviors - all supported by the right culture. Contact us to discuss how we can help with your product needs.
April 23, 2018 | Posted By: AKF
The Relative Risk Equation
Technologists are frequently asked: what are the chances that a given software release will work? Do we understand the risk that each component or new feature brings to the entire release?
In this case, measuring risk means assessing the probability that a component will perform poorly or even fail. The higher the probability of failure, the higher the risk. Probabilities are just numbers, so ideally we should be able to calculate the risk (probability of failure) of an entire release by aggregating the individual risk of each component. In practice, this calculation works quite well.
Putting a number on risk is a very useful tool, and this article provides a simple, easy way to calculate risk and produce a numeric result that can be used to compare risk across a spectrum of technology changes.
When we assess a system, one of the key characteristics we want to benchmark is the probability that the system will fail. In particular, if we want to understand whether a system can support an availability goal of 99.95%, we have to do some analysis to see whether the probability of a failure is lower than 0.05%. How do we calculate this?
First let’s introduce some vocabulary.
DEFINITION: Pi is the probability that a given system will experience an incident, i.
For the purposes of this article we are measuring relative, not absolute, values. A system where Pi=1 is very unlikely to experience failure. On the other hand, Pi values approaching 10 indicate a system nearly certain to fail.
DEFINITION: Ii is the impact (or blast radius) a system failure will have.
Ii=1 indicates no impact, while Ii=10 indicates a complete failure of the entire system.
DEFINITION: Pd is the probability that an incident will be detected.
Pd=1 means an incident will go completely undetected, while Pd=10 indicates that a failure will be detected 100% of the time.
Measuring across a scale from 1 to 10 is often too granular; we can reduce the scale to tee-shirt sizes, replacing 1, 2, …, 10 with Small (3), Medium (5), and Large (7). Any series of values will work, so long as we are consistent in our approach.
Relative Risk is now only a question of math:
Ri = (Pi x Ii ) / Pd
With values of 1, 2, ..., 10, the minimum Relative Risk value is 0.1 (effectively zero relative risk) and the maximum value is 100. With tee-shirt sizes, the minimum Relative Risk value is 9/7 (about 1.29) and the maximum value is 49/3 (about 16.33). Basic statistics can help us standardize values onto a 1-to-10 scale:
std(Ri) = (Ri - Min(Ri)) / (Max(Ri) - Min(Ri)) x 10
where Max(Ri) = 16 1/3 and Min(Ri) = 1 2/7 (in the case of tee-shirt sizing)
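A minimal Python sketch of these calculations may help make them concrete (the function names and tee-shirt mapping below are ours, chosen for illustration):

    # Relative risk per the formula above: Pi (probability of incident),
    # Ii (impact), and Pd (probability of detection) are each scored 1-10,
    # or with tee-shirt sizes Small=3, Medium=5, Large=7.

    TEE_SHIRT = {"small": 3, "medium": 5, "large": 7}

    def relative_risk(p_i, i_i, p_d):
        # Ri = (Pi x Ii) / Pd -- higher values mean higher relative risk.
        return (p_i * i_i) / p_d

    def standardize(r_i, lo=3, hi=7):
        # Rescale Ri onto the 1-to-10-style scale described above.
        # With tee-shirt scores, Min(Ri) = (3 x 3) / 7 = 9/7 and
        # Max(Ri) = (7 x 7) / 3 = 49/3.
        r_min = (lo * lo) / hi
        r_max = (hi * hi) / lo
        return (r_i - r_min) / (r_max - r_min) * 10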
Example 1: Adding a new data file to a relational database
- Pi = 3 (low). It’s unlikely that adding a data file will cause a system failure, unless we’re already out of space.
- Ii = 5 (medium). A failure to add a data file indicates a larger storage issue may exist, which would be very impactful for this database instance. However, as there is a backup (for this example), the impact is lowered.
- Pd = 7 (high). It is virtually certain that any failure would be noticed immediately.
- Therefore Ri = (3 x 5) / 7 ≈ 2.14. Standardizing this to our 1-to-10 scale produces a value of about 0.57. That is a very low number, so adding data files is a relatively low-risk procedure.
Example 2: Database backups have been stored on tapes that have been demagnetized during transportation to offsite storage.
- Pi = 5 (medium). While restoring a backup is a relatively safe operation, on a production system it is likely happening during a time of maximum stress.
- Ii = 7 (high). When we attempt to restore from the backup tape, it will fail.
- Pd = 3 (low). The demagnetization of the tapes was a silent, undetected failure.
- Therefore Ri = (5 x 7) / 3 ≈ 11.67. We arrive at a value of roughly 7 (6.9) on the standard scale, which is quite high, so we should consider randomly testing tapes from off-site storage.
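Running both examples through the helper functions sketched above reproduces these results:

    # Example 1: adding a data file (Pi=3, Ii=5, Pd=7)
    r1 = relative_risk(3, 5, 7)
    print(round(standardize(r1), 2))   # ~0.57 -- very low relative risk

    # Example 2: demagnetized backup tapes (Pi=5, Ii=7, Pd=3)
    r2 = relative_risk(5, 7, 3)
    print(round(standardize(r2), 2))   # ~6.9 -- quite high relative risk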
This formula has utility across a vast spectrum of technology:
- Calculate a relative risk value for each feature in a software release, then sum the values of all features in order to compare the risk of one release against another, and consider more detailed testing where relative risk values are higher.
- During security risk analysis, calculate a relative risk value for each threat vector and sort the resulting values. The result is a prioritized list of the steps required to improve security, based on the probabilistic likelihood that a threat vector will cause real damage.
- During feature planning and prioritization exercises, this formula can be altered to calculate feature risk. For example, Pi can represent confidence in the estimate, Ii can be converted to the impact of the feature (e.g., higher revenue = higher impact), and Pd to the perceived risk of the feature. Putting all features through this calculation and then sorting from high values to low yields a list of features ranked by value and risk, as in the sketch below.
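As a hypothetical illustration of that last use (the feature names and scores here are invented, and we reuse the relative_risk helper from earlier):

    # Rank candidate features by relative risk, riskiest first.
    features = {
        "search-rewrite":  (7, 7, 3),   # (Pi, Ii, Pd)
        "checkout-change": (5, 7, 5),
        "new-logo":        (3, 3, 7),
    }

    ranked = sorted(features.items(),
                    key=lambda kv: relative_risk(*kv[1]),
                    reverse=True)
    for name, scores in ranked:
        print(f"{name}: Ri = {relative_risk(*scores):.2f}")
    # search-rewrite: Ri = 16.33, checkout-change: Ri = 7.00, new-logo: Ri = 1.29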
The purpose of this formula and similar methods is not to produce a mathematically absolute estimate of risk. The real value is to remove guesswork and emotion from the process of evaluating risk and to provide a framework for comparing risk across a variety of changes.
Click here to see how AKF Partners can help you manage risk and other technology issues.