AKF Partners

Abbott, Keeven & Fisher Partners | Partners in Technology

Growth Blog

Scalability and Technology Consulting Advice for SaaS and Technology Companies

The Problem with Non-Functional Requirements

May 4, 2018  |  Posted By: Marty Abbott

Many of our clients use the term “Non-Functional Requirements” to group into a basket those portions of their solution that don’t easily fit into method or function-based execution of market needs.  Examples of non-functional requirements often include things like “Availability”, “Scalability”, “Response Time”, “Data Sovereignty” (as codified within requirements such as the GDPR), etc.  Very often, these “NFRs” are relegated to second class citizens within the development lifecycle and, if lucky, end up as a technical debt item to be worked on later.  More often than not, they are just forgotten until major disaster strikes.  This happens so often that we at AKF joke that “NFR” really stands for “No F-ing Resources (available to do the job)”.

While I believe that this relegation to second-class citizen is a violation of fiduciary responsibility, I completely understand how we’ve collectively gotten away with it for so long.  For most of the history of our products, we produced solutions that customers were responsible for running.  We built the code, shipped it, and customers installed it and ran it on their systems.

Fortunately (for most of us) the world has changed to the SaaS model.  As subscribers, we no longer bear the risk of running our own systems.  Implementation is easier and faster, and the costs of running solutions are lower.

Unfortunately (for most of us) these changes also mean that, as producers, we are now wholly accountable for NFRs.  We now mostly produce services, and we own the complete outcome, including how our solution runs.

Most NFRs Are Table Stakes and Must-Haves

In the world of delivering services, most NFR capabilities are must-haves.  SaaS companies provide a utility, and the customer expectation is that the utility will be available whenever the customer needs it.  You simply do not have the option to punt attributes like availability or regulatory compliance to a later date.

The Amount of Each NFR Your Product/Service Needs Varies

While we believe most NFRs are necessary, and non-negotiable for playing in the SaaS space, the amount that you need of each of them varies with the portion of the market you are addressing within the Technology Adoption Lifecycle.

As you progress from left to right into a market, the NFR expectations of the adopters of your solution increase.  Innovators care more about the differentiating capability that you offer than they do the availability of your solution.  That said, they still need to be able to use your product and will stop using it or churn if it doesn’t meet their availability, response time, data sovereignty, and privacy needs.  NFRs are still necessary – they just matter less to innovators than later adopters.

At the other end of the extreme, Late Majority and Laggard adopters care greatly about NFRs.  Whereas Innovators may be willing to grudgingly live with 99.8% availability, the Late Majority will settle for nothing less than 99.95% or better.
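To make the gap between those two numbers concrete, here is a quick sketch that converts an availability percentage into a yearly downtime budget.  The function name and the 365-day year are our own simplifying assumptions for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (an assumption)


def annual_downtime_hours(availability_pct: float) -> float:
    """Return the hours per year a service may be down at a given availability."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


for pct in (99.8, 99.95):
    hours = annual_downtime_hours(pct)
    print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.8% leaves roughly 17.5 hours of downtime a year, while 99.95% leaves under 4.5 hours.  That is about a fourfold tightening of the budget as you move toward later adopters.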

It’s Time to Eliminate the Phrase Non-Functional Requirement

We believe that even the name “Non-Functional Requirement” somehow implies that necessary capabilities like availability and data sovereignty can somehow take a back seat to other activities within the solutions we create.  At the very least, the term fails to denote the necessity (some of them legally so) of these attributes.
We prefer names like “Table Stakes” or “Must Have Requirements” to NFRs.
 
It’s Also Time to Eliminate the Primary Cause

While we sometimes find that teams simply haven’t changed their mindset to properly understand that Table Stakes aren’t optional investments, we more often find a more insidious cause:  Moral Hazards.  Moral hazards exist when one person makes a decision for which another must bear the cost:  Person A decides to smoke, but Person B bears the risk of cancer instead of Person A.

Commonly, we see product managers with ownership over product decisions, but no accountability for Table Stakes like availability, response time, cost effectiveness, security, etc.  The problem with this is that, as we’ve described, Table Stakes form the base of the hierarchy of needs (in the Maslow sense) for online products.  Engineering teams and product teams should jointly own all the attributes of the products they co-create.  Doing so will help fix the flawed notion that Table Stakes can be deferred.

AKF Partners helps clients build highly available, scalable, fast response time solutions that meet the needs of the portion of the Technology Adoption Lifecycle they are addressing.


The Difference between Science, Engineering and Programmers and What it Means to You

May 3, 2018  |  Posted By: Marty Abbott

Scientists, Engineers, and Technicians

What is, or perhaps should be, the difference between a Computer Scientist, Data Scientist, Software Engineer, and Programmer?  What should your expectations be of each?  How should they work together?

To answer these questions, we’ll look at the differences between scientists, engineers, and technicians in more mature disciplines, apply them to our domain, and offer suggestions as to expectations.

Science and Scientists

The primary purpose of science is “to know”.  Knowing, or the creation of knowledge, is enabled through discovery and the practice of the scientific method.  Scientists seek to know “why” something is and “how” that something works.  Once this understanding of “why and how” is generally accepted, ideally it is codified within theories that are continually tested for validity over time.

To be successful, scientists must practice both induction and deduction.  Induction seeks to find relationships for the purposes of forming hypotheses.  Deduction seeks to test those hypotheses for validity.

In the physical world (or in physical, non-biological, and non-behavioral sciences), scientists are most often physicists and chemists.  They create the knowledge, relationships, and theories upon which aerospace, chemical, civil, electrical, mechanical, and nuclear engineers rely.

For our domain, scientists are mathematicians, computer scientists – and more recently – data scientists.  Each seeks to find relationships and approaches useful for engineers and technicians to apply for business purposes.  The scientist “finds”, the engineer “applies”.  Perhaps the most interesting new field here is that of data science.  Whereas most science is focused on broad discovery, data science is useful within a business as it is meant to find relationships between various independent variables and desirable outcomes or dependent variables.  These may be for the purposes of general business insights, or something specific like understanding what items (or SKUs) we should display to different individuals to influence purchase decisions.

True scientists tend to have doctorates, the doctoral degree being the “certification” that one knows how to properly perform research.  There are of course many examples of scientists without doctoral degrees – but these days that is rare in any case other than data scientists (here the stakes are typically lower).

Engineering and Engineers
The primary purpose of engineering is to “create” or to “do”.  Engineers start with an understanding (created by scientists) of “why” things work, and “how” they work (scientific theories – often incorrectly called “laws” by engineers) and apply them for the purposes of creating complex solutions or products.  Mechanical engineers rely on classical physics; electrical engineers rely on modern physics.  Understanding both the “why” and “how” is important to be able to create highly reliable solutions, especially in new or unique situations.  For instance, it is important for electrical engineers to understand field generation for micro-circuitry and how those fields will affect the operation of the device in question.  Civil and mechanical engineers need to understand the notion of harmonic resonance in order to avoid disasters like the Tacoma Narrows bridge failure.

The domain of “software engineering” is much more confusing.  Unlike traditional engineering domains, software engineering is ill-defined and suffers from an overuse of the term in practice.  If such a domain truly existed, it should follow the models of other engineering disciplines.  As such, we should expect that software engineers have a deep understanding of the “whys and hows” derived from the appropriate sciences: computer science and mathematics.  For instance, computer scientists would identify and derive interesting algorithms and suggest applications, whereas engineers would determine how and when to apply the algorithms for maximum effect.  Computer scientists identify unique scenarios (e.g. the dining philosophers problem) that broadly define a class of problems (mutual exclusion in concurrency) and suggest approaches to fix them.  Engineers put those approaches into practice with the constraints of the system they are developing in mind.
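As an illustration of that hand-off, here is a minimal sketch of one well-known approach to the dining philosophers problem: always acquire shared resources in a fixed global order.  The function name, the number of philosophers, and the meal count are our own assumptions; this is one possible remedy for the deadlock, not the only one.

```python
import threading


def run_dinner(num_philosophers: int = 5, meals_each: int = 100) -> list:
    """Seat the philosophers, let each eat `meals_each` times, return meal counts."""
    if num_philosophers < 2:
        raise ValueError("need at least two philosophers to share forks")

    forks = [threading.Lock() for _ in range(num_philosophers)]
    meals = [0] * num_philosophers

    def dine(seat: int) -> None:
        left, right = seat, (seat + 1) % num_philosophers
        # Always pick up the lower-numbered fork first.  A single global
        # lock order rules out the circular wait that causes deadlock.
        first, second = min(left, right), max(left, right)
        for _ in range(meals_each):
            with forks[first]:
                with forks[second]:
                    meals[seat] += 1  # "eat" while holding both forks

    threads = [threading.Thread(target=dine, args=(seat,))
               for seat in range(num_philosophers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return meals


print(run_dinner())
```

The scientist's contribution is the insight that a circular wait is what causes the deadlock.  The engineer's contribution is choosing a remedy (lock ordering here) that fits the constraints of the system at hand.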

Following this logic, we should also expect our engineers to understand how the systems upon which they apply their trade (computers) truly function – the relationship between processors, memory, storage, etc.  Further, they should understand how an operating system works, how interpreters and compilers work, and how networks work.  They should be able to apply all these things to come up with simple designs that limit the blast radius of failure, ensure low latency response, and are cost effective to produce and maintain.  They should understand the depth of components necessary to deliver a service to a customer – not just the solitary component upon which they work (the code itself). 

Very often, to be a successful engineer, one must have at least a bachelor’s degree in an engineering domain.  The degree at least indicates that one has proven to some certifying body that they understand the math, the theories, and the application of those theories to the domain in question.  Interestingly, as a further example of the difference between engineering and science, some countries issue engineering degrees under the degree of “Bachelors of Applied Science”.

There are of course many famous examples of engineers without degrees – for instance, the Wright Brothers.  The true test isn’t the degree itself, but whether someone understands the depth and breadth of math and science necessary to apply science in highly complex situations.  Degrees are simply one way of ensuring an individual has at least once proven an understanding of the domain.

Practical Applications and Technicians

Not everything we produce requires an engineer’s deep understanding of both why and how something works.  Sometimes, the application of a high-level understanding of “how” is sufficient.  As such, in many domains technicians augment engineers.  These technicians are responsible for creating solutions out of reusable building blocks and a basic understanding of how things work.  Electricians for instance are technicians within the domain of electrical engineering.  Plumbers are a very specific application of civil engineering.  HVAC technicians apply fluid mechanics from mechanical engineering.

These trades take a very technical skill, implementation specific tradecraft, and a set of heuristics to design, implement, and troubleshoot systems that are created time and time again.  Electricians, for instance, design the power infrastructure of homes and offices – potentially reviewed by an electrical engineer.  They then implement the design (wire the building) and are also responsible for troubleshooting any flaws.  The same is true for HVAC technicians and plumbers.

Programmers are the technicians for software engineers (engineering domain) and computer and data scientists (science domain).  Not everything we develop needs a “true engineer”.  Very often, as is the case with wiring a house, the solution is straightforward and can be accomplished with the toolset one gains with several weeks of training.  The key difference between an engineer and a programmer is again the depth and breadth of knowledge.

Technicians are either trained through apprenticeship or through trade schools.  Electricians can come from either approach.  Broadly speaking, it makes sense to use technicians over using an engineer for at least 2 reasons:

  1. Some things don’t require an engineer, and the cost of an engineer makes no sense for these applications.
  2. The supply of engineers is very low relative to demand within the US across all domains.  During the Great Recession, engineering was one of the only disciplines at or near economic full employment – a clear indication of the supply/demand imbalance.

Implications and Takeaways

Several best practices follow the preceding definitions and roles:

  1. Don’t mix Science and Engineering teams or goals:  Science, and the approach to answering questions using the scientific method, is a different animal and requires different skills and different expectations from engineering.  It is difficult to apply time constraints to the process of scientific discovery; sometimes answers exist and are easy to find, sometimes they are hard to find, and sometimes they simply do not exist.  If you want to have effective analytics efforts, choose people trained to approach your Big Data needs appropriately, and put them in an organization dedicated to “science” activities (discovery).  They may need to be paired with programmers or engineers to accomplish their work – but they should not be confused with a typical product team.  “Ah-Ha!” moments are something you must allocate time against – but not something for which you can expect an answer in a defined time interval.
  2. Find the right ratio of engineers to programmers:  Most companies don’t need all their technical folks to be engineers.  Frankly, we don’t “mint” enough engineers anyway – roughly 80K to 100K per year since 1945 with roughly 18K of those being Computer Science graduates who go on to become “engineers” in practice.  Augment your teams with folks who have attended trade schools or specialized boot camps to learn how to program.
  3. Ensure you hire capable engineers:  You should both pay more for and expect more from the folks on your team who are doing engineering related tasks.  Do not allow them to approach solutions with just a “software” focus; expect them to understand how everything works together to ensure that you build the most reliable services possible.

 


Fault Isolation in Services Architectures

May 2, 2018  |  Posted By: AKF

Our post on the AKF Scale Cube made reference to a concept that we call “Fault Isolation” and sometimes “Swim lanes” or “Swim-laned Architectures”.  We sometimes also call “swim lanes” fault isolation zones or fault isolated architecture.


Fault Isolation Defined
A “Swim lane” or fault isolation zone is a failure domain.  A failure domain is a group of services within a boundary such that any failure within that boundary is contained within the boundary and the failure does not propagate or affect services outside of said boundary.  Think of this as the “blast radius” of failure meant to answer the question of “What gets impacted should any service fail?” The benefit of fault isolation is twofold:

1) Fault Detection: Given a granular enough approach, the component of availability associated with the time to identify the failure is significantly reduced.  This is because all effort to find the root cause or failed component is isolated to the section of the product or platform associated with the failure domain.  Once something breaks, because the failure is limited in scope, it can be more rapidly identified and fixed.  Recovery time objectives (RTO) are subsequently decreased which increases overall availability.

2) Fault Isolation: As stated previously, the failure does not propagate or cause a deterioration of other services within the platform.  The “blast radius” of a failure is contained.  As such, and depending upon approach, only a portion of users or a portion of functionality of the product is affected.  This is akin to circuit breakers in your house - the breaker exists to limit the fault zone for any load that exceeds a limit imposed by the breaker.  Failure propagation is contained by the breaker popping and other devices are not affected. 
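To see why faster fault detection matters, consider a rough sketch that treats availability as uptime divided by uptime plus downtime, with the downtime of each incident split into time to identify and time to fix.  The function name and every number below are hypothetical and chosen only for illustration.

```python
def availability(hours_between_failures: float,
                 hours_to_identify: float,
                 hours_to_fix: float) -> float:
    """Fraction of time the service is up, given one failure per interval."""
    downtime = hours_to_identify + hours_to_fix
    return hours_between_failures / (hours_between_failures + downtime)


# Hypothetical scenario: one failure every 30 days (720 hours) and one
# hour to fix the problem once it has been found.
shared_platform = availability(720, hours_to_identify=3.0, hours_to_fix=1.0)
swim_laned = availability(720, hours_to_identify=0.5, hours_to_fix=1.0)

print(f"Shared platform: {shared_platform:.4%}")
print(f"Swim-laned:      {swim_laned:.4%}")
```

Nothing about the fix itself changed between the two cases.  Narrowing the search to a single failure domain shortened the time to identify the fault, and that alone raised availability.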

Architecting Fault Isolation
A fault isolated architecture is one in which each failure domain is completely isolated.  We use the term “swim lanes” to depict the separations. In order to achieve this, ideally there are no synchronous calls between swim lanes or failure domains made pursuant to a user request.  User initiated synchronous calls between failure domains are absolutely forbidden in this type of architecture as any user-initiated synchronous call between fault isolation zones, even with appropriate timeout and detection mechanisms, is very likely to cause a cascading series of failures across other domains.  Strictly speaking, you do not have a failure domain if that domain is connected via a synchronous call to any other service in another domain, to any service outside of the domain, or if the domain receives synchronous calls from other domains or services.  Again, “synchronous” is meant to identify a synchronous call (call, wait for a response) pursuant to any user request.

It is acceptable, but not advisable, to have asynchronous calls between domains and to have non-user initiated synchronous calls between domains (as in the case of a batch job collecting data for the purposes of reporting in another failure domain).  If such a communication is necessary, it is very important to include failure detection and timeouts even with the asynchronous calls to ensure that retries do not cause port overloads on any services.  Here is an interesting blog post about runaway scripts and their impact on Apache, PHP, and MySQL.
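As a minimal sketch of that advice, the helper below wraps a call into another failure domain with a per-attempt timeout, a hard cap on attempts, and backoff between retries.  It uses Python's asyncio purely for illustration; the helper name, its parameters, and the flaky reporting call are all hypothetical.

```python
import asyncio
import random


async def call_with_timeout_and_retries(operation, *, timeout_s=0.5,
                                        max_attempts=3, base_delay_s=0.1):
    """Await operation() with a per-attempt timeout and a capped number of retries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(operation(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise  # give up so the caller can degrade gracefully
            # Exponential backoff with jitter spreads retries out so they
            # do not pile onto a service that is already struggling.
            delay = base_delay_s * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay))


async def demo():
    calls = {"count": 0}

    async def flaky_report_fetch():  # hypothetical call into another swim lane
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("reporting swim lane unavailable")
        return "report data"

    result = await call_with_timeout_and_retries(flaky_report_fetch,
                                                 base_delay_s=0.01)
    return result, calls["count"]


print(asyncio.run(demo()))
```

The cap on attempts is the important part: without it, a failing downstream domain turns every caller into a source of ever-growing load.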

As previously indicated, a swim lane should have all of its services located within the failure domain.  For instance, if database reads and writes are necessary, the database with all appropriate information for that swim lane should exist within the same failure domain as all of the application and web servers necessary to perform the function or functions of the swim lane.  Furthermore, that database should not be used for service requests from other swim lanes.  Our rule is one production database on one host.

The figure below demonstrates the components of software and infrastructure that are typically fault isolated:
[Figure: Fault Isolation in Micro-Services Architectures]

Rarely are shared higher level network components isolated (e.g. border systems and core routers).
Sometimes, if practical, firewalls and load balancers are isolated.  This is especially the case in very high demand situations where a single pair of devices simply wouldn’t meet the demand.

The remaining components are always isolated, with web servers, top of rack switches (in non-IaaS implementations), compute (app servers) and storage all being properly isolated.

Applying Fault Isolation with AKF’s Scale Cube
As we have indicated with our Scale Cube in the past, there are many ways in which to think about swim laned architectures.  Swim lanes can be isolated along the axes of the Scale Cube as shown below with AKF’s circuit breaker analogy to fault isolation. 

AKF Fault Isolation in the X-axis
Fault isolation in X-axis would mean replicating everything for high availability - and performing the replication asynchronously and in an eventually consistent (rather than a consistent) fashion.  For example, when a data center fails the fault will be isolated to the one failed data center or multiple availability zones. This is common with traditional disaster recovery approaches, though we do not often advise it as there are better and more cost effective solutions for recovering from disaster.

AKF Fault Isolation in the Y-axis
Fault Isolation in the Y-axis can be thought of in terms of a separation of services, e.g. “login” and “shopping cart” (two separate swim lanes), each having the web and app servers as well as all data stores located within the swim lane and answering only to systems within that swim lane.  Each portion of a page is delivered from a separate service, reducing the blast radius of a potential fault to its swim lane.

While purposely not legible (fuzzy) the fake example above shows different components of a fictional business account from a fictional bank.  Components of the page are separated with one component showing a summary, another component displaying more detailed information and still other components showing dynamic or static links - each derived from properly isolated services.
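The idea of a page assembled from isolated services can be sketched as follows.  Think of `render_page` as the browser (or an edge layer) assembling the page, not as a service inside a swim lane.  All function names and the sample bank data are hypothetical.

```python
def render_page(components: dict) -> dict:
    """Fetch each page component from its own service, containing failures per component."""
    page = {}
    for name, fetch in components.items():
        try:
            page[name] = fetch()
        except Exception:
            # A fault in one swim lane degrades only its portion of the page.
            page[name] = "temporarily unavailable"
    return page


def account_summary():  # served by a hypothetical "summary" swim lane
    return "Balance: $1,000"


def transaction_details():  # hypothetical "details" swim lane, currently failing
    raise ConnectionError("details service is down")


def quick_links():  # served by a hypothetical "links" swim lane
    return "Transfers | Statements"


page = render_page({
    "summary": account_summary,
    "details": transaction_details,
    "links": quick_links,
})
for name, content in page.items():
    print(f"{name}: {content}")
```

The customer still sees the summary and the links even though the details service is down, which is the blast radius limit that the Y-axis split is meant to provide.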

AKF Fault Isolation in the Z-axis
Another approach would be to perform a separation of your customer base or a separation of your order numbers or product catalog.  Assuming an indiscriminate function to perform this separation (like a modulus of id), such a split would be a Z axis swim lane along customer, order number or product id lines.  More beneficially, if we are interested in fastest possible response times to customers, we may split along geographic boundaries.  We may have data centers (or IaaS regions) serving the West and East Coasts of the US respectively, the “Fly-Over States” of the US, and regions serving the EU, Canada, Asia, etc.  Besides contributing to faster perceived customer response times, these implementations can also help ensure we are compliant with data sovereignty laws unique to different countries or even states within the US.
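Both flavors of Z-axis split can be sketched in a few lines.  The lane count, the country-to-lane mapping, and the function names below are hypothetical; a real mapping would follow your own customer base and the regulations that apply to you.

```python
NUM_LANES = 4  # hypothetical number of customer swim lanes


def lane_for_customer(customer_id: int, num_lanes: int = NUM_LANES) -> int:
    """Indiscriminate split: a modulus of the id picks the swim lane."""
    return customer_id % num_lanes


# Hypothetical geographic split: each country is pinned to one regional lane.
GEO_LANES = {"US": "us", "CA": "canada", "DE": "eu", "FR": "eu", "JP": "asia"}


def lane_for_country(country_code: str) -> str:
    """Geographic split: keep a customer's requests and data in their region."""
    try:
        return GEO_LANES[country_code]
    except KeyError:
        # Fail loudly instead of defaulting: sending data to the wrong
        # region could violate data sovereignty rules.
        raise ValueError(f"no swim lane configured for {country_code!r}") from None


for customer_id in (10, 11, 12):
    print(f"customer {customer_id} -> lane {lane_for_customer(customer_id)}")
print(f"DE -> {lane_for_country('DE')}")
```

The modulus version spreads load evenly but says nothing about where data lives.  The geographic version gives up some evenness in exchange for locality and compliance.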


Combining the concepts of service and database separation into several fault-isolated failure domains creates a platform that is both scalable and highly available.  AKF has helped clients achieve high availability through fault isolation.  Contact us to see how we can help you achieve the same fault tolerance.

AKF Partners helps companies create highly available, fault isolated solutions.  Send us a note - we’d love to help you!


Why CTOs Fail and What CEOs and CTOs Can Do About It

May 2, 2018  |  Posted By: Marty Abbott

The rate of involuntary turnover for the senior executive running technology and engineering in a company is unfortunately high, with most senior technology executives lasting less than four years in a job.  Our experience over the last 12 years, and with nearly 500 companies, indicates that nearly 50% of senior technology executives are at-risk with their peers around the CEO’s management team table.

The reasons why CEOs and their teams are concerned about their senior technology executive vary, but the trigger causes of concern cluster around 5 primary themes:

  1. Lack of Business Acumen – Doing the Wrong Things
  2. Failure to Lead and Inspire
  3. Failure to Manage and Execute
  4. Trapped in Today – Failure to Plan for Tomorrow
  5. Lack of Technical Knowledge - Doing Things the Wrong Way

Before digging into each of these, I believe it is informative to first investigate the sources (backgrounds) of most senior technology executives.

CTO and CIO Backgrounds

Within our dataset, there are 2 primary backgrounds from which most senior technology executives come:

    1.    Raised through the Technical Ranks

These executives have spent a majority of their working career in some element of product engineering or IT.  They are very often promoted based on technical acumen, and a perception of being able to “get things done”.  Often, they are technically gifted folks with technical or engineering undergraduate degrees, and sometimes they have a great deal of project management experience.

    2.    Co-Opted from the Business

These executives have spent a majority of their working career in a function other than technology.  They may have been raised through marketing, sales or finance and have demonstrated an affinity for technology and some high-level understanding of its application.  They rarely have a deep technical understanding and have done very little hands-on work within technology or engineering.

We see both backgrounds struggling with the chief technology role. Sometimes they fail for similar reasons, but often for very different reasons within our five primary causes.
 
Lack of Business Acumen – Doing the Wrong Things

This is by far the most common reason for failure for chief technologists “Raised through the technical ranks”.  On the flip side, it is rarely in our experience the reason why executives “Co-Opted from the Business” find themselves in trouble. 

The technologist’s peers often complain that they do not trust the executive in question and often cite a failure to present opportunities, fixes, and projects in “business terms”.  Put frankly, a lack of understanding of business in general, and an inability (or lack of desire) to justify actions in meaningful business terms, causes a lack of trust with the CIO/CTO’s peers.

Further, CTOs sometimes overly focus on what is “cool” rather than the things that create significant business and stakeholder value.  Business peers complain that money and headcount are being spent without a justifiable return.  An oft-heard quote is “We don’t get why there are so many people working on things that don’t drive revenue.”

The fix for this is easy.  Executives raised through the technology ranks should seek education and training in the fundamentals and language of business.  A great way to do this is for the tech exec to get an MBA or attend an abbreviated MBA-like program such as those offered by Harvard Business School or Stanford.  Many business courses focused on cost justification and the business financial statements are also available from online programs.

Failure to Lead and Inspire

This failure is most common for executives “raised through the technical ranks”.  Comments from peers and CEOs of the CTO indicate that he struggles with creating a strategy that clearly supports the needs of the business.  Further, individual contributors within the organization appear to be disconnected from the mission of the business and disengaged at work.  Often, employees within such an organization will complain that the CIO/CTO requires that all major decisions go through her.  Such employees and organizations often test low on work engagement and overall morale.

The cause of this failure is again often the result of promoting a person solely upon her technical capabilities.  While there are many great technology leaders, leadership and technical acumen have very little in common.  Some folks can be trained to appreciate the value and need for a compelling vision and to be inspirational, but alas some cannot.

The fix is to attempt training, but where the executive does not show the willingness or capability, a replacement may be necessary.  Sometimes we find that mentoring from a former or current successful CIO/CTO helps.  Coaching from a professional coach or leadership professional may also be helpful. 

Failure to Manage and Execute

This failure is common for both those raised through the technical ranks, and those co-opted from the business.  At the very heart of this failure is a perceived lack of execution on the part of the executive – specifically in getting things done.  Often the cause is a lack of communication.  Sometimes the cause is poor project management capabilities within the organization or a lack of management acumen within the organization. 

For executives co-opted from the business, we find that they sometimes struggle in communicating effectively with the technical members of their team and as such don’t have a clear understanding of status.  They may also struggle with the right questions to ask to either probe for current status or to help keep their teams on track.

Where the problem is the lack of technical understanding by non-technical executives running tech functions, the fix is to augment them with strong technical people who also speak the language of business (similar to the lack of business acumen failure).  For the other failures, the fix is to build appropriate project management and oversight into product delivery.  This may be adding skills to the team, or it may mean needing to infuse an element of valuing execution into the technology organization.

Trapped in Today – Failure to Plan for Tomorrow

This type of failure is common to executives of both backgrounds.  In fact, the problem is common to leaders in all positions requiring a mix of both operational focus and forward-looking strategy development.  The needs of running the day-to-day business often bias the executive toward what is necessary to ensure that you make a day, month or quarter at the expense of what needs to happen for the future.  As such, fast-moving secular trends (past examples of which are IaaS, Agile, NoSQL, mobile adoption, etc.) get ignored today and become crises tomorrow.

There are many potential fixes for this, including ensuring that executives budget their time to create “space” for future planning.  Organizationally, we may add someone in larger companies to focus on either operational needs or the needs of tomorrow.  Additionally, we can bring in outside help and perspectives either in the form of new hires or consultants who have experience with emerging trends and technologies.

Lack of Technical Knowledge

This failure is owned almost entirely by folks with a non-technology background who find themselves running technical organizations.  It manifests itself in multiple ways including not understanding what technologists are saying, not understanding the right questions to ask, and not fully understanding the ramification of various technical decisions.  While a horrible position to be in, it is no more or less disastrous than lacking business acumen.  In either scenario, we have an imperfect matching of an engine and the transmission necessary to gain benefit from that engine.

Unfortunately, this one is the hardest of all problems to fix.  Whereas it is comparatively easy to learn to speak the language of business, the breadth and depth of understanding necessary to properly run a technology organization is not easily acquired.  One need not be the best engineer to be successful, but one should certainly understand “how and why things work” to be able to properly evaluate risk and ask the right questions.

The fix is the mirror image of the fix for business acumen.  No technology executive should be without deep technical understanding AND a clear understanding of business fundamentals.

Key Learnings

Perhaps the most important point to learn from our experience in this area is to ensure that, regardless of your CTO/CIO’s background, you ensure he or she has both the technical and business skills necessary to do the job.  Identify weaknesses through interviews and daily interactions and help the executive focus on shoring up these weaknesses through skill augmentation within their organization or through education, mentoring and coaching.

AKF Partners provides mentoring and coaching for both CTOs and CEOs to help ensure a successful partnership between these two key creators of company value.  If your CEO or CTO has departed, we also provide interim leadership services.

 


Technical Due Diligence and Technical Debt

April 30, 2018  |  Posted By: Daniel Hanley

Over the past 11 years AKF has been on a number of technical due diligences where we have seen technology organizations put off a large portion of engineering effort, creating technical debt.  A technical due diligence, as introduced in Optimize Your Investment with Technical Due Diligence, should identify the amount of technical debt and quantify the amount of engineering resources dedicated to servicing the debt.


What is Technical Debt?
Technical debt is the difference between doing something the desired or best way and doing something quickly.  Technical debt is a conscious choice, made knowingly and with commission, to take a shortcut in the technology arena - the delta between the desired or intended way and the quicker way.  The shortcut is usually taken for time to market reasons and is a sound business decision within reason. 
AKF Technical Debt Definition

Is Tech Debt Bad?
Technical debt is analogous in many ways to financial debt - a complete lack of it probably means missed business opportunities while an excess means disaster around the corner.  Just like financial debt, technical debt is not necessarily bad.  Accruing some debt allows the technology organization to release an MVP (minimum viable product) to customers, just as some financial debt allows a company to start new investments earlier than capital growth would allow.  Too little debt can result in a product late to market in a competitive environment, and too much debt can choke business innovation and cause availability and scalability issues.  Tech debt becomes bad when the engineering organization can no longer service that debt.  A technical due diligence should determine whether a team is managing its technical debt properly. 

Tech Debt Maintenance
Similar to financial debt, technical debt must be serviced, and it is serviced by the efforts of the engineering team - the same team developing the software.  A failure to service technical debt will result in high interest payments, seen as slowing time to market for new product initiatives post investment.  Our experience indicates that most companies should expect to spend 12% to 25% of engineering effort on a mix of servicing technical debt and handling defect correction (two different topics).  Whether that resource allocation keeps the debt static, reduces it, or allows it to grow depends upon the amount of technical debt, which in turn influences the level of spend.  It is easy to see how a company delinquent in servicing its technical debt will have to increase the resource allocation to deal with it, reducing resources for product innovation and market responsiveness.  Just as a financial diligence verifies that the CFO has command of the balance sheet, AKF looks to verify that the CTO is on top of the technology balance sheet during technology due diligence. 

AKF Technical Debt Balance Sheet

Tech Debt Takeaways:

  • Delaying attention to address technical issues allows greater resources to be focused on higher priority endeavors
  • The absence of technical debt probably means missed business opportunities – use technical debt as a tool to best meet the needs of the business
  • Excessive technical debt will cause availability and scalability issues, and can choke business innovation (too much engineering time dealing with debt rather than focusing on the product)
  • The interest on tech debt is the difficulty or increased level of effort in modifying something in subsequent releases
  • The principal of technical debt is the difference between “desired” and “actual”
  • Technology resources to continually service technical debt should be clearly planned in product road maps – 12 to 25% is suggested


AKF’s technical due diligence will discover a team’s ability to quantify the amount of debt accrued and the engineering effort to service the debt.  Leadership should be allocating approximately 20% of resources towards this debt and planning for the debt in the product road map. 

Click here to see how AKF Partners can help you assess your company’s technical debt, or assess the debt and issues therein for any investment you plan to make.

 


The Top Five Most Common Agile PDLC Failures

April 27, 2018  |  Posted By: Dave Swenson
Top Five Agile Failures

Agile Software Development is a widely adopted methodology, and for good reason. When implemented properly, Agile can bring tremendous efficiencies, enabling your teams to move at their own pace, bringing your engineers closer to your customers, and delivering customer value quicker with less risk. Yet, many companies fall short of realizing the full potential of Agile, treating it merely as a project management paradigm by picking and choosing a few Agile structural elements such as standups or retrospectives without actually changing the manner in which product delivery occurs. Managers in an Agile culture often forget that they are indeed still managers who need to measure and drive improvements across teams.

All too often, Agile is treated solely as an SDLC (Software Development Lifecycle), focused only upon the manner in which software is developed versus a PDLC (Product Development Lifecycle) that leads to incremental product discovery and spans the entire company, not just the Engineering department.


Here are the five most common Agile failures that we see with our clients:

     
  1. Technology Executives Abdicate Responsibility for their Team’s Effectiveness

Management in an Agile organization is certainly different from, say, a Waterfall-driven one. More autonomy is provided to Agile teams. Leadership within each team typically comes without a ‘Manager’ title. Often, this shift from a top-down, autocratic, “Do it this way” approach to a grass-roots, bottom-up one sways well beyond the desired autonomy towards anarchy, where teams have been given full freedom to pick their technologies, architecture, and even outcomes with no guardrails or constraints in place. See our Autonomy and Anarchy article for more on this. 


Executives often become focused solely on the removal of barriers the team calls out, rather than leading teams towards desired outcomes. They forget that their primary role in the company isn’t to keep their teams happy and content, but instead to ensure their teams are effectively achieving desired business-related outcomes.


The Agile technology executive is still responsible for their teams’ effectiveness in reaching specified outcomes (e.g.: achieve 2% lift in metric Y). She can allow a team to determine how they feel best to reach the outcome, within shared standards (e.g.: unit tests must be created, code reviews are required). She can encourage teams to experiment with new technologies on a limited basis, then apply those learnings or best practices across all teams. She must be able to compare the productivity and efficiencies from one team to another, ensuring all teams are reaching their full potential.

     
  2. No Metrics Are Used

The age-old saying “If you can’t measure it, you can’t improve it” still applies in an Agile organization. Yet, frequently Agile teams drop this basic tenet, perhaps believing that teams are self-aware and critical enough to know where improvements are required. Unfortunately, even the most transparent and aware individuals are biased, fall back on subjective characteristics (“The team is really working hard”), and need the grounding that quantifiable metrics provide. We are continually surprised at how many companies aren’t even measuring velocity, not necessarily to compare one team with another, but to compare a team’s sprint output vs. their prior ones. Other metrics still applicable in an Agile world include quality, estimation accuracy, predictability, percent of time spent coding, and the ratio of enhancements vs. maintenance vs. tech debt paydown.

These metrics, their definitions and the means of measuring them should be standardized across the organization, with regular focus on results vs. desired goals. They should be designed to reveal structural hazards that are impeding team performance as well as best practices that should be adopted by all teams.

     
  3. Your Velocity is a Lie

Is your definition of velocity an honest one? Does it truly measure outcomes, or only effort? Are you consistent with your definition of ‘done’? Take a good look at how your teams are defining and measuring velocity. Is velocity only counted for true ‘ready to release’ tasks? If QA hasn’t been completed within a sprint, are the associated velocity points still counted or deferred?

Velocity should not be a measurement of how hard your teams are working, but instead an indicator of whether outcomes (again, e.g.: achieve 2% lift in metric Y) are likely to be realized - take credit for completion only when in the hands of customers.

     
  4. Failure to Leverage Agile for Product Discovery

From the Agile manifesto: “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software”. Many companies work hard to get an Agile structure and its artifacts in place, but ignore the biggest benefit Agile can bring: iterative and continuous product discovery. Don’t break down a six-month waterfall project plan into two-week sprints with standups and velocity measurements and declare Agile victory.


Work to create and deliver MVPs to your customers that allow you to test expected value and customer satisfaction without huge investment.

     
  5. Treating Agile as an SDLC vs. a PDLC

As explained in our article PDLC or SDLC, SDLC (Software Development Lifecycle) lives within PDLC (Product Development Lifecycle). Again, Agile should not be treated as a project management methodology, nor merely as a means of developing software. It should focus on your product and, ideally, on the customer success that product enables. This means that Agile should permeate well beyond your developers, and include product and business personnel.


Business owners or their delegates (product owners) must be involved at every step of the PDLC process. POs need to be embedded within each Agile team, ideally colocated alongside team members. In order to provide product focus, POs should first bring to the team the targeted customer problem to be solved, rather than dictating only a solution, then work together with the team to implement the most effective solution to that problem.



AKF Partners helps companies transition to Agile as well as fine-tune their existing Agile processes. We can readily assess your PDLC, organization structure, metrics and personnel to provide a roadmap for you to reach the full value and benefits Agile can provide. Contact us to discuss how we can help.



The Scale Cube

April 25, 2018  |  Posted By: Robin McGlothin

The Scale Cube - Architecting for Scale

The Scale Cube is a model for building resilient and scalable architectures using patterns and practices that apply broadly to any industry and all solutions. AKF Partners invented the Scale Cube in 2007, publishing it on our blog that year (original article here) and subsequently in our first book, The Art of Scalability, and our second book, Scalability Rules.


The Scale Cube (sometimes known as the “AKF Scale Cube” or “AKF Cube”) comprises three axes: X-axis, Y-axis, and Z-axis.

    • Horizontal Duplication and Cloning (X-Axis)
    • Functional Decomposition and Segmentation - Microservices (Y-Axis)
    • Horizontal Data Partitioning - Shards (Z-Axis)

These axes and their meanings are depicted below in Figure 1.

AKF Scale Cube - X, Y and Z Axes Explained

                                    Figure 1

The Scale Cube helps teams keep critical dimensions of system scale in mind when solutions are designed and when existing systems are being improved. 

Most internet enabled products start their life as a single application running on an appserver or appserver/webserver combination and likely communicate with a database. This monolithic design will work fine for relatively small applications that receive low levels of client traffic. However, this monolithic architecture becomes a kiss of death for complex applications.

A large monolithic application can be difficult for developers to understand and maintain. It is also an obstacle to frequent deployments. To deploy changes to one application component you need to build and deploy the entire monolith, which can be complex, risky, time consuming, require the coordination of many developers and result in long test cycles.

Consequently, you are often stuck with the technology choices that you made at the start of the project. In other words, the monolithic architecture doesn’t scale to support large, long-lived applications.

Figure 2, below, displays how the cube may be deployed in a modern architecture decomposing services (sometimes called micro-services architecture), cloning services and data sources and segmenting similar objects like customers into “pods”.

AKF Scale Cube - Examples of X, Y and Z axis splits

                                    Figure 2



Scaling Solutions with the X Axis of the Scale Cube

The most commonly used approach to scaling a solution is running multiple identical copies of the application behind a load balancer, also known as X-axis scaling. That’s a great way of improving the capacity and the availability of an application.

When using X-axis scaling, each server runs an identical copy of the service (if disaggregated) or monolith. One benefit of the X axis is that it is typically intellectually easy to implement and it scales well from a transaction perspective.  Impediments to implementing the X axis include heavy session-related information, which is often difficult to distribute or requires persistence to servers – both of which can cause availability and scalability problems.  A comparative drawback of the X axis is that, while it is intellectually easy to implement, data sets have to be replicated in their entirety, which increases operational costs.  Further, caching tends to degrade at many levels as the size of data increases with transaction volumes.  Finally, the X axis doesn’t engender higher levels of organizational scale.

Figure 3 explains the pros and cons of X axis scalability, and walks through a traditional 3 tier architecture to explain how it is implemented.

AKF Scale Cube - X Axis Splits Pros and Cons

                                    Figure 3
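As a rough illustration of X-axis scaling, the sketch below cycles requests across identical clones in round-robin order. The class name `RoundRobinBalancer` and the clone names are hypothetical, and a real load balancer would also deal with health checks and session persistence, which are left out here.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical load balancer that spreads requests across identical clones."""

    def __init__(self, clones):
        if not clones:
            raise ValueError("at least one clone is required")
        self._clones = list(clones)
        self._next_clone = cycle(self._clones)

    def route(self, request):
        # Any clone can serve any request because every clone runs the same
        # code against the same (fully replicated) data.
        return next(self._next_clone), request


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.route(f"req-{n}")[0] for n in range(6)]
print(assignments)
```

Adding capacity in this sketch means adding another clone name to the list, which is why the X axis is usually the easiest split to reason about.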



Scaling Solutions with the Y Axis of the Scale Cube

Y-axis scaling (think services oriented architecture, microservices or functional decomposition of a monolith) focuses on separating services and data along noun or verb boundaries.  These splits are “dissimilar” from each other.  Examples in commerce solutions may be splitting search from browse, checkout from add-to-cart, login from account status, etc.  In practice, Y-axis scaling splits a monolithic application into a set of services. Each service implements a set of related functionalities such as order management, customer management, inventory, etc.  Further, each service should have its own, non-shared data to ensure high availability and fault isolation.  Y axis scaling shares the benefit of increasing transaction scalability with all the axes of the cube.

Further, because the Y axis allows segmentation of teams and ownership of code and data, organizational scalability is increased.  Cache hit ratios should increase as data and the services are appropriately segmented and similarly sized memory spaces can be allocated to smaller data sets accessed by comparatively fewer transactions.  Operational cost often is reduced as systems can be sized down to commodity servers or smaller IaaS instances can be employed.

Figure 4 explains the pros and cons of Y axis scalability and shows a fault-isolated example in which each service has its own data store.

AKF Scale Cube - Y Axis Services Splits Pros and Cons

                                    Figure 4
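A minimal sketch of a Y-axis split follows, assuming three commerce functions (search, checkout, login) taken from the examples above. The `Service` class, the `route` helper and the in-memory dictionaries standing in for data stores are all hypothetical; the point is only that each service owns its data and shares it with no other service.

```python
class Service:
    """Hypothetical service that owns a private data store."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # stands in for a database used by this service only

    def handle(self, key, value):
        self.store[key] = value
        return f"{self.name} stored {key}"


# One service per noun or verb boundary.
SERVICES = {
    "search": Service("search"),
    "checkout": Service("checkout"),
    "login": Service("login"),
}


def route(function, key, value):
    # Dissimilar work is sent to dissimilar services.
    return SERVICES[function].handle(key, value)


route("checkout", "order-17", {"total": 42})
route("search", "query-1", "red shoes")
print(sorted(SERVICES["checkout"].store), sorted(SERVICES["search"].store))
```

Because no store is shared, a failure in the hypothetical search store leaves checkout data untouched, which is the fault isolation the paragraph above describes.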



Scaling Solutions with the Z Axis of the Scale Cube

Whereas the Y axis addresses the splitting of dissimilar things (often along noun or verb boundaries), the Z-axis addresses segmentation of “similar” things.  Examples may include splitting customers along an unbiased modulus of customer_id, or along a somewhat biased (but beneficial for response time) geographic boundary.  Product catalogs may be split by SKU, and content may be split by content_id.  Z-axis scaling, like all of the axes, improves the solution’s transactional scalability and, if fault isolated, its availability. Because the software deployed to servers is essentially the same in each Z axis shard (but the data is distinct) there is no increase in organizational scalability.  Cache hit rates often go up with smaller data sets, and operational costs generally go down as commodity servers or smaller IaaS instances can be used.

Figure 5 explains the pros and cons of Z axis scalability and displays a fault-isolated pod structure with 2 unique customer pods in the US, and 2 within the EU.  Note that an additional benefit of Z axis scale is the ability to segment pods to be consistent with local privacy laws such as the EU’s GDPR.

AKF Scale Cube - Z Axis Splits Pros and Cons

                                    Figure 5
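The two Z-axis splits described above can be sketched as follows. The shard count, the pod names and both function names are assumptions made for illustration; only the modulus of customer_id and the two-pods-per-region layout come from the text.

```python
SHARD_COUNT = 4  # assumed number of shards

# Assumed pod names, two per region as in the pod structure described above.
PODS_BY_REGION = {
    "US": ["us-pod-1", "us-pod-2"],
    "EU": ["eu-pod-1", "eu-pod-2"],
}


def shard_for_customer(customer_id, shard_count=SHARD_COUNT):
    """Unbiased split: the modulus of customer_id picks the shard."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return customer_id % shard_count


def pod_for_customer(customer_id, region):
    """Biased split: region first (data stays local), then modulus within it."""
    pods = PODS_BY_REGION[region]
    return pods[customer_id % len(pods)]


print(shard_for_customer(1234))
print(pod_for_customer(1234, "EU"))
print(pod_for_customer(7, "US"))
```

Every shard or pod runs the same software; only the slice of customers it holds differs, which is why this axis adds transactional scale but not organizational scale.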


Summary

Like Goldilocks and the three bears, the goal of decomposition is not to have services that are too small, or services that are too large but to ensure that the system is “just right” along the dimensions of scale, cost, availability, time to market and response times.


AKF Partners has helped hundreds of companies, big and small, employ the AKF Scale Cube to scale their technology product solutions.  We developed the cube in 2007 to help clients scale their products and have been using it since to help some of the greatest online brands of all time thrive and succeed.  For those interested in “time travel”, here are the original 2 posts on the cube from 2007:  Application Cube, Database Cube


Achieving Results, Culture, and the AKF Equation

April 24, 2018  |  Posted By: Marty Abbott

We can tell a lot about a company within the first hour or so of any discussion.  Consider the following statement fragments:

“We have a lot of smart people here…”

Or

“We have some of the hardest working engineers…”

Contrast these with the following statement fragments:

“We measure ourselves against very specific business KPIs…”

Or

“We win or lose daily based on how effectively we meet our customer expectations…”

There is a meaningful difference in the impact these two groups of statements have on a company’s culture and how that culture enables or stands in the way of success.  The first group of statements are associated with independent variables (inputs) that will rarely in isolation result in desired outcomes (dependent variables).  When used in daily discourse, they reinforce the notion that something other than outcomes are the things upon which a company prides itself.  Our experience is that these statements create an environment of hubris that often runs perpendicular to, and at best in no way reinforces, the achievement of results.  Put another way, when we hear statements like this, we expect to find many operational problems.

The second group of statements are focused on meaningful and measurable outcomes.  The companies with which we’ve worked that frequently communicate with these statements are among the most successful we’ve seen.  Even when these companies struggle, their effort and focus is solidly behind the things that matter – those things that create value for the broadest swath of stakeholders possible.

The point here is that how we focus communication inside our companies has an important impact on the outcomes we achieve.


Outcomes First

Success is often a result of several independent variables aligning to achieve a desired outcome.  These may include, as Jim Collins points out, being in the right place at the right time – sometimes called “luck”.  Further, there is rarely a single guaranteed path to success; multiple paths may result in varying levels of the desired outcome.  Great companies and great leaders realize this, and rather than focusing a culture on independent variables they focus teams on outcomes.  By focusing on outcomes, leaders are free to attempt multiple approaches and to tweak a variety of independent variables to find the most expedient path to success.  We created the AKF Equation (we sometimes refer to it as the AKF Law) to help focus our clients on outcomes first:

Two very important corollaries follow from this equation or “law”:

And…


Examples of Why Results, and Not Paths Matter

Intelligence Does Not Equal Success

As an example of why the dependent variable of results, and not an independent variable like intelligence, is most important, consider Duckworth and Seligman’s research.  Duckworth, Seligman and associates conducted a review of GPA performance in adolescents.  They expected to find that intelligence was the best indication of GPA.  Instead, they found that self-discipline was a better indication of the best GPAs:

Lewis Terman, a mid-20th century Stanford Psychology professor, hypothesized that IQ was highly correlated with success in his famous termite study of 1500 students with an average IQ of 151.  Follow-on analysis and study indicated that while these students were successful, they were half as successful as a group of other students with a lower IQ.

Chris Langan, the world’s self-proclaimed “most intelligent man” with an IQ of 195 can’t seem to keep a job according to Malcolm Gladwell.  He’s been a cowboy, a stripper, a day laborer and has competed on various game shows.

While we’d all like to have folks of average or better intelligence on our team, the above clearly indicates that it’s more important to focus on outcomes than an independent variable like intelligence.

Drive Does Not Equal Success
While most successful companies in Silicon Valley started with employees who worked around the clock, the Valley is also littered with the corpses of companies that worked their employees to the bone. Just ask former employees of the failed social networking company Friendster.  Hard work alone does not guarantee success.  In fact, most “overnight” success appears to take about 10,000 hours of practice just to be good, and 10 years of serious work to be truly successful (according to both Ramit Sethi and Malcolm Gladwell).

Hard work, if applied in the right direction and to the right activities should of course help achieve results and success.  But again, it’s the outcome (results and success) that matter.


Wisdom Does Not Equal Success
Touting the age and experience of your management team?  Think again.  There’s plenty of evidence that when it comes to innovative new approaches, we start to “lose our touch” at the age of 40.  The largest market cap technology companies of our age were founded by “youngsters” – Bezos being the oldest of them at the age of 30.  Einstein posited all of his most significant theories well before the age of 40 – most of them in the “miracle year” at the age of 26 in 1905.  The elder of the two Wright Brothers was 39.

While there are no doubt examples of successful innovation coming after the age of 40, and while some of the best managers and leaders we know are over the age of 40, wisdom alone is not a guarantee for success.


Only Success = Success

Most of the truly successful and fastest growing companies we know focus on a handful of dependent variables that clearly identify results, progress and ultimately success.  Their daily manners and daily discourse are carefully formulated around evaluation of these success criteria.  Even these companies’ interactions with outside firms focus on data driven indications of success time and time again – not on independent variables such as intelligence, work ethic, wisdom (or managerial experience), etc.

These companies identify key performance indicators for everything that they do, believing that anything worth doing should have a measurable value creation performance indicator associated with it.  They maniacally focus on the trends of these performance indicators, identifying significant deviations (positive or negative) in order to learn and repeat the positive causes and avoid those that result in negative trends.  Very often these companies employ agile planning and focus processes similar to the OKR process.

The most successful companies rarely engage in discussions around spurious relationships such as intelligence to business success, management experience to business success, or effort to business success.  They recognize that while some of these things are likely valuable in the right proportions, they are rarely (read never) the primary cause of success. 


AKF Partners helps companies develop highly available, scalable, cost effective and fast time-to-market products.  We know that to be successful in these endeavors, companies must have the right people, in the right roles, with the right behaviors - all supported by the right culture.  Contact us to discuss how we can help with your product needs. 


Evaluating Technology Risk Using Mathematics

April 23, 2018  |  Posted By: Geoffrey Weber

The Relative Risk Equation

Technologists are frequently asked: what are the chances that a given software release is going to work?  Do we understand the risk that each component or new feature brings to the entire release?  

In this case, measuring risk is assessing the probability that a component will perform poorly or even fail. The higher the probability of failure, the higher the risk.  Probabilities are just numbers, so ideally, we should be able to calculate the risk (probability of failure) of the entire release by aggregating the individual risk of each component.  In reality this calculation works quite well.

Putting a number to risk is a very useful tool and this article will provide a simple and easy way to calculate risk and produce a numeric result which can be used to compare risk across a spectrum of technology changes.

When we assess a system, one of the key characteristics we want to benchmark is the probability that a system will fail.  In particular, if we want to understand whether or not a system can support an availability goal of 99.95% we have to do some analysis to see if the probability that a failure occurs is lower than 0.05%. How do we calculate this?

First let’s introduce some vocabulary. 

DEFINITION: Pi is the probability that a given system will experience an incident, i. 

For the purposes of this article we are measuring relative and not absolute values.  A system where Pi=1 is very unlikely to experience failure.  On the other hand, Pi values approaching 10 indicate a system with a 100% probability of failure.  

DEFINITION: Ii is the impact (or blast radius) a system failure will have.  

Ii=1 indicates no impact, whereas Ii=10 indicates a complete failure of an entire system.

DEFINITION:  Pd is the probability that an incident will be detected.  

Pd=1 means an incident will be completely undetected and Pd=10 indicates that a failure will be completely detected 100% of the time.

Measuring across a scale from 1 to 10 is often too granular;  we can reduce the scale to tee-shirt sizes and replace 1, 2, …, 10 with Small (3), Medium (5), Large (7). Any series of values will work so long as we are consistent in our approach.  

Relative Risk is now only a question of math: 

Ri = (Pi x Ii) / Pd

With values of 1, 2, ..., 10, the minimum Relative Risk value is 0.1 (effectively 0 relative risk) and the maximum value is 100.  With tee-shirt sizes, the minimum Relative Risk value is (3 x 3) / 7 = 1 2/7 and the maximum value is (7 x 7) / 3 = 16 1/3.   Basic statistics can help us to standardize these values onto a 0 to 10 scale:

std(Ri)  = (Ri - Min(Ri)) / (Max(Ri)  - Min(Ri)) x 10 
where Max(Ri) = 16 1/3 and Min(Ri) = 1 2/7 (in the case of tee-shirt sizing)

For example:

Example 1: Adding a new data file to a relational database

  • Pi = 3 (low). It’s unlikely that adding a data file will cause a system failure, unless we’re already out of space.
  • Ii = 5 (medium). A failure to add a datafile indicates a larger storage issue may exist which would be very impactful for this database instance.  However as there is a backup (for this example), the risk is lowered.
  • Pd = 7 (high).  It is virtually certain that any failure would be noticed immediately.
  • Therefore Ri = (3 X 5) / 7 = 2.1. Standardizing this with the formula above produces a value of about 0.57.  That is a very low number, so adding datafiles is a relatively low-risk procedure.


Example 2: Database backups have been stored on tapes that have been demagnetized during transportation to offsite storage.

  • Pi = 5 (medium). While restoring a backup is a relatively safe event, on a production system it is likely happening during a time of maximum stress.
  • Ii = 7 (high). When we attempt to restore the backup tape, it will fail.
  • Pd = 3 (low).  The demagnetization of the tapes was a silent and undetected failure.
  • Therefore Ri = (5 X 7) / 3 or 11.7.  We arrive at a value of about 7 on the standard scale, which is quite high, so we should consider randomly testing tapes from off-site storage.
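The formula and the standardization step can be checked with a short sketch. The function names, the `TEE_SHIRT_SIZES` constant and the variable names are assumptions made for illustration; the arithmetic follows the Ri and std(Ri) definitions above, with the minimum and maximum taken from tee-shirt sizing.

```python
TEE_SHIRT_SIZES = (3, 5, 7)  # Small, Medium, Large


def relative_risk(p_incident, impact, p_detect):
    """Ri = (Pi x Ii) / Pd."""
    if p_detect <= 0:
        raise ValueError("p_detect must be positive")
    return (p_incident * impact) / p_detect


def standardize(risk, sizes=TEE_SHIRT_SIZES):
    """Rescale a relative risk value so the smallest possible Ri maps to 0 and the largest to 10."""
    smallest, largest = min(sizes), max(sizes)
    min_risk = relative_risk(smallest, smallest, largest)  # least likely, least impact, best detection
    max_risk = relative_risk(largest, largest, smallest)   # most likely, most impact, worst detection
    return (risk - min_risk) / (max_risk - min_risk) * 10


# Inputs taken from the two examples above.
data_file = relative_risk(3, 5, 7)
bad_tapes = relative_risk(5, 7, 3)

print(round(data_file, 1), round(standardize(data_file), 2))
print(round(bad_tapes, 1), round(standardize(bad_tapes), 1))
```

Swapping `TEE_SHIRT_SIZES` for the full 1 to 10 range changes the minimum and maximum automatically, which keeps the standardized values comparable as long as one scale is used consistently.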


This formula has utility across a vast spectrum of technology:

  • Calculate a relative risk value for each feature in a software release, then take the total value of all features in order to compare risk of a release against other releases and consider more detailed testing for higher relative risk values.
  • During security risk analysis, calculate a relative risk value for each threat vector and sort the resulting values.  The result is a prioritized list of steps required to improve security based on the probabilistic likelihood that a threat vector will cause real damage.
  • During feature planning and prioritization exercises, this formula can be altered to calculate feature risk.  For example, Pi can mean confidence in estimate, Ii can be converted to impact of feature (e.g. higher revenue = higher impact) and Pd is perceived risk of the feature.  Putting all features through this calculation then sorting from high values to low values yields a list of features ranked by value and risk.
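As a sketch of the first use case, the snippet below scores each feature in a release, totals the scores, and sorts the features so the riskiest gets the most testing attention. The feature names and their Pi, Ii and Pd values are invented for illustration and do not come from a real release.

```python
def relative_risk(p_incident, impact, p_detect):
    """Ri = (Pi x Ii) / Pd."""
    return (p_incident * impact) / p_detect


# Hypothetical release: feature name -> (Pi, Ii, Pd) in tee-shirt sizes.
release = {
    "new checkout flow": (5, 7, 3),
    "copy change": (3, 3, 7),
    "search index rebuild": (7, 5, 5),
}

scores = {name: relative_risk(*values) for name, values in release.items()}

# Highest relative risk first: these features deserve more detailed testing.
ranked = sorted(scores, key=scores.get, reverse=True)

# A single total lets this release be compared against other releases.
release_total = sum(scores.values())

for name in ranked:
    print(f"{name}: {scores[name]:.2f}")
print(f"release total: {release_total:.2f}")
```

The same loop works for threat vectors in a security review: replace the feature names with threats and sort the same way.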

The purpose of this formula and similar methods is not to produce a mathematically absolute estimate of risk.  The real value here is to remove guessing and emotion from the process of evaluating risk and to provide a framework for comparing risk across a variety of changes.  

Click here to see how AKF Partners can help you manage risk and other technology issues.


Avoiding the Policy Black Hole

April 18, 2018  |  Posted By: Pete Ferguson

During due diligence and in-depth engagements, we often hear feedback from client employees that policies either do not exist - or are not followed.

All too often we see policies that are poorly written, difficult for employees to understand or find, and lacking clear alignment with the desired outcomes.  Policies are only one part of a successful program - without sound practices, policies alone will not ensure successful outcomes.

Do You Have a Policy …?

Early in my career I was volunteered to be responsible for PCI compliance shortly after eBay purchased PayPal.  I’d heard folklore of auditors at other companies coming in and turning things over, with the resulting aftermath leading to people being publicly humiliated or losing their jobs.  I suddenly felt on the firing line and asked, “why me?”

I booked a quick flight to Phoenix to be in town before the auditor arrived and I prepared by walking through our data center and reviewing our written policies.  When I met with the auditor, he looked to be in his early 20s and handed me a business card from a large accounting firm.  I asked him about his background; he was fresh out of college and we were one of his first due diligence assignments.  He pulled out his laptop and opened an Excel spreadsheet and began reading off the list:

  • Do you have cameras?  “Yes,” I replied and pointed to the ceiling in the lobby littered with little black domes.
  • Do you record the cameras?  “Yes,” and I took him into the control room and showed him that we had 90 days of recording.
  • Do you have a security policy?  “Yes,” and I showed him a Word Document starting with “1.1.1 Purpose of This Policy ....”

Several additional questions, and 10 minutes later, we were done.  He and I had both flown some distance, so I gave him a tour of the data center and filled him full of facts about square footage and miles of cable and pipes until his eyes glazed over and his feet were tired from walking, and off he went.

I was relieved, but let down!  I felt we had a really good program and wanted to see how we measured up under scrutiny.  Subsequent years brought more sophisticated reviews - and reviewers - but there was one question I always waited to be asked and never was:

“Is your policy easily accessible, how do employees know about it, and how do you measure their comprehension and compliance?”

My first compliance exercise didn’t seem all that scary after all.  It was only a due diligence “check the box” exercise and didn’t dive deeper into how effective our program was and where it needed to be reinforced.

While having a policy for compliance requirements is important, on its own, policy does not guarantee positive outcomes.  Policy must be aligned with day-to-day operations and make sense to employees and customers.


The Traditional Boredom of Policy

Typically policy is written from the auditor’s point of view to ensure compliance with government and industry requirements for public health, anti-corruption, and customer data security standards.


Image Credit: Imgur.com

Unfortunately, this leads to a very poor user experience of wading through 1.1.1 … 1.1.2 … and so on.  It is certainly a far cry from how a good novel or any online news story reads.

I’ve heard companies - both large and small - give great assurances that they have policies and they have shown me the 12pt Times New Roman documents that start with “1.1.1 Purpose of This Policy …” as evidence.

I had to argue the point at a former position that the first way to lose interest with any audience is to start with 1.1.1 … and with Times New Roman font in a Microsoft Word document that was not easy to find.  It was a difficult argument and I was instructed to stick with the approved, and traditional, industry-accepted method.

Fast forward a decade and our HR Legal team was reviewing policy and invited me to a meeting with the internal communications team.  Before we started talking documents, the Director of Communications asked me if I’d seen the latest safety video for Virgin Atlantic Airlines.  I thought it a strange question, but after she told me how surprised and inspired by it she was, I took a look.

VA thankfully took a required dull and mundane US Federal Aviation Administration ritual and instead saw it as a differentiator of their brand from the pack of other airlines.  Whoever thought a safety demonstration could also be a 4-minute video on why an airline is different and fun?!?  Up until that point, no one!  Certainly not on any flight I had previously flown.

Thankfully, since then, Delta and others have followed their example and turned something I and millions of airline crews and passengers had previously dreaded - safety policy and procedure - into a more fun, engaging, and entertaining experience.

While policy needs to comply with regulations and other requirements, for policies to move from the page to practice they need to be presented in a way that lets employees clearly understand what is expected - so in writing policy, put the desired outcome first!  The regulatory document for auditors can be incorporated at the end of each policy, or consider a separate document that calls out only the required sections of your employee handbook or wherever your company policies are presented and stored.

Clarifying the Purpose of Your Policy

In her article “Why Policies Don’t Work,” HR Lawyer Heather Bussing boils down the core issue: “There are two main reasons to have employment policies: to educate and to manage risk.  The trouble is that policies don’t do either.”

She further expounds on the problem in her experience:

“ … policies get handed out at a time when no one pays attention to them (first week of employment if not the first day), they are written by people who don’t know how the company really works (usually outside legal counsel), and they have very little to do with what happens.  So much for education.”

As for managing risk, Bussing points out that policies are often at odds with each other, or so broad that they can’t be effectively enforced.

“Unless it is required to be on a poster, or unless you can apply it in every instance without variance, you don’t want policies.  Your at-will policy covers it.  And if you don’t follow your policies to the letter, you will look like a liar in a courtroom.”

Don’t let your online policy repository feel like a suppository - focus on what you want to accomplish! 

Small and fast-growing companies typically have little need for formalized policies because people trust each other and can work things out.  But as they grow it has been my experience that often the trust and holding people accountable - which sets fast growing companies apart as a cool place to work - get replaced with bureaucratic rituals cemented in place as more and more executives migrate from larger, bureaucratic behemoths.  If the way policy is presented is the litmus test for the true company culture, a lot of companies are in trouble!

Policy must be closely aligned to the shared outcomes of the company and interwoven into company culture.  Otherwise policies are a bureaucratic distraction and will only be adopted or sustained with a lot of uphill effort.  In short, if people do not understand how a policy helps them do their job more easily, they are going to fight it.


Adapting Policy To Your Audience

In the early days of eBay, the culture was very much about collectables, and walking through the workspace you would see many employees displaying their collections of trading cards, Legos, and comic books.  When it came time to publish our security policies, we hired Foxnoggin - a professional marketing strategy company - which took the time to understand our culture and then organized a comprehensive campaign that included contests, print and online material, and other collateral.

They helped formulate an awareness campaign to educate employees and measure the effectiveness of policy through surveys and monitoring employee behavior.

To break away from the usual email method of communication, we got and held employee attention with a series of comic books which included superheroes and supervillains in a variety of scenarios highlighting our policies.

An unintended consequence from our collector employees was that they didn’t want to open their comic books and instead kept them sealed in plastic.  To combat this, we provided extra copies (not sealed in plastic) in break rooms and other common areas and future editions were provided without the bags.  The messages were reinforced with large movie-style posters displayed throughout the work area. 

This approach was wildly popular among employees located at the customer support and developer sites, and surveys showed that security was becoming a top-of-mind topic for employees.  Unfortunately, this approach was not as popular with Europeans - who felt we were talking down to them - or with the executives coming from more stodgy and formal companies like Bain & Company or GE, and it was particularly unpopular with execs from the financial industry after the purchase of PayPal.

Intertwining policy into the culture of your organization makes compliance natural and part of daily operations.

Make Sure Your Message Matches Your Audience

President and CEO of Lead From Within Lolly Daskal writes on Inc.com:

“... sometimes the dumbest rules can drive away the best employees … too many workplaces create rule-driven cultures that may keep management feeling like things are under control, but they squelch creativity and reinforce the ordinary.”

Be creative and look at the company culture and how to interweave policies.  Policies need to be part of the story you tell your employees to reinforce why they should want to work for you.

Nathan Christensen writes in his Fast Company article: How to Create An Employee Handbook People Will Actually Want to Read, “let’s face it, most handbooks aren’t exactly page-turners. They’re documents designed to play defense or, worse yet, a catalog of past workplace problems.”

Christensen recommends “presenting” policies in a readable and attractive manner.  It must be an opportunity to excite people in meeting a greater group purpose and cause.

Your policies need to match your company culture and be written in language your employees use and understand, and the ask for compliance needs to be easy enough for a new employee to explain to anyone.

Writing Content Your Audience Will Actually Read and Understand

According to the Center for Plain Language - which has the goal to help organizations “write so clearly that their intended audience understands what they are saying the first time they read or hear it” - there are five steps to plain language:

  1. Identify and describe the target audience: “The audience definition works when you know who you are and are not designing for, what they want to do, and what they know and need to learn.”
  2. Structure the content to guide the reader through it: “The structure works when readers can quickly and confidently find the information they are looking for.”
  3. Write the content in plain language: “Use a conversational, rather than legal or bureaucratic tone … pick strong verbs in the active voice and use words the audience knows.”
  4. Use information design to help readers see and understand: Font choice, line spacing, and use of graphics help break up long sections of text and increase the readability score.
  5. Work with target user groups to test design and content: Ask readers to describe the content and have them show you where they would find relevant content.

As an illustration, here is a before and after comparison of the AARP Financial policy on giving and receiving gifts:

In reading the “before” example, my eyes immediately glazed over and my mind began to wander until the mention of “courtesies of a de minimus ...”  Did the guy who wrote that go home that night to his family and instruct his kids, “you will need to consume a courtesy of a de minimus amount of broccoli if you want videogame time after dinner”?  I sure hope not!

On the “after” example, notice the change in line spacing, switching of font and use of bullet points.  Overall the presentation is a lot more conversational and less formal.  It also has a call to action in the title starting with two verbs “give and accept …”

I’d add as the 6th step to remember K.I.S.S. - Keep It Simple Stupid!  You get a few seconds to grab your audience’s attention and only a few more minutes to keep it. 

As a content editor, I was feeling proud of myself when I distilled 146 pages of confusing policies, procedures and “how to” down to 14 pages over the course of several weeks.  But when I mentioned this to my wife, she said “you are going to make them read 14 pages?!?” 

So I looked at it a few days later with fresh eyes and realized I could condense it down again to two pages.  I made it more of a table of contents, with a brief description of each bullet point and links after each section for employees who wanted to learn more.  I was even able to retain a font size of 14 and plenty of white space.

In reading the two pages, people would understand what was expected of them and could easily learn more - but only if they were interested.

Write policy in language a new employee will quickly understand and be thoughtful in how much you present to employees on their first day, week, and month.


Document Readability is How You Show Your People Love - And Soon To Be the Law In the EU

Speaking more in terms of content marketing, VisibleThread author “Fergle” quotes Neil Patel, columnist for Forbes, Inc, as stating “content that people love and content that people can read is almost the same thing.”  Yet, as Fergle points out, “a lot of content being created is not the stuff people love. Or read.”

Aiming to write content that is easy to read, and that people therefore love, may seem a bit altruistic.  But for information regarding data privacy, it is also soon to be the law - at least in the EU and for any international policy which would reach an EU resident.  On May 25th of 2018 the General Data Protection Regulation (GDPR) goes into effect.  From the GDPR:

“The principle of transparency requires that any information and communication relating to the processing of those personal data be easily accessible and easy to understand, and that clear and plain language be used.”

There are a number of ways to measure readability ease and grade level of your content, and a good communications expert will be able to help you identify the proper tools.
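
As one illustration of what such a tool computes, here is a minimal Python sketch of two widely published measures, the Flesch Reading Ease score and the Flesch-Kincaid grade level.  The function names are hypothetical, the two sample sentences are invented for this example (they are not taken from any real policy), and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup, so treat the numbers as approximate:

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting groups of consecutive vowels (assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" (as in "care") as silent, except for "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return approximate Flesch Reading Ease and Flesch-Kincaid grade for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return {
        # Higher reading ease means easier text.
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Grade level approximates US school grade; lower means easier text.
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
    }


# Invented sample sentences for comparison only.
plain = "Give and accept gifts with care. Ask your manager if you are not sure."
formal = (
    "Employees shall refrain from accepting gratuities exceeding nominal "
    "valuation notwithstanding customary institutional courtesies."
)

plain_scores = readability(plain)
formal_scores = readability(formal)
print("plain :", plain_scores)
print("formal:", formal_scores)
```

Running the sketch on a “before” and “after” version of the same policy gives a quick, repeatable comparison: the plainer version should score a higher reading ease and a lower grade level.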

Scores are a good benchmark, but don’t forget the most important resource for feedback - your potential audience! 

Buy them lunch, have them come and review your plan and provide their feedback.  Bring them back in later when you have content to review and provide an environment where they can be brutally honest - again a communications expert outside of your department will help provide a bit of a buffer and allow your audience to be open, honest, and direct.

But don’t just write policy to comply with due diligence or for policy’s sake - be sure it is part of the company culture, easy to search, and placed where and when your employees or customers will need it.  When there are shared outcomes between compliance and how employees operate, policy is integrated and effective. 

Timing is Important

Think of ways to break down your policy content not just by audience, but by timing and when the information will actually be relevant.

In retail, the term “point of sale” refers to the checkout process - when taxes, final cost and payment are all settled.  The placement of “last minute items” at the POS is very carefully and competitively assigned only to items with a high ROI, measured against the number of inches each item takes up on the limited shelf space.  This careful placement has also been adapted to the online marketplace: when you add an item to your shopping cart, a prompt appears suggesting additional items others have purchased with your item.

This same methodology in thinking should be applied to where - and when - you introduce your policies to your audience.

We made the mistake for years of pushing our travel safety program and policies for everyone during new hire orientation when only about half of the population traveled and most of them wouldn’t be traveling for several weeks or months.  It made a lot more sense to move the travel policies to the travel booking page.

If you only give out corporate credit cards to Directors and above, there is no sense pushing policies on spend limits to the global population.  It makes a lot more sense to push the policy when someone is applying for the card and as a reminder each time their credit card expires and they are being issued a new one.

Your audience will appreciate only being told what they need to know when they need the information and will be more likely to not only retain the information, but to comply!


Know How You Will Measure Successful Outcomes

Perhaps the most important question to ask when designing policy is “how will we know we are successful?”

Having good policy written in a clear and concise manner and stored in an easy to find location is still a very passive approach.  Good policy should evolve as your company evolves and should be flexible and realistic to business, customer, and employee needs.  It must be modeled by company leadership and hold true to the daily actions of your company.

Tests at the end of annual compliance training are only a “check the box” measure of compliance.  Think back to how much you actually learned - or, better yet, retained - the last time you were subjected to hours of compliance training!

If metrics cannot show that your policies are known and followed, then you need to re-evaluate the purpose of your policies and whether they are contributing to the benefit of your employees and customers or just ticking compliance boxes.

While compliance is important, compliance alone does not make for better business practices or a competitive edge.  Effective, measurable compliance protects your employees and provides value to your customers.

Getting Started

Subject-matter experts are often too close to the policies to be objective.  A little tough love is needed and it is best to bring in experts in marketing and communications who will not be biased to the content, but biased to the reader who is the intended audience. 

A good communications plan will cover the following:

  • Be clear on the desired behavior the policy is to encourage and enforce - and make sure that behavior is aligned with the overall company purpose
  • Identify the target audiences and each of their self-interests
  • Outline which channels each audience is receptive to (email/print/video, etc.)
  • Identify the inside jargon and language styles needed
  • Decide when and where each audience will want to find relevant information
  • Plan how often policies will be reviewed - and include as many stakeholders as possible in the review process
  • Decide how implementation of policies and compliance with the policies will be measured

Only AFTER the communications plan is agreed upon - with plenty of input from representatives of the target audiences - should the content review begin.  Otherwise the temptation from subject-matter experts will be to tell people everything they know.


Pulling it All Together

Poorly written policies that are difficult for employees to search or find do little to meet the mission of policy: to provide a consistent approach to how your company does business and satisfies regulatory compliance.  Policies on their own do not make for good operations or guarantee overall success.  Remember the true test of policies is not whether they exist, but if they are tightly aligned and incorporated into daily operations, how they contribute to the success of your employees and customers, and if their effectiveness can be measured in a tangible way.

—-

Experiencing growing pains?  AKF is here to help!  We are an industry expert in technology scalability and due diligence.  Put our 200+ years of combined experience to work for you today!
