July 18, 2019 | Posted By: Pete Ferguson
The Tail (pun intended) of Dorothy-Boy the Goldfish
When my now-adult son was 5, he was constantly enamored with the vast variety of fish of many sizes and colors in the pet aisle of the local superstore, and he eventually convinced us to buy a goldfish.
We paid under $20, bowl, food, rocks, props and all, and “Dorothy-boy” came home with us (my son’s idea for a name, not mine). Of course, there were several mornings that Dorothy-boy was found upside down and a quick trip to the store and good scrubbing of the bowl remedied potential heartbreak before my son even knew anything was wrong.
Contrast that to my grandmother’s beloved Yorkie Terrier, Sergeant. When Sarge got sick, my grandmother spent thousands on doctor’s office visits, specialized food, and several surgeries. The upfront cost of a well-bred dog was significant enough; the annual upkeep for poor little Sarge was astronomical, but he lived a good, spoiled, and well-loved life.
That is why at AKF we often use the analogy of “goldfish, not thoroughbreds” with our clients to help them make decisions on hardware and software solutions.
If a “pizzabox” 1U Dell or HP (or pick your brand) server dies, no biggie – you probably have a few others lying around or can purchase and spin up new ones in days, not months or quarters of a year. Commodity hardware also allows for quickly adding additional web servers, application servers, test servers, etc. The cost per compute cycle is very low and can be scaled very quickly and affordably.
“Cattle not pets” is another way to think about hardware and software selection. When it comes to your next meal (assuming you are not a vegetarian), what is easier to eat with little thought? A nameless cow or your favorite pet?
If your vendor is sending you on annual vacations (err, I mean business conferences) and providing your entire team with tons of swag, you are likely paying way too much in upfront costs and ongoing maintenance fees, licensing, and service agreements. Sorry, your sales rep doesn’t care that much about you; they like their commissions based on high markups better.
Having an emotional attachment to your vendors is dangerous as it removes objectivity in evaluating what is best for your company’s customers and future.
Untapped Capacity at a Great Cost
It is not uncommon for monolithic databases and mainframes to be overbuilt given their upfront cost, resulting in utilization of only 10-20% of capacity. This means there is a lot of untapped potential that is being paid for – but not utilized – year-over-year.
Trying to replace large, proprietary systems is very difficult due to the lump sum of capital investment required. It is placed on the CapEx budget SWAG year after year and struck early in the budgeting process as a CFO either dictates or asks, “can you live without the upgrade for one more year?”
We have one client that finally got budget approval for a major upgrade to a large system, and in addition to the substantial costs for hardware and software licensing, they also have over 100 third-party consultants on-site for 18 months sitting in their cubicles (while they are short on space for their own employees) to help with the transition. The direct and indirect costs are massive, including the innovation that is not happening while resources are focused on an incremental upgrade to keep up, not get ahead.
The bloat is amazing, and it is easy to see why startups and smaller companies build in the cloud and use open source databases – and in the process erode market share from the industry behemoths with a fraction of the investment.
The goal of commodity systems and solutions is to get as much value for as minimal of an investment as possible. This allows us to build highly available and scalable solutions.
Focus on getting the maximum performance for the least amount of cost.
We often see an interesting dichotomy of architectural principles within aging companies – teams report there is “no money” for new servers to provide customers with a more stable platform, but hundreds of thousands of dollars sunk into massive databases and mainframes.
Vendor lock and budget lock are two reasons why going with highly customized and proprietary systems shackles a company’s growth.
Forget the initial costs for specialized systems – which are substantial – usually the ongoing costs for licensing, service agreements, software upgrade support, etc. required to keep a vendor happy would likely be more than enough to provide a moderately-sized company with plenty of financial headroom to build out many new redundant, and highly available, commodity servers and networks.
Properly implementing along all three axes of the AKF Scale Cube requires a lot of hardware and software - not easily accomplished if providing a DR instance of your database also means giving your first-born and second-born children to Oracle.
Does this principle apply to cloud?
With the majority of startups never racking a single server themselves, and many larger companies migrating to AWS/Azure/Google, etc. – you might think this principle does not apply in the new digital age.
But where there is a will (or rather, profit), there is a way … and as the race for who can catch up to Amazon for hosting market share continues, vendor-specific tools that drive up costs are just as much of a concern as proprietary hardware is in the self-hosting world.
Often our venture capitalist and investor clients ask us about their startups’ hosting fees and whether they should be concerned with the cost outpacing financial growth, or if it is usual to see costs rise so quickly. Amazon and others have a lot to gain from providing discounted or free trials of proprietary monitoring, database, and other enhancements in hopes of ensuring better vendor lock-in – and, fair enough, a service that you can’t get with the competition.
During due diligence and architectural reviews of our clients, we are just as concerned with vendor lock-in in the cloud as we are with vendor lock-in for self-hosted solutions.
- Commodity hardware allows companies faster time to market, scalability, and availability
- The ROI on larger systems can rarely compete as the costs are such a large barrier to entry and often compute cycles are underutilized
- The same principles apply to hosted solutions – beware of vendor specific tools that make moving your platform elsewhere more difficult over time
We have held hundreds of 2- and 3-day on-site and remote architectural reviews for companies of all sizes, in addition to thousands of due diligence reviews for investors. Contact us to see how we can help!
July 15, 2019 | Posted By: Larry Steinberg
There has never been a better time to be a hacker or developer of malware, as nearly every company has moved or is moving core functionality online, making these assets an open target for bad actors. Companies that are moving to deliver services from the cloud vs the traditional on-premises model have now taken on the liability for operating their product and the associated security (or lack thereof). When acquiring or investing in a company, there will be significant damage to value or reputation (or both) if a vulnerability is released into production and impacts the critical customer base.
Product security goes well beyond the network and system perimeter of old. Application functionality drives the need for making the product accessible to many different types of consumers. The traditional approach of locking down the perimeter and performing a late stage penetration test prior to release has many pitfalls:
- Performing security testing at the end of the development cycle makes planning for a release date nearly impossible, or at the minimum, non-deterministic prior to security test results becoming available.
- Patterns in coding tend to repeat themselves. If you poorly code a database interaction (leaving a SQL injection, for example), that flaw will probably get replicated tens or hundreds of times (see the sketch after this list). This leads to far more fixes in the later stages of delivery versus setting the best practice at the beginning.
- Context switching is more costly than you might imagine. When 100k+ lines of code have been written, going back to fix code from the beginning or middle of the cycle requires considerable effort in ramping back up appropriate knowledge.
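To make the point about repeated patterns concrete, here is a minimal sketch (Python with sqlite3, chosen purely for illustration) of a database interaction coded both ways: the string-concatenation style that multiplies a SQL injection flaw every time it is copied, and the parameterized form that should be the standard from the first sprint.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Anti-pattern: concatenating input lets a value like "x' OR '1'='1"
    # rewrite the query (SQL injection), and every copy/paste repeats the flaw.
    query = "SELECT id, email FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized query: the driver treats the input strictly as data.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```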
The overall goals for enhancing product security posture are multi-faceted:
- Move security controls ‘left’ in the SDLC process (closer to the beginning).
- Augment existing process with secure coding best practices.
- Product security as a continuous concern – security must be as agile as your product and team.
- Implement tools and controls via automation to achieve efficiency and compliance.
- Enable planning for security remediations.
- Reduce the expense of building secure software. Many studies have shown the expense of finding a bug late in the SDLC process will be 6-15x more than addressing the bug early in the design and coding process. Security vulnerabilities model the same expense curve.
Here are the phases of SDLC and suggestions for implementation:
During the component and system design phase, architecture reviews should focus on security threats in addition to the standard technical oversight. Identifying all threats to the system and its design is critical, along with creating mitigations before the implementation phase begins, or at least early in the implementation phase.
There are multiple options for identifying security issues while writing code and performing check-ins. Static code analysis can be triggered on check-in to assess code for identifying critical issues from the OWASP top 10 and other similar known bad patterns. Some IDE’s also have the ability to assess code prior to check-in or code review. In both cases, the developer is notified nearly immediately when they’ve produced a potential vulnerability.
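As one illustration of what “on check-in” can look like, below is a hypothetical git pre-commit hook written in Python. It uses the open source Bandit scanner as a stand-in for whatever static analysis tool your team standardizes on; the hook itself, the severity flag, and the workflow are assumptions made for this sketch, not a product recommendation.

```python
#!/usr/bin/env python3
"""Hypothetical pre-commit hook: run static security analysis on staged Python files."""
import subprocess
import sys


def staged_python_files():
    # Ask git for files staged in this commit (added, copied, or modified).
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [f for f in out.stdout.splitlines() if f.endswith(".py")]


def main():
    files = staged_python_files()
    if not files:
        return 0
    # Bandit exits non-zero when it reports findings at or above the chosen
    # severity (-ll = medium and up), which blocks the commit.
    result = subprocess.run(["bandit", "-ll", *files])
    if result.returncode != 0:
        print("Commit blocked: address the flagged security findings first.")
    return result.returncode


if __name__ == "__main__":
    sys.exit(main())
```

Installed as .git/hooks/pre-commit (or wired in through a hook manager), the developer hears about a potential vulnerability seconds after writing it rather than weeks later from a penetration test.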
As the code is built and linked with dependencies, you want to scan for free and open source software. Utilizing open source is great, but you need to assess the risk to your IP as well as the risk of malware. You should always know the origins of your code base and make intentional decisions about inclusion. If you have a containerized environment, then the build phase is where you would implement some form of image scanning.
The integration test phase (where all of your components come together for verification) provides a great place for gray box security testing. These security tests would avoid perimeter-style tools like WAFs and firewalls, allowing the tests to focus on runtime features. Common exploits can be validated, like cross-site scripting, SQL injection, XML injection, etc.
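As a sketch of what such a gray box test can look like, the example below uses pytest-style tests with the requests library against a hypothetical /search endpoint; the base URL, endpoint, and assertions are placeholders to adapt to your own integration environment and framework.

```python
"""Illustrative gray box security checks run during integration testing."""
import requests

BASE_URL = "http://integration.example.internal"  # hypothetical test environment


def test_search_rejects_sql_injection_probe():
    # A classic tautology payload should neither blow up the service
    # nor leak raw database errors back to the caller.
    payload = "widget' OR '1'='1"
    resp = requests.get(f"{BASE_URL}/search", params={"q": payload}, timeout=10)
    assert resp.status_code in (200, 400)
    assert "sql syntax" not in resp.text.lower()


def test_search_escapes_reflected_xss_probe():
    # A script tag submitted as input should never be reflected back unescaped.
    payload = "<script>alert(1)</script>"
    resp = requests.get(f"{BASE_URL}/search", params={"q": payload}, timeout=10)
    assert payload not in resp.text
```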
Traditionally this is where 3rd party penetration tests would occur, in pre-production or production environments. Containerization provides another opportunity to oversee the runtime environment to ensure malware has not made its way into the system or is not replicating across it. When running containers, you should be very thoughtful in how you compose, group, and manage the assets.
By implementing the right processes and tools in the right phases, your team will be informed of issues earlier in the SDLC process and reduce the overall cost of development while maintaining the ability to plan for highly secure product deliveries. In an agile environment, automation and process augmentation can enable security to be a continuous concern. Over time, by building security controls into the SDLC process, security becomes a core part of the culture and everyday awareness.
At AKF we assist companies in enhancing their security posture. Let us know how we can help you.
July 11, 2019 | Posted By: Marty Abbott
Attempting to transform a company to compete effectively in the Digital Economy is difficult to say the least. In the experience of AKF Partners, it is easier to be “born digital” than to transform a successful, long tenured business, to compete effectively in the Digital age.
There is no single guaranteed fail-safe path to transformation. There are, however, 10 principles by which you should abide and 3 guaranteed paths to failure.
Avoid these 3 common mistakes at all costs or suffer a failed transformation.
Having the Wrong Team and the Wrong Structure
If you have a successful business, you very likely have a very bright and engaged team. But unless a good portion of your existing team has run a successful “born digital” business, or better yet transformed a business in the digital age, they don’t have the experience necessary to complete your transformation in the timeframe necessary for you to compete. If you needed lifesaving surgery, you wouldn’t bet your life on a doctor learning “on the job”. At the very least, you’d ensure that doctor was alongside a veteran and more than likely you would find a doctor with a successful track record of the surgery in question. You should take the same approach with your transformation.
This does not mean that you need to completely replace your team. Companies have been successful with organization strategies that include augmenting the current team with veterans. But you need new, experienced help, as employees on your team.
Further, to meet the need for speed of the new digital world, you need to think differently about how you organize. The best, fastest performing Digital teams organize themselves around the outcomes they hope to achieve, not the functions that they perform.
It also helps to hire a firm that has helped guide companies through a transformation. AKF Partners can help.
Planning Instead of Doing
The digital world is ever evolving. Plans that you make today will be incorrect within 6 months. In the digital world, no plan survives first contact with the enemy. In the old days of packaged software and brick and mortar retail, we had to put great effort into planning to reduce the risk associated with being incorrect after rather long lead times to project completion. In the new world, we can iterate nearly at the speed of thought. Whereas being incorrect in the old world may have meant project failure, in the new world we strive to be incorrect early such that we can iterate and make the final solution correct with respect to the needs of the market. Speed kills the enemy.
Eschew waterfall models, prescriptive financial models and static planning in favor of Agile methodologies, near term adaptive financial plans and OKRs. Spend 5 percent of your time planning and 95% of your time doing. While in the doing phase, learn to adapt quickly to failures and quickly adjust your approach to market feedback and available data.
The successful transformation starts with a compelling vision that is outcome based, followed by a clear near-term path of multiple small steps. The remainder of the path is unclear as we want the results of our first few steps to inform what we should do in the next iteration of steps to our final outcome. Transformation isn’t one large investment, but a series of small investments, each having a measurable return to the business.
Knowing Instead of Discovering
Few companies thrive by repeatedly being smarter than the market. In fact, the opposite is true – the Digital landscape is strewn with the corpses of companies whose hubris prevented them from developing the real time feedback mechanisms necessary to sense and respond to changing market dynamics. Yesterday’s approaches to success at best have diminishing returns today and at worst put you at a competitive disadvantage.
Begin your journey as a campaign of exploration. You are finding the best path to success, and you will do it by ensuring that every solution you deploy is instrumented with sensors that help you identify the efficacy of the solution in real time. Real time data allows us to inductively identify patterns that form specific hypotheses. We then deductively test these hypotheses through comparatively low-cost solutions, the results of which help inform further induction. This circle of induction and deduction propels us through our journey to success.
July 11, 2019 | Posted By: Marty Abbott
Our clients often ask us the following question: “What is the role of the business analyst in Agile development?”
Spoiler Alert: There isn’t a role for the business analyst in Agile development.
For a longer answer, we need to explore the history of the Business Analyst (BA) role.
Traditional Business Analyst Role
Business analysts (BAs) are typically found within an Information Technology (IT) organization, or adjacent to an IT organization (say within a business unit, working with IT). Here we use IT to designate the organizational group within a company focused on producing or maintaining solutions that support back office business operations, employee productivity, etc.
Theoretically, the role focuses on analyzing a business domain, business processes or problem domain with the purpose of improving the domain or solving problems by defining systems or improvements to systems. The analyst then works with IT to implement the systems or improvements.
Practically speaking, the Business Analyst is often a bridge between what a business unit or domain “wants” and how that “want” should be implemented with a technical solution. This bridge is very often implemented through requirements specifications. The analyst then is responsible for writing and reviewing requirements and may be involved in some level of design to implement requirements. The Business Analyst is also often involved in validating requirements, evaluating the quality of an end solution and helping to usher the system through the appropriate sign-offs and training to launch the solution.
Business Analysts are very often found within waterfall development lifecycles where solutions move through “phases” in a linear fashion and where business owners and development teams are not integrated. They exist to solve a gap between independently operating information technology teams and business units.
Business Analysts in Agile
Teams practicing Agile should not have a need for someone with a Business Analyst title. One of the Agile principles is to have “Business people and developers [working] together daily throughout a project”. Within Scrum, the role for this daily interaction is most often through someone with the title of product owner. The product owner’s role is to optimize the value the entire team delivers through proper prioritization and expression of the product backlog. To be successful, the product owner must be properly empowered by the business organization he/she represents to achieve the product outcomes. As the name implies, he/she “owns” the product and associated business outcomes. Business analysts traditionally do not own either.
Given that the product owner is now responsible for a team of ideally no more than 12 people, one should not need an additional person “helping” with story writing, backlog prioritization, etc. Given the proximity of the product owner to the team doing the work and very little need for administrative help given the ratio of people on a team, there should be no need for a business analyst in an Agile team.
What Should I Do with My Business Analysts?
If you’ve read this far, you are probably a company transitioning from Waterfall to Agile development. The question is difficult to answer, and really depends upon the capabilities of the person in question.
Transitioning Business Analysts to Product Owners
We haven’t seen this be successful in many places. But on a case by case basis, it’s possible. If the person is smart, really understands the business, is empowered by the business unit he/she represents and is committed to understanding the role of the product owner then the person may be successful. Frankly, most business analysts have existed for too long as order takers to truly lead business initiatives within a development team.
Transitioning Business Analysts to Scrum Masters
We’ve seen greater success with this approach, but it requires a lot of training. To be successful in your conversion, you will need at least some number of highly trained and experienced Scrum Masters. If everyone is learning on the job, your transition to Agile will be slow and flawed. You can afford to have some (a handful) people transitioning from the BA role but be careful. The role of the Scrum Master and the role of the Business Analyst are very different. Some people won’t be able to make the transition, and others won’t have the desire.
Keeping Business Analysts in Waterfall Roles
Your company will no doubt continue to have several waterfall-oriented teams. Waterfall is appropriate anytime negotiation trumps collaboration (again, see the Agile Manifesto), as is the case in most projects involving systems integrators (i.e., back-office corporate systems or employee-facing packaged solutions used by disparate functional teams). In fact, any contract-based development where outcomes are defined in a contract is ultimately a waterfall process, even if the company has deluded itself into thinking the solution is “Agile”. Take your best business analysts and transition them into these waterfall projects.
Transitioning Business Analysts “Somewhere Else”
This is the catchall category. Perhaps some of them are recovering software engineers or infrastructure engineers and will want to go back. Maybe there is a place for great business analysts within a business unit working on non-technology related initiatives. In many companies, there may be waterfall projects where they can continue to add value. Lastly, you may no longer have a role for them.
We’ve seen many of our clients try to plug and play traditional IT roles with a simple name change to Agile terminology and then wonder why it isn’t working. Successful clients bring in proven Agile Coaches and spend time educating business leaders on how Agile applies throughout the organization. We’ve helped or consulted with hundreds of companies on Agile transformations – give us a call, we can help.
July 11, 2019 | Posted By: Pete Ferguson
A few years back, a senior leader at a previous company decided that being “cutting edge” was a core value we should espouse and so we quickly became the Guinea Pigs of our vendors. I don’t recall them offering much, if any, of a discount, but they gladly pushed their V1 software to us and we were given a fast track feedback channel to the developers.
When employees weren’t able to get the access they needed to our facilities and several service interruptions occurred, uptime thankfully became our top priority, and we settled into V2 of stable release software instead.
Using proven mature technology means finding that sweet spot after the beta testers and before the end of a product development lifecycle (PDLC).
“How can we leverage the experiences of others and existing solutions to simplify our implementation? ... The simplest implementation is almost always one that has already been implemented and proven scalable.”
Abbott, Martin L. Scalability Rules. Pearson Education.
At the release of the first and second generation of the Apple iPad, I was one of the crazy fanboys who woke up in the middle of the night to camp out in front of the Apple Store to be one of the first in line. Because the incremental improvements in speed and features were substantial, I would quickly want to upgrade to the next version.
In contrast, my kids still use Generation 3 and 4 iPads bought used on Craigslist and eBay, and the devices work great for basic web surfing and streaming videos. I waited until V2 of the iPad Pro for my own tablet which I also bought used and at a substantial discount. The incremental improvements of each new generation have become more and more arbitrary for how I use the devices, and the ROI for these latter devices is at least 2-3 fold greater.
In the world of Microsoft software, SP2 is usually that Goldilocks time period where there is enough benefit to upgrade without taking on too much risk for larger organizations. As of this writing, NetMarketShare.com reports that Windows 7 is still being used by 38% of polled users, and Windows XP is not too far behind Windows 8 for a combined total of just under 8%. Clearly, many – including non-profits and third-world countries – want the stability of a proven OS without the hassle and cost of upgrading.
The Sharpness of Cutting Edge
It is helpful to consider the following when making a decision about just how close to the “cutting edge” your technology should be:
- Scalability: Will the technology support growth spurts without worry of capacity?
- Availability: Reliable in the 99.9s with a large enough customer base that if you are experiencing issues, so is a much larger segment of the market and the vendor is incentivized to quickly patch and resolve
- Competitive Advantage: Your company is propelled to success, not hindered by the technology
If your decision to be cutting edge is not providing an easily-observable competitive advantage, then the risks are not bringing the needed rewards.
As in all decisions, we need to constantly be asking: what is our desired outcome? And then regularly review whether our priorities match our technology purchase and implementation decisions. Bells and whistles are often underutilized and provide very little competitive or strategic advantage while increasing costs and providing distractions.
[Figure: Technology Adoption Lifecycle]
Use Mature Technology – not retired or dying technology
The corollary, of course, is not to get too comfortable that your technology becomes antiquated.
As technology ages, vendors increase the cost of support in hopes of incentivizing us to upgrade to the latest and greatest or move to a subscription-based model. Perhaps the largest cost to our business, however, is the loss of competitive advantage when precious resources are used to nurse aged systems, attend P1/Sev 1 escalation calls, post mortems, and quality of service meetings instead of developing the next-gen platform that will maintain or beat the competition for market share.
Don’t Hitch Your Wagon to a Dying or Dead Horse
As an extreme example of holding on to dying technology – many traditional banks, insurance companies, airlines, and others still use massive, monolithic applications written in COBOL. Outside of the cost to change to a new platform, a justification we have heard from our clients for keeping it alive is that COBOL programmers can make more because there are so few of them.
The higher pay, however, does not seem to drive new University interns and graduates – who want to be part of the next Facebook, Google, or Apple – to tie their future career to a dying language. The majority of COBOL programmers are currently retiring and will likely be out of the workforce within a decade, which means the costs will continually increase until the supply is depleted. While it is not an immediately dire need to move out of COBOL, it definitely needs to be on the 3-5 year roadmap of how to transition to a more scalable, efficient, and easier to support language. For many organizations, it is like the ugly couch in a college apartment: no one really notices it is more than just an eyesore until a family of rats and lice is found to be living in it.
Unfortunately, with quarterly financial cycles, the cost to move away from servers and software that have performed reliably for decades is often a difficult proposition for operations teams. FAs – whose bonuses and annual increases rely upon keeping budgets flat – aren’t going to see the advantages. CTOs must take a longer term view of the cost to transition now versus down the road – and give a very heavy weighting to the lost opportunity cost of not making the move sooner rather than later.
- Our technology solutions should always focus on what will lead to the most competitive take of market share
- Our technology must be both scalable and highly available
- There is a sweet spot to technology adoption that lies between early release and end of life and should be evaluated annually; the true cost of supporting beta and aged systems should be measured both in man hours for upkeep and, most importantly, in how much it distracts our key talent from innovating the next BIG THING
We help hundreds of organizations globally find the right balance in how technology is utilized and how technology integrates with your people and process. Contact us today to see how we can help!
July 10, 2019 | Posted By: Bill Armelin
We are surprised at how often we go into a client and find that management does not have any metrics for their teams. The managers respond that they don’t want to negatively affect the team’s autonomy or that they trust the team to do the right things. While trusting your teams is a good thing, how do you know what they are doing is right for the company? How can you compare one team to another? How do you know where to focus on improvements?
Recently, we wrote an article about team autonomy, discussing how an empowered team is autonomous within a set of constraints. The article creates an analogy to driving a car, with the driver required to reach a specific destination, but empowered to determine WHAT path to take and WHY she takes it. She has gauges, such as a speedometer to give feedback on whether she is going too fast or too slow. Imagine driving a car without a speedometer. You will never know if you are sticking to the standard (the speed limit) or when you will get to where you need to go (velocity).
As a manager, it is your responsibility to set the appropriate metrics to help your teams navigate through the path to building your product. How can you hold your teams to certain goals or standards if you can’t tell them where they are in relation to the goal or standard today? How do you know if the actions you are taking are creating or improving shareholder value?
What metrics do you set for your teams? It is an important question. Years ago, while working at a Big 6 consulting firm, I had the pleasure of working with a very astute senior manager. We were redesigning manufacturing floors into what became Lean Manufacturing. He would walk into a client and ask them what the key metrics were. He would then proceed to tell them what their key issues were. He was always right. With metrics, you get what you measure. If you align the correct metrics with key company goals, then all is great. If you misalign them, you end up with poor performance and questionable behaviors.
So, what are the right metrics for a technology team? In 2017, we published an article on what we believe are the engineering metrics by which you should measure your teams. Some of the common metrics we focused on were velocity, efficiency, and cost. At initial glance, you might think that these seem “big brother-ish.” But, in reality, these metrics will provide your engineering teams with critical feedback to how they are doing. Velocity helps a team identify structural defects within the team (and should not be used to compare against other teams or push them to get more done). Efficiency helps the teams identify where they are losing precious development time to less valuable activities, such as meetings, interviews and HR training. It helps them and their managers quantify the impact of non-development and reduce such activities.
Cost helps the team identify how much they are spending on technology. We have seen this metric particularly used effectively in companies deploying to the cloud. Many companies allow cloud spending to significantly and uncontrollably increase as they grow. Looking at costs exposes things like the need for autoscaling to reduce the number of instances required during off peak times, or to purge unused instances that should be shut down.
The key to avoiding metrics being perceived as overbearing is to keep them transparent. The teams must understand the purpose of each metric and how it is calculated. Don’t use them punitively. Use them to help the teams understand how they are doing in relation to the larger goals. How do you align the higher-level company goals to the work your teams are performing? We like to use Objectives and Key Results, or OKRs. This concept was created by Andy Grove at Intel and brought to Google by John Doerr. The framework aims to align higher level “objectives” to measurable “key results.” An objective at one level has several key results. These key results become the objectives for the next level down and define another set of key results at that level. This continues all the way down to the lowest levels of the company, resulting in alignment of key results and objectives across the entire company.
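As a minimal illustration of that cascade (the objectives, team, and numbers below are invented for the example, not recommendations), each key result at one level becomes an objective for the level below, which then defines its own key results:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Objective:
    statement: str
    key_results: List[str] = field(default_factory=list)


# Company-level objective with measurable key results (numbers are made up).
company = Objective(
    "Grow revenue from the digital platform",
    key_results=[
        "Increase checkout conversion from 2.1% to 2.6%",
        "Reduce p95 page load time to under 1.5 seconds",
    ],
)

# One level down: the first company key result becomes the checkout team's
# objective, and that team defines its own measurable key results beneath it.
checkout_team = Objective(
    company.key_results[0],
    key_results=[
        "Cut abandoned carts by 10%",
        "Ship one-click reorder to all returning customers",
    ],
)
```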
Choosing the Right Metric
Metrics-driven institutions demonstrably outperform those that rely on intuition or “gut feel.” That stated, poorly chosen metrics, or simply too many metrics, may hinder performance.
- A handful of carefully chosen metrics. Choose a small number of key metrics over a large volume. Ideally, each Agile team should be evaluated on and tasked with improving 2-3 metrics (no more than 5). (Of note, in numerous psychological studies, the quality of decision-making has actually been shown to decrease when too much information is presented.)
- Easy to collect and/or calculate. A metric such as “Number of Customer Service Tickets per Week,” although crude, is better than “Engineer Hours Spent Fixing Service,” which requires costly time and effort to collect.
- Directly Controllable by the Team. Assigning a metric such as “Speed and Accuracy of Search” to a Search Service is preferred to “Overall Revenue” which is less directly controllable.
- Reflect the Quality of Service. The number of abandoned shopping carts reflects the quality of a Shopping Cart service, whereas number of shopping cart views is not necessarily reflective of service quality.
- Difficult to Game. The innate human tendency to game any system should be held in check by selecting the right metrics. Simple velocity measures are easily gamed while the number of Sev 1 incidents cannot be easily gamed.
- Near Real Time Feedback. Metrics that can be collected and presented over short-time intervals are most desirable. Information is most valuable when fresh — Availability week over week is better than a yearly availability measure.
Managers are responsible for the performance of their teams in relation to the company’s objectives and how they create shareholder value. Measuring how your teams are performing against, or contributing to, those goals is only speculation if you don’t have the correct measurements and metrics in place. The bottom line is, “If you are not measuring, you are not managing.”
Are you having difficulty defining the right metrics for your teams? Are you interested in defining OKRs but don’t know where or how to get started? AKF has helped many companies identify and implement key metrics, as well as implement OKRs. We have over 200 years of combined experience helping companies ensure their organizations, processes, and architecture are aligned to the outcomes they desire. Contact us, we can help.
July 8, 2019 | Posted By: Marty Abbott
Circuit Breaker Pattern Overview
The microservice Circuit Breaker pattern is an automated switch capable of detecting extremely long response times or failures when calling remote services or resources. The circuit breaker pattern proxies or encapsulates service A making a call to remote service or resource B. When error rates or response times exceed a desired threshold, the breaker “pops” and returns an appropriate error or message regarding the interface status. Doing so allows calls to complete more quickly, without tying up TCP ports or waiting for traditional timeouts. Ideally the breaker is “healing” and senses the recovery of B thereby resetting itself.
The circuit breaker analogy works well in that it protects a given circuit for calls in series. Unfortunately, it misses the true analogy of tripping to protect the propagation of a failure to other components on other circuits. We often use the term circuit breaker in our practice to refer to either the technique of fault isolation or the microservice pattern of handling service to service faults. In this article, we use the circuit breaker consistent with the microservice meaning.
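A minimal sketch of the pattern (in Python, with invented thresholds and a generic callable standing in for the remote call to service or resource B) might look like the following; in practice most teams get this behavior from a resilience library or service mesh rather than hand-rolling it:

```python
import time


class CircuitBreakerOpen(Exception):
    """Raised immediately when the breaker is open, instead of waiting on a timeout."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout_seconds=30.0):
        self.failure_threshold = failure_threshold            # failures before the breaker "pops"
        self.reset_timeout_seconds = reset_timeout_seconds    # how long to wait before retrying B
        self.failure_count = 0
        self.opened_at = None                                  # None means the breaker is closed

    def call(self, remote_call, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout_seconds:
                # Fail fast: free the caller (and its TCP socket) immediately.
                raise CircuitBreakerOpen("remote dependency marked unavailable")
            # Half-open: let one trial call through to see if B has recovered.
        try:
            result = remote_call(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.opened_at = time.monotonic()              # pop the breaker
            raise
        # Success: the dependency looks healthy again, so reset the breaker.
        self.failure_count = 0
        self.opened_at = None
        return result
```

Service A would wrap each outbound call, for example breaker.call(fetch_inventory, sku), and alert whenever CircuitBreakerOpen is raised, consistent with the alerting guidance later in this post.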
Problems the Circuit Breaker Fixes
Generally speaking, we consider service to service calls to be an anti-pattern to be avoided whenever possible due to the multiplicative effect of failure and the resulting lowering of availability. There are, however, times when you just can't get around making distant calls. Examples are:
- Resource (e.g. database) Calls: Necessary to interact with ACID or NoSQL Solutions.
- Third Party Integrations: Necessary to interact with any third party. While we prefer these to be asynchronous, sometimes they must be synchronous.
In these cases, it makes sense to add a component, such as the circuit breaker, to help make the service more resilient. While the breaker won't necessarily increase the availability of the service in question, it may help reduce other secondary and tertiary problems such as the inability to access a service for troubleshooting or restoration upon failure.
Principles to Apply
- Avoid the need for circuit breakers whenever possible by treating calls in series as an anti-pattern.
- When calls must be made in series, attempt to use an asynchronous and non-blocking approach.
- Use the circuit breaker to help speed recovery and identification of failure, and free up communication sockets more quickly.
When to use the Circuit Breaker Pattern
- Useful for calls to resources such as databases (ACID or BASE).
- Useful for third party synchronous calls over any distance.
- When internal synchronous calls can't otherwise be avoided architecturally, useful for service to service calls under your control.
The circuit breaker won't fix availability problems resulting from a failed service or resource. It will make the effects of that failure more rapid which will hopefully:
- Free up communication resources (like TCP sockets) and keep them from backing up.
- Help keep shared upstream components (e.g. load balancers and firewalls) from similarly backing up and failing.
- Help keep the failed component or service accessible for more rapid troubleshooting and alerting.
- Always ensure alerts are fired on breaker-open situations to aid in faster time to detect (TTD).
AKF Partners has helped hundreds of companies implement new microservice architectures and migrate existing monolithic products to microservice architectures. Give us a call – we can help!
July 3, 2019 | Posted By: Marty Abbott
AKF Partners has helped guide companies through digital transformations for over 10 years. We’ve helped traditional brick and mortar service, product, banking and retail companies create compelling sail solutions to harness the increasing power of the prevailing digital winds. This experience has made it clear that no transformation can be successful without addressing 10 key areas. These 10 areas form the foundation of any successful transformation, and a failure to address any of them is at the very least a guarantee to have a slow and painful transformation. In most cases, a failure to address even one of them is a guarantee to fail in the transformation and as a result, fail as a company.
Without further ado, we offer our list of 10 must have principles for any digital transformation.
Every successful transformation starts and ends with people; people who have the right experience, the right mindset, the right approach, and a sizable amount of humility.
- Right Skills, Right Behaviors, Right Experience, Right Time
You need people with experience in the digital world; people who understand that time is of the essence and behave appropriately. Think of it this way: if you were having a surgeon perform a procedure to save your life, do you want a surgeon who is learning on the job? At the very least, you’ll want to make sure that an experienced surgeon is alongside the inexperienced doctor. The same is true for digital transformations.
- Product not IT Mindset
Building solutions for end consumers is a very different world than building solutions for employees. As we indicate in the Art of Scalability, and as Marty Cagan agrees, you must have a product, not an IT mindset, to be successful.
The driving forces behind how one creates a product are different. Product teams look at revenues and profits instead of just costs. Funding outcomes are more important than funding projects. Product teams lead whereas IT teams take orders. Product teams think first about performance, rather than speaking nonstop about process. Governance, while important, takes a back seat to execution – especially executing against measurable outcomes. Collaboration trumps the negotiation between IT teams and business units.
Similarly, the leader of a product engineering organization is different from the leader of an IT team.
As we will discuss later under “discovery”, product teams know that they will be wrong often. We eliminate words like “requirements”, because they seem to indicate we completely understand what needs to happen. Humble teams understand the outcome, not the path. The path is initially a set of hypotheses that are tested, validated, or proven wrong and discarded in favor of new hypotheses. The only thing we know is that we will be wrong multiple times before finally landing on a result that meets the business outcomes.
- Outcomes not Projects
As stated above, Digital teams look to fund outcomes – not projects. Think in terms of “$XM invested in search for a Y% increase in add-to-carts from search, resulting in $ZM revenue,” not “Implement Elasticsearch.”
The right people will ensure that you follow the processes and approaches that will make you successful – specifically:
- Discovery and Agility
Digital companies attempt to find the right solution through Discovery. Starting with an MVP (below), they iterate in an Agile fashion to find the solution. Gone are large solution specifications (software requirements specifications) in favor of epics and stories which can be clearly and easily defined in significantly fewer words.
Product teams understand what Fred Brooks meant when he said
Because the design that occurs first is almost never the best possible, the prevailing system concept may need to change. Therefore, flexibility of organization is important to effective design.
Fred is talking about the need to be agile – to sense and respond to not only the mistakes made with any solution development, but the interaction of the market with the solution you develop. You must throw away waterfall concepts in your digital endeavors to be truly successful.
When you develop solutions for employees, you have a captive audience and a natural monopoly. You are paying them, and you get to determine whether they use your solution or not. As a result, you don’t have to be great at solution usability. That isn’t the case in the new digital world. End users will churn or select another provider if a solution isn’t easy to use and intuitive.
- Gas Pedals – Not Brakes
This is a catch-all category for all the reasons traditional IT teams have for why something can’t be done: “We have to work the process ...” or “This needs to go through a review ...” or “Has it gone through the right governance processes?” or “Have you filed a ticket for this yet?” Product teams care about speed and time to market. As such, the best product teams have all the skills in them, and the correct experience within the team to be fully empowered and held accountable to achieving the right outcomes.
Everyone is using the term minimum viable product (MVP) these days. Few companies truly get it. The pork-barrel political negotiation that typically happens to get an IT project launched results in bloated, overpriced, and slow time to market solutions. Digital companies know that small is fast. They know that they’d rather build smaller than the initial need and work into a true MVP than over-shoot it and as a result be late.
The era of a company relying on the brilliance of a handful of people to predict markets is over. Companies that have completed the digital transformation sense the needs of the market and customers in real time through science – not individual brilliance.
- Learning not Knowing (Induction and Deduction)
Digital teams assume that their hypotheses may be wrong, and they know that what may be correct today for a solution will likely evolve and change soon. As such, they build solutions that help them identify evolving patterns and learn true end user need. Critical to any Digital transformation is a data ecosystem that goes well beyond the packaged “data warehouses” of yore. You need more than just a place to dump your data – you need a solution that supports a virtuous cycle of learning and exploration. Induction leads to insights to form hypotheses. Deduction proves (or disproves) those hypotheses to create new knowledge that fuels growth.
- Scientists and Engineers – not Technicians
Reports and reporting teams may provide executives with the daily pulse of their business, but they are insufficient to fuel the insights necessary to be competitive in the digital world. Report writers are at best programmers, and you need people who understand how to use data to generate insights that result in knowledge and information. More than just trends, Digital teams need to understand how and why trends happen and, even more importantly, must be able to predict what the future brings.
Need help with your digital transformation? Contact us, we can help!
July 3, 2019 | Posted By: James Fritz
For many years after the introduction of the automobile, most industries benefited from somewhat static and predictable secular forces. Nascent technology forces primarily aided companies in increasing gross margins and operating margins by lowering the cost of labor, decreasing the cost of warehousing and decreasing logistical costs. Consumer behavior followed somewhat predictable and unchanged patterns, varying mostly with economic conditions and seasonal needs.
However, the advent of eCommerce and the “anything as a service” movement in the late 90’s through early 2000’s started to significantly change the behavior of individual and business consumers. Layer on logistics integrations that allowed near-immediate gratification for non-service durable goods, and consumers started to shift to purchasing from home. Similarly, businesses enjoy comparatively easy near-term gratification with the implementation of services; gone are the days of multi-year on-premise ERP implementations in favor of short-duration leasing of software as a service.
Late majority and laggard businesses within the technology life cycle were as late to identify these shifts in business and individual consumer behavior as they were to adopt new technical solutions. While they may have thrown up digital storefronts or offered digital downloads of their solutions, they failed to envision how the nascent forces begged for new integrations and tighter business cycles that would benefit not only the buyer but the producing company as well.
So, what is Digital Transformation? It is taking digital technologies and using them to provide new and creative ways to conduct business. This isn’t just an update to a technology stack or the development of new code. It is a transformation, via digitization, of how business is done.
Recaptcha and the NY Times
In the mid 2000’s, CAPTCHA was taking the internet by storm. It was designed to weed out bots and only allow humans through to websites. It started out rather simply, with a challenge of words or letters that required a response that, at the time, would be difficult for a computer to produce. It met with great success, logging over half a million hours per day of people filling out captchas to access sites or submit information. Although not a digital transformation itself, what was coupled with CAPTCHA became a transformation.
The NY Times had digitized all of their old issues and used computer software to recognize the different images and convert to text. However, 10 to 30 percent of the text was still missed. This then required two humans to go through the unrecognized text and come to the same conclusion about what was shown. A lengthy and expensive process. Partnering with the captcha team, the NY Times put those unrecognized words at the end of captchas. There would be a normal captcha question (still designed to determine whether a bot or not) followed by a picture of text from the NY Times. Once the picture had been verified by enough people, that picture would then be associated with that text. Using this resource of over half a million man hours per day, the NY Times was able to quickly close the gap on unrecognized text from older issues. This generated an increase in business for captcha and solved a manual problem digitally.
Who uses Digital Transformation and Why?
Startups are created digitally. They have no infrastructure or code or business model that existed previously. Instead of transforming, startups are born digital. This gives them the opportunity to carve out a piece of the total addressable market currently not being serviced, or to siphon off business from other companies currently operating in the market. By looking at what is already being done in the market and identifying the dissatisfaction within it, smart engineers can create a new product and a new business (if done appropriately, and well-funded as well) that dramatically transforms a technology sector. Digitization can come easy to startups. The market already exists, and trying to compete by doing the same thing will lead to failure. Taking an antiquated method and using digital prowess to transform it is what gives startups their edge.
If the current competitors in the market are paying attention, they can use this disruption to their advantage. Just like startups were able to glean information about where and how to target from larger corporations, those same corporations can now identify how to transform from the startups. It is not always easy creating a disruptive technology that requires a transformation for how business is done. By being a close follower, large corporations can avoid a lot of the pitfalls and ensure that they can keep their market share, while also now targeting new consumers. Conversely, if unwilling to adapt to changing forces, or unwilling to see the future for what it is, these corporations will be abandoned by their consumers.
Digital Transformation is Not…
...just taking a current generation technology and updating it.
...version control or the introduction of new hardware or software.
Digital Transformation requires a complete re-tooling of how you conduct business. It will affect how you code, how you scale, how you sell, how you brand and even how you interact as a company. A caterpillar doesn’t become a butterfly simply by slapping on some wings. Inside the cocoon it completely dissolves itself and rebuilds from the ground up. That is Digital Transformation.
Digital Transformation isn’t easy. Marty Abbott has summed up 10 Principles that will assist you with your transformation. If you need further assistance with identifying Digital Transformation pitfalls and goals, AKF can help!
July 1, 2019 | Posted By: Greg Fennewald
As technology professionals, managing risk is an important part of the value we provide to the business. Risk can take many forms, including threats to availability, scalability, information security, and time to market. Physical layer risks from the data center realm can severely impact availability, as the events of the February 2019 Wells Fargo outage demonstrate.
Transitioning Away from On-Prem Hosting
Over the last decade, knowledge of data center architecture, operating principles, capabilities, and associated risks has decreased in general due to the rise of managed hosting and especially cloud hosting. This is particularly true for small and medium sized companies, which may have chosen cloud hosting early on and thus never have dealt with colocation or owned data centers. This is not necessarily a bad trend – why devote resources to learn domains that are not core to your value proposition?
While knowledge of data center geekdom may have decreased, the risks associated with data centers have not substantially changed. Even the magic pixie dust of cloud hosting is a data center at its core, albeit with a degree of operational excellence exceeding the stereotypical company-owned data center + colo combination.
Given that technologists can mitigate data center risks by choosing cloud hosting with a major provider capable of mastering data center operations, why spend any time to learn about data center risks?
- Cloud hosting sites do encounter failures. The ability to ask informed questions during the vendor selection process can help optimize the availability for your business.
- Business or regulatory changes may force a company to use colocation to meet data residency or other requirements.
- A company may grow to the size where owning data centers makes business sense for a portion of their hosting need.
- A hosting provider could exit the business or face bankruptcy, forcing tenants to take over or move on short notice. Been there, done that, got the T shirt.
Data Center Lifespan Risk
For the purposes of this article, we will consider data center lifespan risk. We define this risk as the probability of an infrastructure failure causing significant, and possibly complete, business disruption and the level of difficulty in restoring functionality.
A chart of data center lifespan risk resembles a bathtub – a high level of failures as the site is first built and undergoing 5 levels of commissioning towards the left side of the chart, followed by a long period of lower threat that can extend 15 years or more. As time continues to march on, the risk rises again, creating the right-hand side of the bathtub curve.
The risk of failure increases over time as the useful service life of infrastructure components approaches its end. The risk of failure approaches unity over a sufficiently long time span.
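A simplified back-of-the-envelope model shows why. It assumes, purely for illustration, a constant annual probability of an impactful infrastructure failure; real components have a hazard rate that climbs near end of life, which only steepens the curve:

```python
def cumulative_failure_probability(annual_failure_rate: float, years: int) -> float:
    """Probability of at least one failure within `years`, assuming a constant annual rate."""
    return 1.0 - (1.0 - annual_failure_rate) ** years


# Even a modest 5% annual chance of an impactful failure compounds quickly:
for years in (1, 5, 10, 15, 20):
    print(years, round(cumulative_failure_probability(0.05, years), 2))
# 1 -> 0.05, 5 -> 0.23, 10 -> 0.40, 15 -> 0.54, 20 -> 0.64
```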
Service Life Examples
Below are some service life estimates, based on our experience, for critical data center components that are properly maintained:
- UPS batteries: 4 years for VRLA, 12+ years for wet cell; battery string monitoring strongly recommended
- Standby generators: 12,000+ hours before overhaul; typically run 100 hours or less annually
- Main switchgear PLC: 15+ years; PLC model EOL is the risk
- CRAH/CRAC fan motors: the magic smoke wants to escape
- Galvanized cooling tower wet surfaces: varies with water chemistry; stainless steel is worth the cost
- Electrical distribution board: EOL of breaker style and PLC is the risk
- Chilled water piping: design for continuous duty, ~7 FPS flow velocity
All the above examples are measured in years. If you are in the early years of a data center lifespan, there’s not a lot to worry about other than batteries. Most growing companies are more concerned about adequate capacity, availability, and cost when they create their hosting strategy. Not much thought is given to an exit strategy. Such an effort is probably not worth it for a startup company, but established companies need to be thinking beyond next quarter and next year.
If your product or service can survive the loss of a single hosting site without impact (i.e. multi-active sites with validated traffic shifts), you could afford to run a bit deeper into the service life timeline. If you can’t - or, like Wells Fargo thought you could but learned the hard way that was not the case - you need to plan ahead to mitigate these risks.
As mentioned before, the risks we want to mitigate are an impactful failure and a complex restoration after a failure. By complex, we mean trying to find parts and trained technicians for components that were EOL 5 years ago and end of OEM support 18 months ago. Not a fun place to be. Would you feel comfortable running your online business with switches and routers that are EOL and EOS? Hopefully not. Why would you do so for your hosting location?
Mitigating the Risks
The best way to mitigate the risk of an impactful infrastructure failure is to be able to survive the loss of a hosting site, regardless of type, with a business disruption that is acceptable to the business and customers. What is acceptable could vary; your hosting solution should be tailored to the needs of the business.
Some thoughts on aging hosting sites:
- All the characteristics that make cloud hosting taste great and be less filling (containerization, automation, infrastructure as code, orchestration, etc.) can also make the effort to stand up a new site and exit an old one much less onerous.
- If you are committed to an owned data center or colo, moving to a newer site is the best choice. Could you combine a move with a tech refresh cycle? Could the aging data center fulfill a different purpose such as hosting development and QA environments? Such environments should have less business impact from a failure, and you can squeeze out the last few years of life from that site.
- You can purchase extra spare parts for components nearing EOL or EOS and send technicians to training courses. This can mitigate risk but is really analogous to convincing yourself that you can scale your DB by tuning the SQL queries. Viable only to add 6 or 12 months to a move/exit timeline.
Just about any of the components mentioned above in the useful life estimates can be replaced, especially if the data center can be shut down for weeks or months to make the replacement and test the systems. Trying to replace components while still serving traffic is extremely risky. Very few data centers have the redundancy to replace electrical components while still providing conditioned power and cooling to the server rooms. Those sites that can usually cannot do so without reducing their availability. We’ve had to take a dual UPS (2N) site to a single UPS source (N) for a week to correct a serious design flaw. Single-corded power is not appropriate if your DR plan checks an audit box and not much else.
The tremendous popularity of cloud hosting does not alleviate the need to understand physical layer risks, including data center lifespan risks. Understanding them enables technology leaders to mitigate the risks.
Interested in learning more? Need assistance with hosting strategy? Considering a transition to SaaS? AKF Partners can help.