September 10, 2018 | Posted By: Robin McGlothin
The Scalability Cube – Your Guide to Evaluating Scalability
Perhaps the most common question we get at AKF Partners when performing technical due diligence on a company is, “Will this thing scale?” After all, investors want to see a return on their investment in a company, and a common way to achieve that is to grow the number of users on an application or platform. How do they ensure that the technology can support that growth? By evaluating scalability.
Let’s start by defining scalability from the technical perspective. The Wikipedia definition of “scalability” is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. That definition is accurate when applied to common investment objectives. The question is, what are the key attributes of software that allow it to scale, along with the anti-patterns that prevent scaling? Or, in other words, what do we look for at AKF Partners when determining scalability?
While an exhaustive list is beyond the scope of this blog post, we can use the Scalability Cube and apply the analytical methodology that helps us quickly determine where the application will experience issues.
AKF Partners introduced the scalability cube, a scale design model for building resilient application architectures using patterns and practices that apply broadly to any application. This is a best practices model that describes all scale dimensions from “The Art of Scalability” book (AKF Partners – Abbott, Keeven & Fisher Partners).
The “Scale Cube” is composed of an X-Axis, Y-Axis, and Z-Axis:
1. Technical Architectural Layering (X-Axis) – No single points of failure. Duplicate everything.
2. Functional Decomposition Segmentation (Y-Axis) – Componentization to Modules & Microservices. Split Report, Message, Locate, Forms, Calendar into fault-isolated swim lanes.
3. Horizontal Data Partitioning (Z-Axis) – Shards. Beginning with pilot users, start with “podding” users for scalability and availability.
The Scale Cube helps teams keep critical dimensions of system scale in mind when solutions are designed. Scalability is all about the capability of a design to support ever growing client traffic without compromising performance. It is important to understand there are no “silver bullets” in designing scalable solutions.
An architecture is scalable if each layer in the multi-layered architecture is scalable. For example, a well-designed application should be able to scale seamlessly as demand increases and decreases and be resilient enough to withstand the loss of one or more compute resources.
Let’s start by looking at the typical monolithic application. A large system that must be deployed holistically is difficult to scale. In the case where your application was designed to be stateless, scale is possible by adding more machines, virtual or physical. However, adding instances requires powerful machines that are not cost-effective to scale. Additionally, you have the added risk of extensive regression testing because you cannot update small components on their own. Instead, we recommend a microservices-based architecture using containers (e.g. Docker) that allows for independent deployment of small pieces and the scale of individual services instead of one big application.
Monolithic applications have other negative effects, such as development complexity. What is “development complexity”? As more developers are added to the team, be aware of the effects of Brooks’ Law, which states that adding more software developers to a late project makes the project even later. For example, one large solution loaded in the development environment can slow down a developer, and it gets worse as more developers add components. This causes slower and slower load times on development machines, and developers stomping on each other with changes (or creating complex merges) as they modify the same files.
Another example of a development complexity issue is large, outdated pieces of the architecture or database where one person is the expert. That person becomes a bottleneck to changes in a specific part of the system. They are also a SPOF (single point of failure) if they are the only resource that understands the monolithic beast. The monolithic complexity and the rate of code change make it hard for any developer to know all the idiosyncrasies of the system, hence more defects are introduced. A decoupled system with small components helps prevent this problem.
When validating database design for appropriate scale, there are some key anti-patterns to check. For example:
• Do synchronous database accesses block other connections to the database when retrieving or writing data? This design can end up blocking queries and holding up the application.
• Are queries written efficiently? Large data footprints, with significant locking, can quickly slow database performance to a crawl.
• Is there a heavy report function in the application that relies on a single transactional database? Report generation can severely hamper the performance of critical user scenarios. Separating out read-only data from read-write data can positively improve scale.
• Can the data be partitioned across different load databases and/or database servers (sharding)? For example, Customers in different geographies may be partitioned to various servers more compatible with their locations. In turn, separating out the data allows for enhanced scale since requests can be split out.
• Is the right database technology being used for the problem? Storing BLOBs in a relational database has negative effects – instead, use the right technology for the job, such as a NoSQL document store. Forcing less structured data into a relational database can also lead to waste and performance issues, and here, a NoSQL solution may be more suitable.
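One of the checks above, separating read-only reporting traffic from read-write transactional traffic, can be sketched as a small query router. This is a minimal illustration, not a description of any particular product: the class name, the server names, and the `reporting` flag are all hypothetical, and a real system would usually rely on its database driver or proxy for this.

```python
import random
from dataclasses import dataclass, field


@dataclass
class QueryRouter:
    """Chooses a database target per query (all names here are hypothetical)."""
    primary: str
    replicas: list = field(default_factory=list)

    def target_for(self, sql: str, reporting: bool = False) -> str:
        # Anything that is not a SELECT is treated as a write and goes to the primary.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if not is_read:
            return self.primary
        # Reads flagged as reporting go to a replica when one exists, so heavy
        # reports do not compete with transactional user traffic.
        if reporting and self.replicas:
            return random.choice(self.replicas)
        return self.primary


router = QueryRouter(primary="db-primary", replicas=["db-replica-1", "db-replica-2"])
write_target = router.target_for("INSERT INTO orders VALUES (1)")
report_target = router.target_for("SELECT region, SUM(total) FROM orders GROUP BY region", reporting=True)
```

Replicas typically lag the primary slightly, so this split suits reports that can tolerate slightly stale data.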
We also look for mixed presentation and business logic. A software anti-pattern that can be prevalent in legacy code is not separating out the UI code from the underlying logic. This practice makes it impossible to scale individual layers of the application and takes away the capability to easily do A/B testing to validate different UI changes. Layer separation allows putting just enough hardware against each layer for more minimal resource usage and overall cost efficiency. The separation of the business logic from SPROCs (stored procedures) also improves the maintainability and scalability of the system.
Another key area we dig for is stateful application servers. Designing an application that stores state on an individual server is problematic for scalability. For example, if some business logic runs on one server and stores user session information (or other data) in a cache on only one server, all user requests must use that same server instead of a generic machine in a cluster. This prevents adding new machine instances that can field any request that a load balancer passes its way. Caching is a great practice for performance, but it cannot interfere with horizontal scale.
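The difference between server-local state and shared state can be sketched as follows. The in-memory `SharedSessionStore` below is only a stand-in for an external store that every server can reach (a distributed cache, for example); the class and method names are assumptions made for illustration.

```python
class SharedSessionStore:
    """Stand-in for an external session store shared by every app server."""

    def __init__(self):
        self._sessions = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._sessions.get(session_id, {}))

    def put(self, session_id, session):
        self._sessions[session_id] = dict(session)


class AppServer:
    """Keeps no session state of its own; everything lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        session["cart"] = session.get("cart", []) + [item]
        self.store.put(session_id, session)
        return session["cart"]


store = SharedSessionStore()
server_a = AppServer("app-a", store)
server_b = AppServer("app-b", store)

# The load balancer may send each request for the same session to a different server.
server_a.add_to_cart("session-1", "book")
cart = server_b.add_to_cart("session-1", "pen")
```

Because neither server holds the cart itself, a new `AppServer` instance can be added to the cluster and immediately field requests for existing sessions.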
Finally, long-running jobs and/or synchronous dependencies are key areas for scalability issues. Actions on the system that trigger processing times of minutes or more can affect scalability (e.g. execution of a report that requires large amounts of data to generate). Continuing to add machines to the set doesn’t help the problem as the system can never keep up in the presence of many requests. Blocking operations exacerbate the problem. Look for solutions that queue up long-running requests, execute them in the background, send events when they are complete (asynchronous communication) and do not tie up key application and database servers. Communication with dependent systems for long-running requests using synchronous methods also affects performance, scale, and reliability. Common solutions for intersystem communication and asynchronous messaging include RabbitMQ and Kafka.
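The queue-and-notify pattern described above can be sketched with Python's standard library alone. This is an in-process illustration under stated assumptions: `run_report` is a hypothetical placeholder for the slow work, and a production system would more likely put a broker such as RabbitMQ or Kafka between the request handler and separate worker processes.

```python
import queue
import threading


def run_report(params):
    # Hypothetical placeholder for a slow, data-heavy report.
    return f"report for {params}"


class BackgroundJobs:
    """Accepts long-running requests and processes them off the request path."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._done = {}
        self.results = {}
        worker = threading.Thread(target=self._work, daemon=True)
        worker.start()

    def submit(self, job_id, params):
        # Register the completion event before enqueueing so the worker always finds it.
        done = threading.Event()
        self._done[job_id] = done
        self._jobs.put((job_id, params))
        return done  # the caller returns immediately and can check or wait later

    def _work(self):
        while True:
            job_id, params = self._jobs.get()
            self.results[job_id] = run_report(params)
            self._done[job_id].set()  # the "job complete" event
            self._jobs.task_done()


jobs = BackgroundJobs()
done = jobs.submit("job-1", "Q3 sales")
finished = done.wait(timeout=5)
```

The request handler only pays the cost of enqueueing; the application and database servers that serve interactive traffic are not tied up while the report runs.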
Again, the list above is not exhaustive but outlines some key areas that AKF Partners look for when evaluating an architecture for scalability. If you’re looking for a checklist to help you perform your own diligence, feel free to use ours. If you’re wondering more about our diligence practice, you may be interested in our thoughts on best practices, or our beliefs around diligence and how to get it right. We’ve performed technical diligence for seed rounds, A-series and beyond, carve-outs, strategic investments and taking public companies private. From $5 million invested to over $1 billion. No matter the size of company or size of the investment, we can help.
September 6, 2018 | Posted By: James Fritz
“An incident is a terrible thing to waste” is a common mantra that AKF repeats during its Engagements. And rightfully so as many companies have an incident response plan in place but stop there. Why are incidents so important? What is the true value in doing a proper Post Mortem and actually learning from an incident?
Incidents identify issues in your product. But if that is all you take out of an incident then you are missing out on so much more information that an incident can provide. An incident is the first step to identifying a problem that exists in your product, infrastructure, processes, and perhaps, people. “But aren’t incidents and problems the same thing?” Not necessarily. An incident is a one-time event. It can recur if you never address the underlying problem, which means it is not truly isolated.
Conducting a Post Mortem
Gather as many data points as possible shortly after an incident concludes and schedule a Post Mortem review meeting.
Start with the incident timeline. Sufficiently logging events over time provides ready access to the needed data for forensic analysis. From this information you can then start to identify what went wrong, when it went wrong and how quickly you were able to respond to it. The below definitions are all factors that need to be identified:
- Time To Detect: How quickly did you identify that an incident had occurred
- Time To Escalate: How quickly did you get everyone necessary to fix the incident involved
- Time To Isolate: How quickly did you stop the incident from affecting other portions
- Time To Restore: How quickly did the system get brought back up
- Time To Repair: How quickly did you fix the incident
This all leads to the Incident Timeline Analysis.
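As a rough sketch, the five measures above can be computed from timestamps logged during an incident. The field names and the sample times are hypothetical, and the sketch assumes each measure runs from the previous milestone to the next, which is only one reasonable way to define the intervals.

```python
from datetime import datetime


def timeline_metrics(events):
    """Return each phase of an incident in minutes.

    `events` maps milestone names to datetimes. Assumption: each measure is
    taken from the previous milestone, starting at when the incident occurred.
    """
    def minutes(start, end):
        return (events[end] - events[start]).total_seconds() / 60

    return {
        "time_to_detect": minutes("occurred", "detected"),
        "time_to_escalate": minutes("detected", "escalated"),
        "time_to_isolate": minutes("escalated", "isolated"),
        "time_to_restore": minutes("isolated", "restored"),
        "time_to_repair": minutes("restored", "repaired"),
    }


# Hypothetical incident used only to exercise the function.
incident = {
    "occurred": datetime(2018, 9, 1, 10, 0),
    "detected": datetime(2018, 9, 1, 10, 40),
    "escalated": datetime(2018, 9, 1, 10, 50),
    "isolated": datetime(2018, 9, 1, 11, 5),
    "restored": datetime(2018, 9, 1, 11, 15),
    "repaired": datetime(2018, 9, 1, 13, 15),
}
metrics = timeline_metrics(incident)
slowest_phase = max(metrics, key=metrics.get)
```

Collecting these dictionaries across several incidents makes it easy to see which phase consistently dominates.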
If you can gather information from several incidents and look at them in your Post Mortem review, then you can figure out where your biggest issues are when it comes to incidents and getting the system back up and running. It is not uncommon for us to see that it often takes longer to detect the incident than to restore from it. This could be mitigated with more monitoring at more appropriate positions than you currently have.
Or maybe the time to escalate is an issue. Why does it take so long to get the proper engineers involved? Maybe a real-time alert system is required or a phone tree. And it is important to track and measure total time of an incident as beginning with when it occurred (not when it was reported) all the way through to when customers were back up at 100% (not just when your systems were restored).
Problem vs. Incident
How do you know if your incident is also a problem? It’s actually fairly easy to determine. If you have an incident, you have a problem. The scale of the problem may vary by incident but every incident is caused because of something larger than itself.
During our Technical Due Diligences we always want to know how companies categorize incidents vs. problems. If the company properly categorizes problems related to incidents, they will be able to answer “Can you rank your problems to show which cause the most customer impact?” Many times, they can’t - but that ranking is critical to show which problems to attack first.
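If every incident record is tagged with the problem that caused it, the ranking question becomes a simple aggregation. The sketch below is illustrative only: the record fields, the sample data, and the choice of "customer-minutes" as the impact measure are all assumptions.

```python
from collections import defaultdict


def rank_problems(incidents):
    """Rank problems by total customer impact across their incidents.

    Impact is measured here as customers affected multiplied by minutes of
    downtime, an assumed metric chosen for illustration.
    """
    impact = defaultdict(int)
    for incident in incidents:
        impact[incident["problem"]] += (
            incident["customers_affected"] * incident["minutes_down"]
        )
    return sorted(impact.items(), key=lambda item: item[1], reverse=True)


# Hypothetical incident history.
incidents = [
    {"problem": "no connection pooling", "customers_affected": 200, "minutes_down": 30},
    {"problem": "missing disk alerts", "customers_affected": 50, "minutes_down": 20},
    {"problem": "no connection pooling", "customers_affected": 100, "minutes_down": 45},
]
ranking = rank_problems(incidents)
```

Here two separate incidents roll up to the same problem, which is exactly the relationship a flat incident list hides.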
An incident, at its core, is caused by a problem. If your product crashes anytime someone attempts to access it via an unapproved protocol, the incident is the attempted access. The problem may be an improper review of your architecture. Or it may be lack of QA. Identifying the problem is much more difficult than identifying the incident. Imagine you find a termite on your deck. This small pest could be considered an incident. If you deal with the incident and get rid of the termite everything is good, right? If you don’t look any further than the incident you can’t identify the problem. And in this case the problem could be exposed, untreated wood allowing termites to slowly eat away at the inside of your house.
If you are keeping proper documentation each time you conduct a Post Mortem review, then you will have a history that will start to paint a picture of ongoing and recurring problems that exist. Remedying only the incident will stop it from occurring in the same exact way in the future, but small variations of the incident can still occur. If you fix the problem then you are stopping future iterations of that incident from happening again.
Looking for additional help with your incident management process? We can help!
September 6, 2018 | Posted By: James Fritz
In our experience, Agile practices provide organizations within successful companies many benefits, which is leading more and more companies to adopt Agile frameworks outside of software development. Whether they are looking for reduced risk, higher product quality, or even the capability to “fail fast” and rectify mistakes, Agile provides many benefits, particularly in management.
While effort has been expended to identify how to create Agile product delivery teams (Organizing Product Teams for Innovation) and conversely why they fail (The Top Five Most Common Agile PDLC Failures) – a lot of the focus is on the successes and failures of the delivery teams themselves. But the delivery is only as good as the group that surrounds that team.
So how does Agile work beyond your delivery teams? An essay published in 1970 by Robert K. Greenleaf, The Servant as Leader, is credited with introducing the idea of a Servant-Leader, someone who puts their employees’ needs ahead of their own. This is counter-intuitive to a normal management style where management has a list of needs that require completion.
Looking at an Agile team, the concept of waiting for management to drive needs is not conducive to meeting the requirements of the market. A highly competent Agile team has all the necessary tools and authority to get the job done that is required of them. If normal management tactics sit over an Agile team, failure is going to occur.
This is where the philosophy of Servant-Leadership comes into play. If managers, all the way to the C-Suite, understand that they work for their employees, but their employees are accountable to them, then everyone is working towards one goal: the needs of the market. Management needs to be focused on securing the resources necessary for product delivery teams to meet the demands of the market, whether from a high level of the CEO and CFO for additional funding or further down with ensuring that technical debt and other tasks are assigned out appropriately to meet delivery goals. This empowerment for teams may seem risky, but the morale improvement and greater innovation that can be achieved far exceeds the level of risk that would be accepted.
Embracing Agile throughout a company is key to the company being able to survive beyond the first couple sprints. Small changes in management can play a huge role in that. Asking simple questions like, “what do you need to meet your goals”, or “what factors stand in your way of accomplishment” help to enable employees instead of limiting them. Asking yourself why you are successful as a company also helps to identify what segment is responsible for your success.
If the delivery of your services is what customers buy, then identifying ways to enable employees who create those services is vital. This isn’t to say that other roles in the company aren’t important. Without support from the entire company, no one particular segment can succeed. This is why it is so vital for Agile to permeate throughout your entire organization. If you need assistance in identifying gaps in Agile and figuring out how to employ it, reach out to AKF.
September 5, 2018 | Posted By: Pete Ferguson
Scalability doesn’t somehow magically appear when you trust a cloud provider to host your systems. While Amazon, Google, Microsoft, and others likely will be able to provide a lot more redundancy in power, network, cooling, and expertise in infrastructure than hosting yourself – how you are set up using their tools is still very much up to your budget and which tools you choose to utilize. Additionally, how well your code is written to take advantage of additional resources will affect scalability and availability.
We see more and more new startups in AWS or Azure – in addition to assisting well-established companies make the transition to the cloud. Regardless of the hosting platform, in our technical due diligence reviews we often see the same scalability gaps common to hosted solutions written about in our first edition of “Scalability Rules.” (Abbott, Martin L. Scalability Rules: Principles for Scaling Web Sites. Pearson Education.)
This blog is a summary recap of the AKF Scale Cube (much of the content contains direct quotes from the original text), an explanation of each axis, and how you can be better prepared to scale within the cloud.
Scalability Rules – Chapter 2: Distribute Your Work
Using ServiceNow as an early example of designing, implementing, and deploying for scale early in its life, we outlined how building in fault tolerance helped scale in early development – and more than a decade later the once little-known company has been able to keep up with fast growth with over $2B in revenue and some forecasts expecting that number to climb to $15B in the coming years.
So how did they do it? ServiceNow contracted with AKF Partners over a number of engagements to help them think through their future architectural needs and ultimately hired one of the founding partners to augment their already-talented engineering staff.
“The AKF Scale Cube was helpful in offsetting both the increasing size of our customers and the increased demands of rapid functionality extensions and value creation.”
~ Tom Keeven (Founding Partner, AKF Partners)
The original scale cube has stood the test of time and we have used the same three-dimensional model with security, people development, and many other crucial organizational areas needing to rapidly expand with high availability.
At the heart of the AKF Scale Cube are three simple axes, each with an associated rule for scalability. The cube is a great way to represent the path from minimal scale (lower left front of the cube) to near-infinite scalability (upper right back corner of the cube). Sometimes, it’s easier to see these three axes without the confined space of the cube.
X Axis – Horizontal Duplication
The X Axis allows transaction volumes to increase easily and quickly. If data is starting to become unwieldy on databases, distributed architecture allows for reducing the degree of multi-tenancy (Z Axis) or split discrete services off (Y Axis) onto similarly sized hardware.
A simple example of X Axis splits is cloning web servers and application servers and placing them behind a load balancer. This cloning allows the distribution of transactions across systems evenly for horizontal scale. Cloning of application or web services tends to be relatively easy to perform and allows us to scale the number of transactions processed. Unfortunately, it doesn’t really help us when trying to scale the data we must manipulate to perform these transactions as memory caching of data unique to several customers or unique to disparate functions might create a bottleneck that keeps us from scaling these services without significant impact on customer response time. To solve these memory constraints we’ll look to the Y and Z Axes of our scale cube.
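A minimal sketch of an X Axis split is a round-robin load balancer in front of identical clones. The server names are hypothetical, and real load balancers add health checks and weighting that are omitted here.

```python
import itertools


class RoundRobinBalancer:
    """Distributes requests evenly across identical (cloned) servers."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        # Every clone can serve every request, so the request content is ignored.
        return next(self._cycle)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assigned = [balancer.route(f"request-{i}") for i in range(6)]
```

Note that `route` never inspects the request. That is what makes X Axis scaling easy, and also why it does not reduce the amount of data each clone must be able to handle.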
Y Axis – Split by Function, Service, or Resource
Looking at a relatively simple e-commerce site, Y Axis splits resources by the verbs of signup, login, search, browse, view, add to cart, and purchase/buy. The data necessary to perform any one of these transactions can vary significantly from the data necessary for the other transactions.
In terms of security, using the Y Axis to segregate and encrypt Personally Identifiable Information (PII) to a separate database provides the required security without requiring all other services to go through a firewall and encryption. This decreases cost, puts less load on your firewall, and ensures greater availability and uptime.
Y Axis splits also apply to a noun approach. Within a simple e-commerce site data can be split by product catalog, product inventory, user account information, marketing information, and so on.
While Y axis splits are most useful in scaling data sets, they are also useful in scaling code bases. Because services or resources are now split, the actions performed and the code necessary to perform them are split up as well. This works very well for small Agile development teams as each team can become experts in subsets of larger systems and don’t need to worry about or become experts on every other part of the system.
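In contrast to X Axis cloning, a Y Axis split routes each request by what it does. The sketch below maps the first URL segment (the verb) to its own pool of servers; the pool names, sizes, and URL layout are assumptions for illustration.

```python
# Each verb gets its own pool, sized and deployed independently (hypothetical names).
SERVICE_POOLS = {
    "signup": ["signup-1"],
    "login": ["login-1", "login-2"],
    "search": ["search-1", "search-2", "search-3"],
    "purchase": ["purchase-1", "purchase-2"],
}


def pool_for(path):
    """Return the server pool responsible for the verb in a path like '/search/shoes'."""
    verb = path.strip("/").split("/")[0]
    if verb not in SERVICE_POOLS:
        raise KeyError(f"no service owns the verb {verb!r}")
    return SERVICE_POOLS[verb]


search_pool = pool_for("/search/shoes")
purchase_pool = pool_for("/purchase/cart/42")
```

Because each pool only handles one verb, it only needs to cache and store the data that verb uses, and the team that owns it only needs to know that slice of the code base.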
Z Axis – Separate Similar Things
Z Axis splits are effective at helping you to scale customer bases but can also be applied to other very large data sets that can’t be pulled apart using the Y Axis methodology. Z Axis separation is useful for containerizing customers or a geographical replication of data. If Y Axis splits are the layers in a cake with each verb or noun having their own separate layer, a Z Axis split is having a separate cake (sharding) for each customer, geography, or other subset of data.
This means that each larger customer or geography could have its own dedicated Web, application, and database servers. Given that we also want to leverage the cost efficiencies enabled by multitenancy, we also want to have multiple small customers exist within a single shard which can later be isolated when one of the customers grows to a predetermined size that makes financial or contractual sense.
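The mix of dedicated and shared shards described above can be sketched as a lookup with a hashed fallback. The customer and shard names are hypothetical, and hashing modulo the shard count is just one possible placement scheme.

```python
import hashlib

# Large customers that have been isolated onto their own shard (hypothetical).
DEDICATED_SHARDS = {"big-retailer": "shard-big-retailer"}

# Multi-tenant shards that many small customers share (hypothetical).
SHARED_SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(customer_id):
    """Return the shard holding a customer's web, application, and database tier."""
    if customer_id in DEDICATED_SHARDS:
        return DEDICATED_SHARDS[customer_id]
    # A stable hash keeps a customer on the same shard across processes and restarts.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARED_SHARDS)
    return SHARED_SHARDS[index]


big_customer_shard = shard_for("big-retailer")
small_customer_shard = shard_for("corner-shop")
```

Moving a growing customer to its own shard then amounts to migrating its data and adding one entry to `DEDICATED_SHARDS`. A plain modulo scheme does remap customers when the number of shared shards changes, so an explicit lookup table for every customer is a common alternative.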
For hyper-growth companies the speed with which any request can be answered is at least partially determined by the cache hit ratio of near and distant caches. This speed in turn indicates how many transactions any given system can process, which in turn determines how many systems are needed to process a number of requests.
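The chain from cache hit ratio to system count can be made concrete with a rough back-of-the-envelope model. Everything here is an assumption chosen for illustration: average latency is a weighted mix of hit and miss latency, each server handles a fixed number of requests concurrently, and the sample numbers are invented.

```python
import math


def servers_needed(requests_per_sec, hit_ratio, hit_ms, miss_ms, concurrency_per_server):
    """Estimate server count from cache hit ratio (simplified model)."""
    # Average time to answer a request, weighted by how often the cache hits.
    avg_ms = hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms
    # Requests in flight at any moment = arrival rate x average latency (in seconds).
    in_flight = requests_per_sec * avg_ms / 1000
    return math.ceil(in_flight / concurrency_per_server)


# Same load and hardware, two different hit ratios (hypothetical numbers).
with_good_cache = servers_needed(10_000, hit_ratio=0.9, hit_ms=2, miss_ms=50, concurrency_per_server=20)
with_poor_cache = servers_needed(10_000, hit_ratio=0.6, hit_ms=2, miss_ms=50, concurrency_per_server=20)
```

Under these assumptions the 90% hit ratio needs 4 servers and the 60% hit ratio needs 11, which is why splits that keep each server's working set small enough to cache well pay off.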
Splitting up data by geography or customer allows each segment higher availability, scalability, and reliability as problems within one subset will not affect other subsets. In continuous deployment environments, it also allows fragmented code rollout and testing of new features a little at a time instead of an all-or-nothing approach.
This is a quick and dirty breakdown of Scalability Rules that have been applied at thousands of successful companies and provided near infinite scalability when properly implemented. We love helping companies of all shapes and sizes (we have experience with development teams of 2-3 engineers to thousands). Contact us to explore how we can help guide your company to scale your organization, processes, and technology for hyper growth!
September 4, 2018 | Posted By: Dave Swenson
AKF has been kept quite busy over the last decade helping companies move from an on-premise product to a SaaS service - often one of the most difficult transitions a company can face. We have found that the following are the top mistakes made during that migration.
With apologies to David Letterman…
The Top 5 SaaS Migration Mistakes
5. Treat Your SaaS Migration Only as a Marketing Exercise
Wall Street values SaaS companies significantly higher than traditional software companies, typically at double the revenue multiples. A key reason for this is the improved margins that true SaaS companies gain from economies of scale. If you are primarily addressing your customers’ desire to move their IT infrastructure out of their shop, an ASP model hosted in the cloud is fine. However, if your investors or Wall Street are viewing you as a SaaS company, ASP gross margins will not be accepted. A SaaS company is expected to produce in excess of 80% gross margin, whereas an ASP model typically caps out at around 60 or 70%.
How to Avoid?
Make a decision up front on what you want your gross and operating margins to be. This decision will guide how you sell your product (e.g.: highest margins require no code-level customizations), how you architect your systems (multi-tenancy provides greater economies of scale), and even how you release (you, not your customers, control release timing and frequencies). A note of warning: without SaaS margins, you will likely face pricing pressure from an existing or entrant competitor who achieved SaaS margins.
4. Tack the word ‘cloud’ on to your existing on-prem product and host it
Often a direct result of the above mistake, this is an ASP (Application Service Provider) model, not a SaaS one. While this exercise might be useful in exploring some hosting aspects, it won’t truly inform you about what your product and organization need to become in order to successfully migrate to SaaS. It will result in nowhere near the gross and operating margins that true SaaS provides and that your Board and Wall Street expect to see. As discussed in The Many Meanings of Cloud, the danger of tacking the word “Cloud” onto your product offering is that your company will start believing you are a “Contender”, and will stop pushing for true SaaS.
How to Avoid?
Again, if an ASP model is ‘good enough’, fine - just don’t label or market yourselves as SaaS. If you start the SaaS journey with an ASP model, make sure all within your company recognize your ASP implementation is a dead-end, a short-term solution, and that the real endpoint is a true SaaS offering.
3. Target Your Existing Customer Base
Many companies are so focused upon their existing customer base that they forget about entire markets that might be better suited for SaaS. The mistaken perception is
“We need to take customers along with us”,
when in fact, the SaaS reality is
“We need to use the Technology Adoption Lifecycle, compete with ourselves and address a different or smaller customer base first”.
How to Avoid?
Ignore your current customers and find early adopters to target, even if your top customers are pushing you to move to SaaS. A move to true SaaS from your on-premise product almost always requires significant architectural changes. Putting your entire product and code base through these changes will take time.
Instead, apply the Crossing the Chasm Bowling Alley strategy to grow your SaaS offering into your current customer base rather than fork-lifting your current solution in an ASP fashion into the cloud…
Carve out key slices of functionality from your product that can provide value in a standalone fashion, and find early adopters to help shake out your new offering. Even if you are in a ‘laggard’ industry (banking, healthcare, education, insurance), you will find early adopters within your targeted customer base. Seek them out; they will likely be far better partners in your SaaS migration than your existing ones.
2. Ignore Risk Shifts
It may come as a surprise to some within your company to find out how much risk your customers bear in order to host and use your product on-premise. These risks include security, availability, capacity, scalability, and disaster recovery. They also include costs such as software licensing that have been passed through to your customers, but are now yours, part of your operating margins.
How to Avoid?
Many of these “-ilities” are likely to be new disciplines that now must be instilled in your company, some by hiring key individuals (e.g.: a CISO), some through additional focus and rigor during your PDLC. Part of the process of rearchitecting for SaaS includes ensuring you have adequate scalability, are designed with availability in mind, and a production topology that enables disaster recovery. Where vertical scalability might be acceptable in your on-prem world (“just buy a bigger machine”), you now need to ensure you have horizontal scalability, ideally in an elastic form. The cost of proprietary software (e.g.: database licenses) is now yours to carry, and a shift to open source software can significantly improve your margins. These “-ilities” are also known as non-functional requirements (NFRs), and need to be considered with at least as much weight as your functional requirements during your backlog planning and prioritization.
And now, the biggest mistake we see made in SaaS migrations…
1. Underestimate Inertia
Inertia is a powerful force. Over the years spent building up your on-premise capabilities, you’ve almost certainly developed tremendous inertia - defined for our purposes as “a tendency to do nothing or remain unchanged”. In order to achieve true SaaS, in order to satisfy your investors and reach SaaS-like multiples, nearly every part of your company needs to act differently.
How to Avoid?
First, ensure your entire company is ready to embrace change. For many companies, the move to SaaS is the only answer to an existential threat that a known (or unknown) competitor presents, one who is listening to your customers say “Get this IT stuff out of my shop!”. Examples of SaaS disruptors include:
- Salesforce destroying Siebel
- ServiceNow vs. Remedy
- Workday taking on Oracle/PeopleSoft
Is there a disruptor waiting to take over your business?
Many companies choose to disrupt themselves, and after switching to SaaS, drive their stock price through the roof. Look at Adobe’s stock price once they fully embraced (or made their customers embrace) SaaS over packaged software.
Regardless of how you position the importance of SaaS to your employees, there will still be some that are stuck by inertia in the on-prem ways. Either relegate them to stay with the on-prem effort, or ‘weed’ them out.
Once you’ve got some built up momentum and desire within your company to make the move to SaaS, make a concerted examination department-by-department to determine how they will need to change. All involved need to recognize the risk shifts as mentioned, and associated required mindsets. While you likely have excellent, seasoned on-prem employees, do you have enough SaaS experience across each team? The SaaS migration should in no way be treated solely as an engineering exercise.
It always comes as a shock how many departments need to break out of their existing inertia, and act differently. Some examples:
- Can no longer promise code customizations.
- Need to address cloud security concerns.
- Must ensure existing customers know they can no longer dictate when releases occur.
- Learn to speak ARR (annual recurring revenue).
- Should look at alternative revenue schemes (seat vs. utility).
- SaaS presents changes in revenue recognition. Be prepared.
- Should focus on iterative releases that enable product discovery.
- Must learn to balance the NFRs/“-ilities” along with new features.
- Need to consider alternative customers and markets for your new SaaS offering.
- Will likely need to become continually engaged in the PDLC process, in order to stay abreast of releases occurring at far greater frequency than old.
- Must develop (along with engineering) incident management processes that deal with multiple customers simultaneously having issues.
- Better spin up this department in a hurry!
- As no code-level customizations should be happening, you might end up reducing this team, focusing them more on integrations with your customers’ IT or 3rd party products.
Hmm, so much change for this department. Where to start? Start by bringing AKF on board to examine your SaaS migration effort. Here is where we can help you the most.
August 21, 2018 | Posted By: Larry Steinberg
Open Source Software (OSS) is an efficient means to building out solutions rapidly with high quality. You utilize crowdsourced design, development and validation to conveniently speed your engineering. OSS also fosters a sense of sharing and building together - across company boundaries or in one’s free time.
So just pull a library down off the web, build your project, and your company is ready to go. Or is that the best approach? What could be in this library you’re including in your solution which might not be wanted? This code will be running in critical environments - like your SaaS servers, internal systems, or on customer systems. Convenience comes at a price and there are some well known situations of hacks embedded in popular open source libraries.
What is the best approach to getting the benefits of OSS and maintaining the integrity of your solution?
Good practices are a necessity to ensure a high level of security. Just like when you utilize OSS and then test functionality, scale, and load - you should be validating against vulnerabilities. Pre-production vulnerability and penetration testing is a good start. Also, utilize good internal process and reviews. Keep the process simple to maintain speed but establish internal accountability and vigilance on code that’s entering your environment. You are practicing good coding techniques already with reviews and/or peer coding - build an equivalency with OSS.
Always utilize known good repositories and validate the project sponsors. Perform diligence on the committers just like you would for your own employees. You likely perform some type of background check on your employees before making an offer - whether going to a third party or simply looking them up on LinkedIn and asking around. OSS committers pose the same risk to your company - why not do the same for them? Understandably, you probably wouldn’t do this for a third party purchased solution, but your contract or expectation is that the company is already doing this and abiding by minimum security standards. That may not be true for your OSS solutions, and as such your responsibility for validation is at least slightly higher. There are plenty of projects coming from reputable sources that you can rely on.
Ensure that your path to production only draws on artifacts which have been built from internal sources that were either developed or reviewed by your team. Also, be intentional about OSS library upgrades; these should be planned and part of the process.
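One way to enforce that only reviewed artifacts reach production is to record a digest for each build your team has reviewed and to refuse anything that does not match. The sketch below is a minimal illustration, not a prescribed process: the artifact name, the `approved_digests` mapping, and the helper names are all hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_approved(name: str, data: bytes, approved: dict[str, str]) -> bool:
    """True only if the artifact is listed and its digest matches the reviewed build."""
    expected = approved.get(name)
    return expected is not None and sha256_hex(data) == expected


# Hypothetical: bytes of a library version the team reviewed and built internally.
reviewed_build = b"contents of the internally built library"

# Hypothetical allowlist, recorded at review time: artifact name -> digest.
approved_digests = {"example-lib-1.2.0.tar.gz": sha256_hex(reviewed_build)}

print(is_approved("example-lib-1.2.0.tar.gz", reviewed_build, approved_digests))
print(is_approved("example-lib-1.2.0.tar.gz", b"tampered bytes", approved_digests))
print(is_approved("unreviewed-lib.tar.gz", reviewed_build, approved_digests))
```

Because an upgrade produces a new digest, a library version cannot change without someone updating the allowlist, which keeps upgrades a deliberate step.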
OSS is highly leveraged in today’s software solutions and provides many benefits. Be diligent in your approach to ensure you only see the upside of open source.
Need additional help? Contact Us!
August 9, 2018 | Posted By: Greg Fennewald
AKF Partners has worked with over 400 companies in our history and we’ve seen a wide variety of both good and bad things. The rise of server virtualization, the spread of NoSQL in the persistence tier, and the growing prevalence of cloud hosting are some of the technology developments in recent years. In the information security arena, there are several practices that are a good indicator of overall security program efficacy. Do them well and your security program is probably in good shape. Do them poorly – or not at all – and your security program might be headed for trouble.
1. Annual Security Training and Testing
Everyone loathes mandated training topics, especially those that require a defined amount of time be spent on the training (many of which are legislative requirements). There’s no reliable method to make security training fun or enjoyable, so let’s hold our noses and focus on why it is important:
- Testing establishes accountability – people do not want to fail and there should be consequences for failure.
- Security threats change over time – annually recurring training provides a vehicle for updating awareness on current threats. Look through the OWASP Top 10 for several years to see how threats change.
- Recurring training and testing are becoming table stakes – any audit is going to start with asking about your training and awareness program.
2. Security Incident Response Plan
An IRP is not amongst the first few security policies a company needs, but when it is needed, it is needed urgently.
- A security incident is virtually a certainty over a sufficiently large time horizon.
- Similar to parachutes and fire extinguishers, planning and practice dramatically improve results.
- Evolving data privacy regulations, GDPR for instance, are likely to heighten incident disclosure requirements – a solid IRP will address disclosure.
3. Open Source Software Inventory
Open source software inventory? How is that related to security? Many consider OSS inventory as a compliance requirement – ensuring the company complies with the licensing requirements of the open source components used, particularly important if the business redistributes the software. OSS inventory also has security applicability.
- Provides ability to identify risks when open source component vulnerabilities and exploits are disclosed – what’s in your stack and is the latest exploit a risk to your business?
- Most effective when coupled with a policy on how new open source components can be safely utilized.
- Lends itself well to automation and tooling with security resource oversight.
- Efficient, serving two purposes – open source license compliance and security vulnerabilities.
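To make the inventory idea concrete, here is a small sketch of how a disclosure might be checked against what is in your stack. The component names, versions, services, and the `affected` helper are all hypothetical; a real inventory would come from tooling rather than a hand-written list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: tuple[int, int, int]
    license: str
    used_by: str


def affected(inventory, name, fixed_in):
    """Components called `name` whose version is older than the fixed version."""
    return [c for c in inventory if c.name == name and c.version < fixed_in]


# Hypothetical inventory entries.
inventory = [
    Component("libexample", (1, 4, 2), "MIT", "billing-service"),
    Component("libexample", (1, 6, 0), "MIT", "reporting-service"),
    Component("otherlib", (2, 0, 1), "Apache-2.0", "billing-service"),
]

# Hypothetical disclosure: libexample is vulnerable before version 1.5.0.
hits = affected(inventory, "libexample", fixed_in=(1, 5, 0))
for c in hits:
    print(f"{c.used_by} runs {c.name} {'.'.join(map(str, c.version))}")

# The same inventory answers the licensing question.
print(sorted({c.license for c in inventory}))
```

The same records serve both purposes named above: the version field answers the vulnerability question and the license field answers the compliance one.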
What do these three security practices have in common? People – not technology. Firewall rules and the latest intrusion detection tools are not on this list. Many security breaches occur as the result of a human error leading to a compromised account or improper system access. Training and testing your people on the basics, having a plan on how to respond should an incident occur, and being able to know if an open source disclosure affects your risk profile are three human-focused practices that help establish a security-minded culture. Without the proper culture, tools and automation are less likely to succeed.
Looking for additional help? Contact us!
August 2, 2018 | Posted By: Marty Abbott
The movement to SaaS specifically, and more broadly “Anything” (X) as a Service (XaaS) is driven by demand side (buyer) forces. In early cases within any industry, the buyer seeks competitive advantage over competitors. The move to SaaS allows the buyer to focus on core competencies, increasing investments in the areas that create true differentiation. Why spend payroll on an IT staff to support ERP solutions, mail solutions, CRM solutions, etc when that same payroll could otherwise be spent on engineers to build product differentiating features or enlarge a sales staff to generate more revenue?
As time moves on and as the technology adoption lifecycle advances, the remaining buyers for any product feel they have no choice; the talent and capabilities to run a compelling solution for the company simply do not exist. As such, the late majority and laggard adopters are almost “forced” into renting a service over purchasing software.
Whether for competitive reasons, as in the case of early adopters through the early majority, or for lack of alternatives as in the case of late majority and laggards, the movement to SaaS and XaaS represents a shift in risk as compared to the existing purchased product options. This shift in risk is very much like the shift that happens between purchasing and leasing a home.
Renting a home or an apartment is almost always more expensive than owning the same dwelling. The reason for this should be clear: the person owning the property expects to make a profit beyond the costs of carrying a mortgage and performing upkeep on the property over the life of the owner’s investment. There are special “inversion” cases where renting is less expensive, such as in a low rental demand market, but these cases tend to reset the market ownership prices (house prices fall) as rents no longer cover mortgages or ownership does not make sense.
Conversely, ownership is almost always less expensive than renting or leasing. But owners take on more risk: the risk of maintenance activities; the risk of market prices; the risk and costs associated with remodeling to make the property attractive, etc.
The matrix below helps put the shift described above into context.
A customer who “owns” an on-premise solution also “owns” a great deal of risk for all of the components necessary to achieve their desired outcomes: equipment, security, power, licenses, the “-ilities” (like availability), disaster recovery, release management, and monitoring of the solution. The primary components of this risk include fluctuation in asset valuation, useful life of the asset, and most importantly – the risk that they do not have the right skills to maximize the value creation predicated on these components.
A customer who “rents” a SaaS solution transfers most of these risks to a provider who specializes in the solution and therefore should be better able to manage the risk and optimize outcomes. In exchange, the customer typically pays a risk premium relative to ownership. However, given that the provider can likely operate the solution more cost effectively, especially if it is a multi-tenant solution, the risk premium may be small. Indeed, in extreme cases where the company can eliminate headcount, say after eliminating all on-premise solutions, the lessee may experience an overall reduction in cost.
But what about the provider of the service? After all, the “old world” of simply slinging code and allowing customers to take all the risk was mighty appealing; the provider enjoyed low costs of goods sold (and high gross margins) and revenue streams associated with both licensing and customization. The provider expects to achieve higher revenue from the risk premium charged for services. The provider also expects higher overall margins through greater efficiencies in running solutions with significant demand concentration at scale. The risk premium more than compensates the provider for the increased cost of goods sold relative to the on-premise business. Overall, the provider assumes risk for greater value creation. Both the customer and the provider win.
Architecture and product financial decisions are key to achieving the margins above.
Gross margins are directly correlated with the level of tenancy of any XaaS provider (Y axis). As such, while we want to avoid “all tenancy” for availability reasons, we desire a high level of tenancy to maximize equipment utilization and increase gross margins. Other drivers of gross margins include the level of demand upon shared components and the level of automation on all components – the latter driving down cost of labor.
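A rough way to see the tenancy effect is to hold the cost of a stack fixed and vary how many tenants share it. The figures below are invented for illustration, and the model is deliberately simplified: it assumes stack cost does not grow with tenant count, which is only approximately true in practice.

```python
def gross_margin(tenants_per_stack: int,
                 revenue_per_tenant: float,
                 cost_per_stack: float) -> float:
    """(revenue - COGS) / revenue for one stack shared by several tenants."""
    revenue = tenants_per_stack * revenue_per_tenant
    return (revenue - cost_per_stack) / revenue


# Hypothetical monthly figures: 1,000 revenue per tenant, 800 cost per stack.
for tenants in (1, 5, 20, 100):
    margin = gross_margin(tenants, 1000.0, 800.0)
    print(f"{tenants:>3} tenants per stack -> gross margin {margin:.1%}")
```

Under these assumptions the margin climbs quickly with the first few tenants and then flattens, so most of the margin benefit arrives well before "all tenancy", the level avoided for availability reasons.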
The X axis of the chart above shows the operating expense associated with various business models. Multi-tenant XaaS offerings collapse the number of “releases supported in the wild” – reducing the operating expense (and increasing gross margins) associated with managing a code base.
Another way of viewing this is to look at the relative costs of software maintenance and administration costs for various business models.
Plotted in the “low COGS” (X axis), “low Maintenance” quadrant of the figure above is “True XaaS”. Few versions of a release reduce our cost to maintain a code base, and high equipment utilization and automation reduce our cost to provision a service.
In the upper right and unattractive quadrant is the ASP (Application Service Provider) model, where we have less control over equipment utilization (it is typically provisioned for individual customers) and less control over defining the number of releases.
Hosting a solution on-premise to the customer may reduce our maintenance fees, if we are successful in reducing releases, but significantly increases our costs. This is different than the on-premise model (upper left) in which the customer bears the cost of equipment but for which we have a high number of releases to maintain. The XaaS solution is clearly beneficial overall to maximize margins.
AKF Partners helps companies transition on-premise, licensed software products to SaaS and XaaS solutions. Let us help you on your journey.
July 24, 2018 | Posted By: Marty Abbott
Over a decade of helping on-premise and licensed software companies through the transition to “Something as a Service” – whether that be Software (SaaS), Platform (PaaS), Infrastructure (IaaS), or Analytics (AaaS) – has given us a rather unique perspective on the various phases through which these companies transition. Very often we find ourselves in the position of a counselor, helping them recognize their current phase and making recommendations to deal with the cultural and operational roadblocks inherent to that phase.
While rather macabre, the phases somewhat resemble those of grieving after the loss of a loved one. The similarities make some sense here, as very often we work with companies who have had a very successful on-premise and/or licensed software business; they dominated their respective markets and “genetically” evolved to be the “alphas” in their respective areas. The relationship is strong, and their past successes have been very compelling. Why would we expect anything other than grieving?
But to continue to evolve, succeed, and survive in light of secular forces these companies must let their loved one (the past business) go and move on with a new and different life. To be successful, this life will require new behaviors that are often diametrically opposed to those that made the company successful in their past life.
It’s important to note that, unlike the grieving process with a loved one, these phases need not all be completed. The most successful companies, through pure willpower of the management team, power through each of these quickly and even bypass some of them to accelerate to the healing phase. The most important thing to note here is that you absolutely can move quickly – but it requires decisive action on the part of the executive team.
Phase 1: Denial This phase is characterized by the licensed/on-premise software provider completely denying a need to move to an “X” (something) as a Service (XaaS, e.g. SaaS, PaaS) model.
Commonly heard quotes inside the company:
- “Our customers will never move to a SaaS model.”
- “Our customers are concerned about security. IaaS, SaaS and PaaS simply aren’t an option.”
- “Our industry won’t move to a Cloud model – they are too concerned about ownership of their data.”
- “To serve this market, a solution needs to be flexible and customizable. Proprietary customer processes are a competitive advantage in this space – and our solution needs to map exactly to them.”
Reinforcing this misconceived belief is an executive team, a sales force, and a professional services team trapped in a prison of cognitive biases. Hypothesis myopia and asymmetric attention (both forms of confirmation bias) lead to psychological myopia. In our experience, companies with a predisposed bias will latch on to anything any customer says that supports the notion that XaaS just won’t work. These companies discard any evidence, such as pesky little startups picking up small deals, as aberrant data points.
The inherent lack of paranoia blinds the company to the smaller companies starting to nibble away at the portions of the market that the successful company’s products do not serve well. Think Siebel in the early days of Salesforce. The company’s product is too expensive and too complex to adequately serve the smaller companies beneath them. The cost of sales is simply too high, and the sales cycle too long to address the needs of the companies adopting XaaS solutions. In this phase, the company isn’t yet subject to the Innovator’s Dilemma as the blinders will not let them see it.
Ignorance is bliss…for a while…
How to Avoid or Limit This Phase
Successful executive teams identify denial early and simply shut it down. They establish a clear vision and timeline to move to the delivery of a product as a service. As with any successful change initiative, the executive team creates, as a minimum:
- The compelling reason and need for change. This visionary element describes the financial and operational benefits in clear terms that everyone can understand. It is the “pulling” force necessary to motivate people through difficult times.
- A sense of fear for not making the change. This fear becomes the “stick” to the compelling “carrot” above. Often given the secular forces, this fear is quite simply the slow demise of the company.
- A “villain” or competitor. As is the case in nearly any athletic competition, where people perform better when they have a competitor of equivalent caliber (vs say running against a clock), so do companies perform better when competing against someone else.
- A “no excuses” cultural element. Everyone on the team is either committed to the result, or they get removed from the team. There is no room for passive-aggressive behavior, or behaviors inconsistent with the desired outcome. People clinging to the past simply prolong or doom the change initiative. Fast success, and even survival, requires that everyone be committed.
Phase 2: Reluctant but Only Partial Acceptance
This phase typically starts when a new executive, or sometimes a new and prominent product manager, is brought into the company. This person understands at least some of – and potentially all of – the demand side forces “pulling” XaaS across the curve of adoption, and notices the competition from below. Many times, the Innovator’s Dilemma keeps the company from attempting to go after the lower level competitors.
Commonly heard quotes inside the company:
- “Those guys (referring to the upstarts) don’t understand the complexities of our primary market – the large enterprise.”
- “There’s a place for those products in the SMB and SME space – but ‘real’ companies want the security of being on-premise.”
- “Sure, there are some companies entertaining SaaS, but it represents a tiny portion of our existing market.”
- “We are not diluting our margins by going down market.”
The company embarks upon taking all their on-premise solutions and hosting them, nearly exactly as implemented on-premise, as a “service”.
Many of the company’s existing customers aren’t yet ready to migrate to XaaS, but discussions are happening inside customer companies to move several solutions off-premise including email, CRM and ERP. These customers see the possibility of moving everything – they are just uncertain as to when.
How to Avoid or Limit This Phase
The answer for how to speed through or skip this phase is not significantly different than that of the prior phase. Vision, fear of death, a compelling adversary, and a “no excuses” culture are all necessary components.
Secular forces drive customers to seek a shift of risk. This shift is analogous to why one would rent instead of owning a home. The customer no longer wants the hassle of maintenance, nor are they truly qualified to perform that maintenance. They are willing to accept some potential increase in costs as capex shifts to opex, to relieve themselves of the burden of specializing in an area for which they are ill-equipped.
If not performed during the Denial phase, now is the time to remove executives who display behaviors inconsistent with the new desired SaaS Principles.
Phase 3: Pretending to Have an Answer
The pretending phase starts with the company implementing essentially an on-premise solution as a “SaaS” solution. With small modifications, it is exactly what was shipped to customers before but presented online and with a recurring revenue subscription model. We often like to call this the “ASP” or “Application Service Provider” model. While the revenue model of the company shifts for a small portion of its revenue to recurring services fees, the solution itself has not changed much. In fact, for older client-server products Citrix or the like is often deployed to allow the solution to be hosted.
The product soon displays clear faults including lower than desired availability, higher than necessary cost of operations, higher than expected operating expenses, and lower margins than competitors overall. Often the company will successfully hide these “SaaS” results as a majority of its income from operations still comes from on-premise solutions.
The company will often use nebulous terms like “Cloud” when describing the service offering to steer clear of direct comparisons with other “born SaaS” or “true SaaS” solutions. Sometimes, in extreme cases, the company will lie to itself about what they’ve accomplished, and it will almost always direct the conversation to topics seen as differentiating in the on-premise world rather than address SaaS Principles.
Commonly heard quotes inside the company:
- “Our ‘Cloud’ solution is world class – it’s the Mercedes of all solutions in the space with more features and functionality than any other solution.”
- “The smaller guys don’t have a chance. Look at how long it will take them to reach feature parity. The major players in the industry simply won’t wait for that.”
- “We are the Burger King of SaaS providers – you can still have it your way. And we know you need it your way.”
Meanwhile, customers are starting to look at true SaaS solutions. They tire of availability problems, response time issues, the customization necessary to get the solution to work in a suitable fashion and the lack of configurability. The lead time to implementation is still too long.
Sales people continue to sell the product the same way; promising whatever a customer wants to get a sale. Engineers still develop the same way, using the same principles that made the company successful on-premise and completely ignorant of the principles necessary to be successful in the SaaS world.
How to Avoid or Limit This Phase
It’s not completely a bad thing to launch, as a first step, a “hosted” version of a company’s licensed product. But the company must understand internally that it is only an interim step.
In addition to the visionary and behavioral components of the previous phases, the company now must accept and be planning for a smaller functionality solution that will be more likely adopted by “innovators” and “early majority” companies. The concept of MVP relative to the Technology Adoption Lifecycle is important here.
Further, the company must be aggressively weeding product, sales, and technology executives who lack the behaviors or skills to be successful and seeding the team with people who have “done this before”. Sales teams must act similarly to used car sales people in that they can only sell “what is on the lot” that will fit customer “need”, as compared to new car sales people who can promise options and colors from the factory (“It will take more time”) that more precisely fit a customer’s “want”.
Phase 4: Fear
The company loses its first major deal or two to a rival product that appears to be truly SaaS and abides by SaaS Principles. They realize that their ASP product simply isn’t going to cut it, and Sales people are finally “getting” that the solution they have simply won’t work. The question is: Is the company too late? The answer depends on how long it took the company to get to this position. A true SaaS solution is at the very least months away and potentially years away. If the company moves initially right to the “Fear” stage and properly understands the concepts behind the TALC, they have a chance.
Commonly heard quotes inside the company:
- “We’re screwed unless we get this thing re-architected in order to properly compete in the XaaS space.”
- “Stop behaving like we used to – stop promising customizations. That’s not who we are anymore.”
- “The new product needs to do everything the old product did.” [Incorrect and prone to failing]
- “Think smaller, and faster. Think some of today’s customers – not all of them – for the first release. Understand MVP is relative to the TALC.” [Correct and will help drive success]
How to Avoid or Limit This Phase
This is the most easily avoided phase. With proper planning and execution in prior phases, a company can completely skip the fear stage.
When companies find themselves here, it’s typically because they have not addressed the talents and approach of their sales, product, and engineering teams. Sales behaviors must change to only sell what’s “on the car lot”. Engineers must understand how to build the “-ilities” into products from day 1. Product managers must switch to identifying market need, rather than fulfilling customer want. Executives must now run an entirely different company.
Final Phase: Healing or Marginalization
Companies successful enough to achieve this phase do so only through launching a true XaaS product – one that abides by XaaS principles built and run by executives, managers and individual contributors who truly understand or are completely wedded to learning what it means to be successful in the XaaS world.
The phases of grief are common among many of our customers. But unlike grieving for a loved one, they are not necessary. Quick progression, or better yet avoidance, of these phases can be accomplished by:
- Establishing a clear and compelling vision based on market and secular force analysis, and an understanding of the technology adoption lifecycle. As with any vision, it should not only explain the reason for change, but the positive long-term financial impact of change.
- Ensuring that everyone understands the cost of failure, helping to instill some small level of fear that should help drive cultural change.
- Ensuring that a known villain or competitor exists, against which we are competing to help boost performance and speed of transition.
- Aggressively addressing the cultural and behavioral changes necessary to be successful. Anyone who is not committed and displaying the appropriate changes in behavior needs to be weeded from the garden.
This shift often results in a significant portion of the company departing – sometimes willingly and sometimes forcefully. Some people don’t want to change and can find other companies (for a while) where their skills and behaviors are relevant. Some people have the desire, but may not be capable of changing in the time necessary to be successful.
Image Credit: Tim Gouw from Pexels
July 24, 2018 | Posted By: Greg Fennewald
Contemplating how to scale your website is not often a paramount concern in the early days when the focus is on time to market and core functionality. Growth over time will eventually force you to architect a solution to get ahead of the growth and deliver the availability and scalability the business demands. The AKF Scale Cube is one way to frame the scalability challenge.
The AKF Scale Cube has three axes by which a website can be scaled. X axis, or horizontal duplication, is a common first choice. Duplicating web and app servers into load balanced pools is a best practice that provides the ability to conduct maintenance or roll code without taking the site down as well as scalability if the pools are N+1 or even N+2.
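The N+1 and N+2 sizing mentioned above is simple arithmetic: N is the number of servers needed to carry peak load, and the extra servers are what let you pull one out for maintenance or a code roll. A minimal sketch, with hypothetical traffic and per-server capacity figures:

```python
import math


def pool_size(peak_rps: float, rps_per_server: float, spares: int = 1) -> int:
    """Servers needed for peak load (N) plus `spares` extra (N+1, N+2, ...)."""
    n = math.ceil(peak_rps / rps_per_server)
    return n + spares


# Hypothetical: 4,500 requests/second at peak, 1,000 per server.
print(pool_size(4500, 1000))            # N = 5, so N+1 = 6
print(pool_size(4500, 1000, spares=2))  # N+2 = 7
```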
The Y axis is a service split - decomposing a monolithic code base into smaller services that can run independently. The Y axis also allows you to scale your technology team - teams focus on the services for which they are responsible and no longer need complete domain expertise. Y axis does require a lot of development work though, competing for resources the business wants to use for new features.
You can read more about the Scale Cube and the X and Y axes here.
The third axis is Z, a lookup oriented split. A common choice is geographic location or customer identity. A Z axis split takes a monolithic stack and slices it into N smaller stacks, using fewer or smaller servers and working with 1/Nth of the data.
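A lookup oriented split can be pictured as a routing table consulted before a request reaches any stack. The sketch below uses geographic location as the key; the region codes and stack names are hypothetical, and a split by customer identity would look the same with a different key.

```python
# Hypothetical routing table: region code -> the stack that holds its data.
REGION_TO_STACK = {
    "us": "stack-us",
    "eu": "stack-eu",
    "apac": "stack-apac",
}


def route(region: str) -> str:
    """Return the stack serving a region, failing loudly for unknown regions."""
    try:
        return REGION_TO_STACK[region]
    except KeyError:
        raise ValueError(f"no stack serves region {region!r}") from None


print(route("eu"))
print(len(REGION_TO_STACK), "stacks, each holding only its own region's data")
```

Each stack sees only the requests and data for its own key values, which is why it can run on fewer or smaller servers than the monolithic stack it replaces.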
A case can be made that a Z axis split is your best option when X axis is losing effectiveness and infrastructure costs are becoming unsustainable. Consider this situation: you have already implemented an X axis split by deploying load balanced pools of web and app servers. You’ve gone a step further by deploying read-only DB replicas to handle reporting workloads, preserving compute power for the write DB. It’s not enough though; your production DB is wheezing after going up one flight of stairs.
The business intentionally took on technical debt in the form of stored procedures for business logic early on to improve time to market. Development resources are now slowly removing those and writing the business logic in the app tier. New features are still needed so there are no resources to begin on the development work needed for a Y axis split.
X axis is running out of steam and increasing costs, and you do not have resources to work on a Y axis split in the near term. Z axis can save the day and provide some breathing room. Z axis does not take as much development work as Y does, and the X axis infrastructure work your team has already done will be similar to building smaller Z axis stacks. Your team develops cookie cutter stacks via automation and scripting that handle 2,500 customers well. You start a new stack when the previous one reaches 2,000 customers to leave some headroom for data growth. These smaller stacks reduce your growth costs as compared to X axis alone. The business has time to complete the stored procedure removal before turning towards Y axis service splits, all the while delivering on the feature roadmap. You keep your job.
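The stack-filling rule in this scenario can be sketched in a few lines. The 2,500 and 2,000 figures come from the example above; the function names and the idea of numbering customers in sign-up order are assumptions made for illustration.

```python
STACK_CAPACITY = 2500     # customers a stack handles well (from the scenario)
OPEN_NEW_STACK_AT = 2000  # start a new stack once the current one reaches this


def stack_index(customer_number: int, threshold: int = OPEN_NEW_STACK_AT) -> int:
    """Zero-based stack for the Nth customer, counting sign-ups from 1."""
    return (customer_number - 1) // threshold


def stacks_needed(total_customers: int, threshold: int = OPEN_NEW_STACK_AT) -> int:
    """Number of stacks required to place every customer (ceiling division)."""
    return -(-total_customers // threshold)


print(stack_index(2000))    # 0: the 2,000th customer still fits the first stack
print(stack_index(2001))    # 1: the next customer opens the second stack
print(stacks_needed(9000))  # 5
print(STACK_CAPACITY - OPEN_NEW_STACK_AT, "customers of headroom per stack")
```

Growth then becomes a matter of stamping out another identical stack rather than enlarging the shared database.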