June 28, 2018 | Posted By: Pete Ferguson
In the technical due diligence reviews we conduct for investment firms, we see five common mistakes in young startups and well-seasoned companies alike:
Lack of Security Mindset During Development: Security gets a bad rap for being overkill, costly, and a hindrance to fast growth when security teams fall into the “No It All” philosophy - reflexively saying “No” without weighing the revenue they halt or hinder in doing so. This tends to happen when organizations do not share the same risk-related goals, or when the security organization feels it is responsible only for risk and not for maximizing revenue at an appropriate level of risk. Good security requires a team effort and is difficult to bolt onto a well-oiled machine as an afterthought. It is much easier to address at the onset of writing code: including security checks in automated testing and QA ensures security is baked in. Hold developers responsible for security, and hold security responsible for enabling revenue while protecting the company with a common-sense approach.
Failing to Separate Duties: As small companies grow larger, everyone often continues to have access to everything, and many original employees wear multiple hats. Ensuring that no one individual controls a change from development through production also gains points in other areas, such as business continuity. Separation of duties does not just exist between two employees - the separation can also be created by automation, as is the case in many successful continuous deployment/delivery shops deploying directly into production. Automated testing additionally helps with code compliance and quality assurance. Automate access control by role wherever possible, and regularly have business owners review and sign off on access control lists (at least monthly). My colleague James Fritz goes into greater detail in a separate article.
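To make the role-based idea concrete, here is a minimal sketch of automated, role-based access control - the roles, permissions, and function names are hypothetical, and a real system would back this with a directory service or IAM tool:

```python
# Hypothetical role-to-permission mapping; in production this would live in
# an IAM system or directory service, not in application code.
ROLE_PERMISSIONS = {
    "developer": {"read_code", "write_code", "deploy_staging"},
    "release_manager": {"deploy_production"},
    "auditor": {"read_logs"},
}

def can(role: str, permission: str) -> bool:
    """Return True if the given role grants the requested permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

def access_review(assignments: dict) -> list:
    """Produce a (user, role, permission) list for business owners
    to review and sign off on - e.g., monthly."""
    return [
        (user, role, perm)
        for user, role in sorted(assignments.items())
        for perm in sorted(ROLE_PERMISSIONS.get(role, set()))
    ]
```

Generating the review list automatically makes the monthly sign-off a reporting exercise rather than a manual audit.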
Not Segregating and Encrypting Sensitive Data At Rest: Encrypting all data at rest may not make sense, but segregating all personally identifiable information (PII), financial, medical, and any other sensitive or confidential information into a separate, encrypted database is a better plan of attack. Even if you are not subject to PCI, HIPAA, or other regulations, limiting exposure of your customer and company confidential information is a best practice. You can add further protection by tokenizing the information wherever possible. When there is a security breach (in today’s climate it is probably safe to say “when,” not “if”), it is very hard to explain to your customers why you didn’t encrypt their sensitive data at all times. Given recent headlines, encryption at rest is now entry-level security table stakes and a safeguard required by your customers - no longer an optional nice-to-have.
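As a simple illustration of tokenization, here is a minimal sketch - the token format is an assumption, and the in-memory dictionary stands in for the separate, encrypted datastore you would use in production:

```python
import secrets

class TokenVault:
    """Sketch of a token vault: sensitive values are replaced by random
    tokens, and the real values live in a segregated store. The dict here
    stands in for an encrypted, access-controlled datastore."""

    def __init__(self):
        self._vault = {}  # token -> sensitive value; encrypt at rest for real

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(16)  # token carries no PII
        self._vault[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._vault[token]

vault = TokenVault()
token = vault.tokenize("4111-1111-1111-1111")  # e.g., a card number
# The main application database stores only the token.
```

Because the token is random, a breach of the main database exposes no sensitive values; the attacker would also need the segregated vault.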
Checklist Only Mentality: In our experience, many auditors have until recently been focused primarily on checklist compliance - but times are changing, and the true test of compliance is moving from a checklist and certification to explaining your most recent data breach to your customers and stakeholders. Constantly working to safeguard and serve your customers likely means you will easily fall within current and future security requirements, or can get there quickly. It is much easier to design security into your products now than to be forced into it later by a misstep - and it will do a lot more for customer adoption and retention.
This is a summary of just five common findings - there are certainly many others. The common denominator among successful companies is that they think holistically about their customers by automatically building security into their products, and as a result can scale and expand into new market segments more readily. Building in security as part of a holistic approach also addresses business continuity, disaster recovery, resiliency, the ability to roll back code, and more.
Under the Hood - Our Security Questions for Technical Due Diligence
In our assessments, we cover each of the areas below, using these questions as guidelines for conversation - not a point-by-point Q&A. These are not a yes/no checklist; we rank the target company against other similarly sized clients and industry averages. Each question receives a ranking from 1 to 4, with 4 being the highest score, and we then graph our findings against similar and competing companies within the market segment.
- Is there a set of approved and published information security policies used by the organization?
- Has an individual who has final responsibility for information security been designated?
- Are security responsibilities clearly defined across teams (i.e., distributed vs completely centralized)?
- Are the organization’s security objectives and goals shared and aligned across the organization?
- Has an ongoing security awareness and training program for all employees been implemented?
- Is a complete inventory of all data assets maintained with owners designated?
- Has a data categorization system been established and classified in terms of legal/regulatory requirements (PCI, HIPAA, SOX, etc.), value, sensitivity, etc.?
- Has an access control policy been established which allows users access only to network and network services required to perform their job duties?
- Are the access rights of all employees and external party users to information and information processing facilities removed upon termination of their employment, contract or agreement?
- Is multi-factor authentication used for access to systems where the confidentiality, integrity or availability of data stored has been deemed critical or essential?
- Is access to source code restricted to only those who require access to perform their job duties?
- Are the development and testing environments separate from the production/operational environment (i.e., they don’t share servers, are on separate network segments, etc.)?
- Are network vulnerability scans run frequently (at least quarterly) and vulnerabilities assessed and addressed based on risk to the business?
- Are application vulnerability scans (penetration tests) run frequently (at least annually or after significant code changes) and vulnerabilities assessed and addressed based on risk to the business?
- Are all data classified as sensitive, confidential or required by law/regulation (i.e., PCI, PHI, PII, etc.) encrypted in transit?
- Is testing of security functionality carried out during development?
- Are rules regarding information security included and documented in code development standards?
- Has an incident response plan been documented and tested at least annually?
- Are encryption controls being used in compliance with all relevant agreements, legislation and regulations? (i.e., data in use, in transit and at rest)
- Do you have a process for ranking and prioritizing security risks?
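As an illustration of how the 1-to-4 rankings above might be aggregated, consider this minimal sketch - the question areas, scores, and industry average are all hypothetical:

```python
# Hypothetical 1-4 scores for a target company, keyed by question area.
scores = {
    "security_policies": 3,
    "security_ownership": 4,
    "data_inventory": 2,
    "access_control": 3,
    "incident_response": 1,
}

# Average score across all areas for this target.
average = sum(scores.values()) / len(scores)

# Compare against a hypothetical average for similarly sized companies
# in the same market segment.
industry_average = 2.8
gap = average - industry_average  # negative gap = below the segment average
```

Graphing each area's score against the segment average highlights where the target lags its peers, which is more informative than a pass/fail checklist.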
June 26, 2018 | Posted By: James Fritz
For many companies, the thought of being audited is never a fun one - especially when most audits focus on documentation that is slow to react to the ever-changing environment of the technology sector. If your company is attempting to adopt Agile methods but falls under strict regulations for the Payment Card Industry (PCI), the Health Insurance Portability and Accountability Act (HIPAA), or Sarbanes-Oxley (SOX), there is usually a high level of trepidation about fully adopting Agile in your development. One of the biggest pitfalls companies run into is thinking that Separation of Duties (SoD) cannot be achieved in an Agile environment.
What is Separation of Duties?
In short, SoD is ensuring that no one person holds all the keys to enact a change. If a developer can create a product and push it into production with no checks, that is a violation of SoD. The two basic goals of SoD are the prevention of fraud and the detection of errors. By having checks and balances along the way, your company is better protected against the fraud and errors that naturally occur wherever humans intervene.
This separation is key for any of the above-mentioned standards - PCI, HIPAA or SOX. An excerpt from the PCI Data Security Standards states,
“In environments where one individual performs multiple roles (for example application development and implementing updates to production systems), duties should be assigned such that no one individual has end-to-end control of a process without an independent checkpoint.”
All three regulations share similar language: no one person can have control from the beginning to the very end. But that is the inherent beauty of a fully adopted Agile system - no one person has complete control. The system relies heavily upon automation.
If you look at a typical Agile development process (see graphic below), there are still clear timelines and interjections of human input on top of the vast automated input. This creates a system that brings a committee of personnel together to achieve certain goals every sprint. When automation is laid on top of the human-based work, another layer of checks and balances is established. And this automation is not just a random compiler pulled off the internet - it has agreed-upon criteria and testing mechanisms that have been refined over time by a team that documents what is being tested against. So if an auditor were to ask, “How do you check against X?”, you would have a clear answer from the documentation that “X” is checked and logged at this stage in the process.
If your developers utilize proper version control and your automated systems are documented, you have a higher chance of catching any changes, whether purposeful or accidental, than you would in another development life cycle model. Left purely up to humans, distraction and loss of focus would create more errors than a more highly automated system would. Agile practices can help reduce human error.
Additionally, if you are looking to catch fraud, you can implement more monitoring of your privileged developers specifically. Between this monitoring and the automated testing, it should be easy to catch one of your developers going outside the scope they are supposed to be working in, thus protecting any sensitive data covered by PCI, HIPAA or SOX regulations. Not only will this automation and monitoring help stop fraud internally, it can also help detect outside actors attempting to gain access to your sensitive data.
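One way such an automated, independent checkpoint might be expressed in code - a minimal sketch, with field names that are assumptions for illustration rather than any particular tool's API:

```python
def sod_gate(change: dict) -> bool:
    """Sketch of a Separation-of-Duties gate for a deployment pipeline:
    a change may only be promoted to production when the approver differs
    from the author and the automated test suite has passed."""
    author = change.get("author")
    approver = change.get("approver")
    return bool(
        author is not None
        and approver is not None
        and author != approver             # no end-to-end control by one person
        and change.get("tests_passed")     # automated checks, logged for auditors
    )
```

Every evaluation of such a gate can be logged, giving auditors the documented, independent checkpoint the regulations ask for.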
Do I actually need Separation of Duties?
Yes. And if you are utilizing an Agile framework, you already have that separation. Everything about creating an Agile environment lends itself to highly auditable, loggable activity - the same activity that is required for SoD. Checks and balances are achieved when you create, document, and continually refine the process by which your company delivers viable product sprint after sprint. Auditors are quick to say that Agile doesn’t support the necessary separation, usually because there isn’t a single person they can put their finger on who performs a certain activity. Instead, there is a refined and documented process, heavily monitored and logged, that serves as the independent checkpoint they require to ensure that no individual has complete end-to-end control.
If you have any further questions on how your Agile environment follows Separation of Duties, feel free to contact AKF.
June 22, 2018 | Posted By: Marty Abbott
Attempting to migrate on-premise, licensed products to a recurring revenue SaaS (XaaS) solution is perhaps one of the most difficult things a company can do. As Dave Swenson writes in SaaS Migration Challenges, changes are often necessary to the fabric of the company, including the company culture and mindset.
In many cases, the principles that make a company successful in the on-premise, packaged software model are insurmountable barriers to success in XaaS. As an example of the difficulty of this transition, consider one group of opposing principles endemic to SaaS transition: building to customer want (licensed model) versus building to market need (SaaS model). For years, on-premise providers were successful by filling their product backlog with customer requests (or “wants”). When requests were specific to a single customer and didn’t appear to be extensible to a broader base, a revenue stream associated with professional services was created. Professional services, or a system integrator, would make modifications to source code creating a unique “version” for that customer. This approach led to bloated products, oftentimes with a single customer using less than 30% of a product’s capabilities. Moreover, release “skew” developed, increasing the cost of maintenance of multiple releases in “the wild”.
Contrast this with a product approach that attempts to “discover” the intersection of true market need. Product backlogs are built primarily on thorough market analysis coupled with experimentation. Customer asks (or “wants”) help inform prioritization, but are not the primary driver of product backlog. Sales organizations focus on selling the existing product rather than driving it, and the role of professional product managers emerge. This notion of “want” vs. “need” is critical to the success of a company’s transformation.
The difficulty of such a journey from “want” in the licensed product world to “need” in the XaaS world should be clear to everyone. If the difficulty is high in a transition, imagine the stressful forces a company must endure to live in both worlds for an extended period of time. Sales organizations that build upon selling a customer “whatever they desire” often revolt at being relegated to selling only what exists. Product management organizations that previously acted like “business analysts” taking requirements, now need to perform real market analysis, understand discovery and experimentation and get “closer to the market”. How does one build a company that can be successful in both worlds at once? Difficult indeed given the chasm between one approach and the other.
The figures below indicate a handful of the most common opposing forces as examples of how companies need to “reinvent themselves” and think differently to be successful as a XaaS provider.
Purchase versus Lease
At the root of many of the differences between Licensed/On-Premise products (or licensed products that are hosted in ASP-like models) and true XaaS products is the notion of what the customer expects with the sales transaction. In the licensed model, customers want outcomes but understand they are purchasing software. The customer understands that much of the risk of running the software including the “ilities” like availability and scalability are borne by the customer.
In the SaaS world, the customer mindset is that of leasing outcomes. Comparatively low switching costs (compared to on-premise) allows the customer to more easily move should they not achieve their desired outcomes. As such, service providers must move closer to the customer and better understand, in real time, customer expectations and the product’s performance to those expectations.
To highlight this difference, consider renting a home versus owning one. In most markets, the total cost of rent exceeds the combination of the amortized home value over an equivalent period and the home’s residual (resale) value - renting is more expensive. But a majority of the risk in renting rests with the lessor, including home maintenance, risk of catastrophic loss, etc. The same is true of SaaS compared to owning a product: SaaS requires the service provider to take on significantly greater risk than selling a product to a customer does.
Want versus Need
Henry Ford perhaps said it best: “If I’d asked customers what they want, they would have told me a faster horse.” Steve Jobs agreed with Ford, indicating that people don’t know what they want until you show it to them. Here Steve is referring to true “need” vs. a stated customer “want”. Licensed products often start with customer wants. Sales teams garner desires and either force them into product backlogs or sell modifications through professional services. Because the customer purchases the software, the company isn’t as incented as it would be in the SaaS model to ensure that the customer gets value from what they purchased. The product development lifecycle, regardless of how it is represented to the customer, is almost always in fact waterfall.
XaaS companies attempt to “find” (aka discover) market need. Product managers analyze markets, establish goals, create hypotheses to achieve these goals, and co-create solutions to test these hypotheses with their engineering teams. What a customer may indicate as a “want” may help inform prioritization of the backlog especially if there is an intersection in customer desired outcomes.
Operations Afterthought versus Designed for Efficient Operations
In the on-premise world, the customer is responsible for the cost of operations and for the effective and efficient operation of any solution. Considerations such as monitoring for early fault and incident detection are at best afterthoughts in on-premise software development, giving rise to an industry specializing in aftermarket monitoring. The reasoning behind this approach is clear - time spent producing efficient, easily maintained solutions takes away from new feature development and the new revenue associated with those features.
In the SaaS world, we are responsible for our own cost of operations and therefore building products that operate efficiently with low cost of goods sold is important to achieving high gross margins and profit margins. Furthermore, as we are responsible for all the “ilities” (described below), we must design our solutions to be monitored in real time to detect errors, faults, problems and incidents.
Revolutionary versus Evolutionary
Put simply, on-premise models rely on “fork-lift” major product upgrades that often take significant effort, downtime and customer expense to implement. The latter is often seen as a bonus to the provider of software, as upgrades can become an additional revenue stream for professional services both in assisting with the upgrade and re-implementing customizations lost during the upgrade.
XaaS products simply can’t afford the downtime or cost associated with revolutionary implementations. As such, data models and functionality evolve quickly but iteratively, with the capability to completely roll back or “undo” both schema modifications and functionality additions.
Many Releases versus Single Release
Because on-premise companies can rarely force patch or release adoption across their installed base (they typically have no access to these environments), the number of releases they support may be in the 100s or even 1000s. This release skew increases operating expenses as a large portion of engineering may be spent supporting problems across a very large number of releases.
SaaS companies recognize that release skew increases the cost of development, and as such, force most customers into a single (or no more than 3) releases. Companies that fail in this principle face on-premise-like operating margins as engineering budgets increase to support releases.
Customization Equals Revenue versus Customization Equals Cost of Operations
On premise companies see customization as a valuable revenue stream, often at the expense of increased engineering budgets to support the maintenance of these customizations and the distractions they cause.
SaaS companies abhor customization and favor configuration, leading to higher quality products and better overall customer experiences.
Features First versus “-ilities” First
While all of the XaaS principles are important, one of the most difficult to grasp for our on-premise clients is the notion that the “-ilities” (sometimes called NFRs, or Non-Functional Requirements, in waterfall development) like availability, scalability, reliability, etc., are now the most important feature within their products. It pains me to even call them a feature, as they are the “table stakes” (the ante, the price we pay) for the game in which we participate.
License/on-premise companies can punt on the “-ilities” because the customer is accountable for running the solution and bears a good portion of the risk. Availability impacts like hardware failures are ultimately the responsibility of the customer. The decision to pay for high availability represented through infrastructure is also the responsibility of the customer.
These “-ilities” steal time from engineering teams – and appropriately so – to ensure the quality of the service provided. Feature production is still important – it’s just second in line behind ensuring that the solution is available for value creation.
Developers Protected versus Developers Front-Lines
Because feature production is highly correlated with revenue in the licensed software model, companies often protect developer time with multiple layers of support. When customers have outages, companies attempt to resolve them with lower cost support personnel before stealing development time from critical projects. Licensed product companies can get away with this because when there is a failure at a customer site a comparatively small portion of revenue and customer base is at risk.
In the XaaS model, multiple customers are likely impacted with any outage due to increased tenancy. And because we believe that availability is one of the most important features, we must respond with the resources necessary to resolve an incident as quickly as possible.
This is yet another sticking point with companies making the SaaS transition: sometimes significant portions of engineering organizations do not want to be on call. Those people simply have no role in a utility company-like model where “dial tone” is essential.
Part 2 describes a set of SaaS principles derived from the conflict inherent in a migration from licensed products to the XaaS model.
The move from on-premise, licensed software revenue models to XaaS models is difficult. Many of the principles that make an on-premise company successful are diametrically opposed to success in SaaS. The longer a company must operate in both models, the more likely the culture, mindset and fabric of the company will be stretched. Many successful companies decide to operate the two businesses separately, to ensure that one culture is not influenced negatively by the other.
AKF Partners has aided a number of companies in their journey from on-premise to SaaS. Read more about our SaaS migration services.
June 18, 2018 | Posted By: Pete Ferguson
In my short tenure at AKF, I have found the topic of stored procedures (SPROCs) to be provocatively polarizing. When we conduct a technical due diligence of a fairly new startup for an investment firm and ask if they use stored procedures in their database, we often get a puzzled look - as though we just accused them of dating their sister - and a resounding “NO!”
However, when conducting assessments of companies that have been around a while and are struggling to scale quickly, move to a SaaS model, and/or migrate from hosted servers to the cloud, we find “server huggers” who love to keep their stored procedures on the database.
At two different clients earlier this year, we found thousands of stored procedures in their databases. What was once seen as a time-saving efficiency is now one of several major obstacles to SaaS and cloud migration.
In our book, Scalability Rules: Principles for Scaling Web Sites, Marty outlines many reasons why stored procedures should not be kept in the database. Here are the top eight:
- Cost: Databases tend to be among the most expensive systems or services within a system architecture, so each additional SPROC increases transaction cost on the costliest tier. Increasing the cost of scale by running business logic synchronously on the database for every transaction - while also reducing the availability of the product platform by adding yet another system in series - doesn’t make good business sense.
- Creates a Monolith: SPROCs on a database create a monolithic system that cannot be easily scaled.
- Limits Scalability: The database is a governor of scale; SPROCs steal capacity by running non-relational work on the database.
- Limits Automated Testing: SPROCs limit the automation of code testing (in many cases it is not as easy to test stored procedures as it is the other code that developers write), slowing time to market and increasing cost while decreasing quality.
- Creates Lock-in: Changing to an open-source or NoSQL solution requires a plan to migrate SPROCs or replace their logic in the application. It also makes it more difficult to switch to new and compelling technologies, negotiate better pricing, etc.
- Adds Unneeded Complexity to Shard Databases: Using SPROCs and business logic on the database makes sharding and replacement of the underlying database much more challenging.
- Limits Speed To The Weakest Link: Systems should scale independently, relative to their individual needs. When business logic is tied to the database, each must scale at the same rate as the systems making requests of it - which means growth is tied to the slowest system.
- Limits Team Composition Flexibility: By separating product and business intelligence in your platform, you can also separate the teams that build and support those systems. If a product team must understand how their changes impact all related business intelligence systems, their pace of innovation slows, as the scope of implementing and testing product changes and enhancements broadens significantly.
Per the AKF Scale Cube, we want to separate dissimilar services - keeping stored procedures on the database means it cannot be split easily.
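To illustrate the alternative, here is a minimal sketch of logic that might once have lived in a stored procedure - an order-total calculation - moved into the application tier; SQLite, the schema, and the tax rule are assumptions purely for illustration:

```python
import sqlite3

# In-memory database standing in for the product's relational store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_id INT, price REAL, qty INT)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(1, 10.0, 2), (1, 5.0, 1), (2, 7.5, 4)],
)

def order_total(order_id: int, tax_rate: float = 0.08) -> float:
    """Business logic that might once have been a SPROC, now living in the
    application tier where it can be unit tested, versioned, and scaled
    independently of the database."""
    # Keep the database query purely relational...
    rows = conn.execute(
        "SELECT price, qty FROM order_items WHERE order_id = ?", (order_id,)
    ).fetchall()
    # ...and apply the business rule (tax, here) in the application.
    subtotal = sum(price * qty for price, qty in rows)
    return round(subtotal * (1 + tax_rate), 2)
```

Because the logic is ordinary application code, it can be covered by the same automated test suite as the rest of the product, and the database does only relational work.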
Need help migrating from hosted hardware to the cloud or migrating your installed software to a SaaS solution? We have helped hundreds of companies, from small startups to well-established Fortune 50 companies, better architect, scale, and deliver their products. We offer a host of services, from technical due diligence reviews and onsite workshops to mentoring and interim staffing for your company.
June 11, 2018 | Posted By: Marty Abbott
Of the many SaaS operating principles, perhaps one of the most misunderstood is the principle of tenancy.
Most people have a definition in their mind for the term “multi-tenant”. Unfortunately, because the term has so many valid interpretations, its usage can sometimes be confusing. Does multi-tenant refer to the physical or logical implementation of our product? What does multi-tenant mean when it comes to an implementation in a database?
This article first covers the goals of increasing tenancy within solutions, then delves into the various meanings of tenancy.
Multi-Tenant (Multitenant) Solutions and Cost
One of the primary reasons companies that present products as a service strive for higher levels of tenancy is the cost reduction it affords in presenting that service. With multiple customers sharing applications and infrastructure, system utilization goes up: we get more value production out of each server we use - or, alternatively, greater asset utilization. Because most companies view the cost of serving customers as a “Cost of Goods Sold”, multitenant solutions have better gross margins than single-tenant solutions. The X axis of the figure below shows the effect of increasing tenancy on the cost of goods sold on a per-customer basis:
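As a simple, hypothetical illustration of the math (the dollar figures are invented):

```python
# Hypothetical: a server costs $1,000 per month to run. As the number of
# tenants sharing that server grows, the infrastructure portion of cost
# of goods sold (COGS) per customer falls proportionally.
SERVER_COST_PER_MONTH = 1000.0

def cogs_per_customer(tenants_per_server: int) -> float:
    """Monthly infrastructure COGS attributed to each customer."""
    return SERVER_COST_PER_MONTH / tenants_per_server

single_tenant = cogs_per_customer(1)   # one customer bears the full cost
multi_tenant = cogs_per_customer(50)   # fifty tenants share the same server
```

At one tenant per server, each customer carries the full $1,000; at fifty tenants, each carries $20 - the same asset produces far more value.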
Interestingly, multitenant solutions often “force” another SaaS principle to be true: no more than 1 to 3 versions of software for the entire customer base. This is especially true if the database is shared at a logical (row-level) basis (more on that later). Lowering the number of versions of the product decreases the operating expense necessary to maintain multiple versions and therefore also increases operating margins.
Single Tenant, Multi-Tenant and All-Tenant
An important point to keep in mind is that “tenancy” occurs along a spectrum from single-tenant to all-tenant. Multitenant is any solution where the number of tenants, from a physical or logical perspective, is greater than one - including all-tenant implementations. As tenancy increases, cost of goods sold (COGS in the figure above) decreases and gross margins increase.
The problem with all-tenant solutions, while attractive from a cost perspective, is that they create a single failure domain (see our article on fault isolation at akfpartners.com/growth-blog/fault-isolation), thereby decreasing overall availability. When something goes poorly with our product, everything is offline. For that reason, we differentiate between solutions that enable multi-tenancy for cost reasons and all-tenant solutions.
The Many Meanings and Implementations of Tenancy
Multitenant solutions can be implemented in many ways and at many tiers.
Physical and Logical
Physical multi-tenancy is having multiple customers share a number of servers. This helps increase the overall utilization of these servers and therefore reduce costs of goods sold. Customers need not share the application for a solution to be physically multitenant. One could, for instance, run a webserver, application server or database per customer. Many customers with concerns over data separation and privacy are fine with physical multitenancy as long as their data is logically separated.
Logical multi-tenancy is having multiple customers share the same application: the same webserver instances, application server instances and database are used for every customer. The situation becomes a bit murkier, however, when it comes to databases.
Different relational databases use different terms for similar implementations. A SQL Server database, for instance, looks very much like an Oracle schema. Within databases, a solution can be logically multitenant either by implementing tenancy in a table (we call that row-level multitenancy) or within a schema/database (we call that schema multitenancy). In either case, a single instance of the relational database management system (RDBMS) is used, while customer transactions are separated by a customer id within a table, or by database/schema id if separated as such.
While physical multitenancy provides cost benefits, logical multitenancy often provides significantly greater ones. Because applications are shared, we need less system overhead to run an application for each customer, and thus can get even greater throughput and efficiency out of our physical or virtualized servers.
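Row-level multitenancy can be sketched as follows - SQLite, the schema, and the data are illustrative assumptions; the key point is that every query is scoped by customer id:

```python
import sqlite3

# All tenants share one table; rows are separated by customer_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (customer_id INT, amount REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(101, 50.0), (101, 25.0), (202, 99.0)],
)

def invoices_for(customer_id: int) -> list:
    """Return invoice amounts for a single tenant. Scoping every query by
    customer_id is what keeps tenants logically separated within the
    shared schema."""
    rows = conn.execute(
        "SELECT amount FROM invoices WHERE customer_id = ?", (customer_id,)
    ).fetchall()
    return sorted(amount for (amount,) in rows)
```

Schema multitenancy follows the same idea one level up: instead of a customer_id column, each tenant gets its own schema or database within a single RDBMS instance.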
Depth of Multi-Tenancy
The diagram below helps to illustrate that every layer in our service architecture has an impact to multi-tenancy. We can be physically or logically multi-tenant at the network layer, the web server layer, the application layer and the persistence or database layer.
The deeper into the stack our tenancy goes, the greater the beneficial impact (cost savings) on cost of goods sold and the higher our gross margins.
The AKF Multi-Tenant Cube
To further the understanding of tenancy, we introduce the AKF Multi-Tenant Cube.
The X axis describes the “mode” of tenancy, moving from shared nothing, to physical, to logical. As we progress from sharing nothing to sharing everything, utilization goes up and cost of goods sold goes down.
The Y axis describes the depth of tenancy from shared nothing, through network, web, app and finally persistence or database tier. Again, as the depth of tenancy increases, so do Gross Margins.
The Z axis describes the degree of tenancy, or the number of tenants. Higher levels of tenancy decrease costs of goods sold, but architecturally we never want a failure domain that encompasses all tenants.
When running a XaaS (SaaS, etc) business, we are best off implementing logical multitenancy through every layer of our architecture. While we want tenancy to be high per instance, we also do not want all tenants to be in a single implementation.
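One illustrative way to honor that last point is to spread tenants deterministically across several independent pods so that no single failure domain contains every tenant. The hashing sketch below is our own example, not a prescription; real placement often also weighs tenant size and data residency.

```python
import hashlib

def assign_pod(tenant_id: str, pod_count: int) -> int:
    """Deterministically map a tenant to one of several independent
    pods, so no single failure domain encompasses all tenants."""
    digest = hashlib.sha256(tenant_id.encode()).hexdigest()
    return int(digest, 16) % pod_count

# With 4 pods, a pod outage affects roughly a quarter of tenants,
# not the whole customer base.
pods = {}
for tenant in ["acme", "globex", "initech", "umbrella", "stark"]:
    pods.setdefault(assign_pod(tenant, 4), []).append(tenant)
print(pods)
```

Each pod is itself logically multi-tenant, so tenancy stays high per instance while the blast radius of any one failure stays bounded.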
AKF Partners helps companies of all sizes achieve their availability, time to market, scalability, cost and business goals.
June 6, 2018 | Posted By: Greg Fennewald
Technology Due Diligence Checklists
What should one examine during a technical due diligence or during a technology due diligence? Are there industry best practices that should be used as a guide? Are there any particular items to request in advance? The answers to these questions are both simple and complex. Yes, there are industry best practices that can be used as a guide. Where to find them and their inherent quality is a more complex question.
AKF Partners has conducted countless diligence sessions over more than a decade, and we’ve developed lists of information to request in advance and items to evaluate during the due diligence session. These engagements range from very small pre-product seed-round companies to very large investments in companies with revenue greater than $2Bn ARR. We share those lists with you below with a note of caution: a technical due diligence is far more than a list of “yes” or “no” answers; evaluating the responses into a comprehensive measure of technical merit and risk is the secret sauce. The following presents a high-level list of technology due diligence questions and requests that can be used as a technology due diligence checklist or template.
10 Things to Request in Advance
1. Architectural overview of the platform and software.
2. Organizational chart of the technology team, including headcount, responsibilities, and any outsourced development or support.
3. Technology team budget and actuals for the previous 3 years. Ideally these should be separated by Gross Margin activities and Operating Margin activities.
4. 3rd-party audit reports – vulnerability scans, penetration tests, PCI, HIPAA, ISO certification reports.
5. 3rd-party software in use, including open source software.
6. Current product and technology roadmaps, previous 2 years of major enhancements and releases.
7. Hosting plan – data centers, colocation, and cloud plus disaster recovery and business continuity plans.
8. Previous 3 years of material technology failures.
9. Previous 5 years of security breaches/incidents/investigations.
10. Description of the software development and delivery process.
Areas to Cover During the Due Diligence
The below represent a necessary, but often insufficient, set of technical due diligence questions. These questions are intended to jump-start a conversation leading to deeper levels of understanding of a service offering and the maintenance/operations of that service offering. Feel free to use them to start your own diligence program or to augment your existing program. But be careful – having a checklist does not make one a successful pilot.
Are incoming requests load balanced? If so, how? Is any persistence required for the purposes of session affinity? If so, why?
Are the sessions being stored? If yes, where? How? Why?
Are services separated across different servers? For what purpose?
Are services sized appropriately? What are the bounds and constraints?
Are databases dedicated to a subset of services/data?
Are users segmented into independent pods?
Do services communicate asynchronously?
Do services have dedicated databases? Is any database shared between services?
Are single points of failure eliminated?
Is infrastructure designed for graceful failure? Is software designed for graceful failure? What is the evidence of both?
Is N+1 an architectural requirement at all levels?
Are active-active or multiple AZs utilized? Tested?
Are data centers located in low risk areas?
To what does the company design in terms of both RPO and RTO?
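“Graceful failure” in the questions above usually shows up in practice as timeouts, fallbacks, and circuit breakers. As an illustrative (not production-grade) sketch, a minimal circuit breaker might look like this; the thresholds and fallback behavior here are assumptions for the example:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after max_failures consecutive errors,
    calls are short-circuited to a fallback for reset_after seconds,
    letting a dependency fail gracefully instead of cascading."""
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_after:
                return fallback()      # circuit open: degrade gracefully
            self.opened_at = None      # half-open: try the dependency again
            self.failures = 0
        try:
            result = fn()
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.time()
            return fallback()

breaker = CircuitBreaker(max_failures=2)

def flaky():
    raise RuntimeError("dependency down")

for _ in range(4):
    print(breaker.call(flaky, lambda: "cached response"))
```

Evidence of this kind of design, in both infrastructure and software, is exactly what the diligence questions above are probing for.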
Session and State
Are the solutions completely stateless?
Where, if any place is session stored and why?
Is session required to be replicated between services?
Is session stored in browsers?
Is session stored in databases?
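One common answer to the session questions above is to keep app servers stateless and push session into a shared store, so any instance can serve any request. The sketch below uses an in-process dict as a stand-in for a store such as Redis or Memcached; the token format and TTL are illustrative assumptions.

```python
import secrets
import time

# A dict stands in for an external store such as Redis or Memcached;
# in production the store would be shared by every app instance.
session_store = {}

def create_session(user_id: str, ttl_seconds: int = 1800) -> str:
    """Issue an opaque token; the app servers themselves stay stateless."""
    token = secrets.token_hex(16)
    session_store[token] = {"user_id": user_id,
                            "expires": time.time() + ttl_seconds}
    return token

def get_session(token: str):
    """Any app instance can resolve the token - no server affinity needed."""
    entry = session_store.get(token)
    if entry is None or entry["expires"] < time.time():
        session_store.pop(token, None)
        return None
    return entry

token = create_session("user-42")
print(get_session(token)["user_id"])  # user-42
```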
Is auto-scaling enabled?
Is reliance on costly 3rd party software mitigated?
Are stored procedures eliminated in the architecture?
Are servers distributed across different physical or virtual tiers?
Can cloud providers be easily switched?
Is the amount of technical debt quantified?
Is only necessary traffic routed through the firewall?
Are data centers located in low cost areas?
Can the product management team make decisions to alter features?
Are business goals owned by those who enact them?
Are success metrics used to determine when a goal is reached?
Is velocity measured?
Are coding standards documented and applied?
Are unit tests required?
Are feature flags standard?
Is continuous integration used?
Is continuous deployment utilized?
Are release payloads small and frequent rather than large and infrequent?
Can the product be rolled back if issues arise?
Is automated testing coverage greater than 75%?
Are changes being logged and made readily available to engineers?
Is load and performance testing being conducted prior to release?
Are incidents logged with enough detail to ascertain potential problems?
Are alerts sent in real time?
Are systems designed for monitoring?
Are user behaviors (logins, downloads, checkouts) used to create business metric monitors?
Is remaining infrastructure headroom known?
Are post mortems conducted and fed back into the system?
Are teams seeded, fed and weeded?
Are teams aligned to the services or products they create?
Are teams cross-functional?
Are team goals aligned to top-level business goals?
Do teams sit together?
Are there approved and published security policies?
Are security responsibilities clearly defined?
Does the organization adhere to legal/regulatory requirements as necessary (PCI, HIPAA, SOX, etc)?
Has an inventory of all data assets been conducted and maintained?
Is multi-factor authentication in place?
Are vulnerability scans conducted?
Is a security risk matrix maintained?
Is production system access role based and logged?
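That last question – role-based, logged production access – can be sketched as follows. The roles, permissions, and names here are purely illustrative; a real system would back the mapping with a directory service and a tamper-evident audit trail.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")

# Illustrative role-to-permission map (assumed for this example).
ROLE_PERMISSIONS = {
    "developer": {"read_logs"},
    "sre": {"read_logs", "restart_service"},
    "dba": {"read_logs", "run_migrations"},
}

def authorize(user: str, role: str, action: str) -> bool:
    """Role-based check that logs every production-access decision."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    logging.info("access user=%s role=%s action=%s allowed=%s",
                 user, role, action, allowed)
    return allowed

print(authorize("pat", "sre", "restart_service"))       # True
print(authorize("pat", "developer", "restart_service"))  # False
```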
AKF has conducted countless due diligence engagements over the last decade. We can take a checklist and add to it context and real world experience to create a finished product that captures the technical risks and merits of a company, improving the quality of your diligence process.
June 6, 2018 | Posted By: Pete Ferguson
Crossing the People Chasm Within Your Organization
In Geoffrey Moore’s book “Crossing the Chasm,” he argues there is a chasm between the early adopters of a product (the technology enthusiasts and visionaries) and the early majority (the pragmatists). He illustrates well the differences in each group’s self-interests and their very different needs for security versus their willingness to take on risk.
People’s talents, attitudes, and skills similarly must cross the rapid growth chasm within your organization if your company is to remain viable and competitive.
As AKF Partners assesses fast-growing companies in technical due diligence engagements, we often observe Moore’s chasm principle in play with an organization’s people and the ability of legacy employees to make the jump to the “next big thing” and keep up with explosive growth. Conversely, we have also seen the “why change” attitude greatly hinder the scalability of a fledgling company.
The Chasm From Startup to Established Company
Young, well-funded startups have a lot of flair that Millennials and corporate escapees love – free food, eccentric workplaces, schedule flexibility, and very little bureaucracy, policy, or procedure. This works very well for small, talented teams during a very scrappy period of rapid growth, where the common goals of the organization are well known and lived and breathed on a daily basis, and personal and group conversations with the CEO and CTO occur regularly, sometimes daily. Often there is minimal rigor around Agile rituals – during periods of startup and rapid growth, there is likely very little time to formalize processes, and the outcomes – 100-200% growth in customer acquisition and profits – can be mistaken for a “full steam ahead” mandate to not make any changes.
Recently we worked with a company that was a decade old and fairly large compared to many startups we see in our technical due diligences for investors. The founders had seen the need to bring in experienced and open-minded senior leadership, and it was inspiring to see the vigor, enthusiasm, growth, and speed of a young fledgling company, but with defined metrics and compliance to set ground rules.
There was not an observed bureaucracy. There was clear direction.
Unfortunately, this is more of an outlier than it should be. I was impressed and wanted to know what set them apart from other, more mature companies I have worked with or for, and I found several differentiators.
My observations of companies who bridge the organizational chasm of growth:
- Successful companies do not confine themselves to one segment of the market - they are thoughtful and disciplined when taking on new segments. They follow Moore’s observations well and saturate one small subset of a new market with marketing, sales, customer service and provide steep discounts to get a foothold. Once established, they expand horizontally within the subset and rinse and repeat until they are the market leader. This allows them to fail forward fast through constant innovation and iterations. This requires the people in their organizations to have an Agile mindset and not rest on their laurels.
- Successful companies have teams with a good diversity of opinions, but are unified in how they execute on their plan. The senior leadership are very successful in constantly communicating the vision of the company through desired outcomes and allow teams the autonomy to get there however they can. Because the focus is on the outcome rather than the process, there is very little bureaucratic red tape, trust is very high, and teams are not afraid to fail fast, learn, and reiterate more successfully.
- Successful companies keep things simple and team members are onboard with the company philosophy and understand how their role fits into the larger scheme of things. Google pioneered OKRs - “Objectives and Key Results” - as how they measure success within their organization. OKRs allow for nested outcomes to be defined, aligning teams with the broader company goal and successful companies have the common thread of how success is defined through outcomes from the top to the bottom of the org chart.
While some of the companies highlighted in the 1999 edition of Moore’s book eventually could not cross the chasm with newer products (BlackBerry, 3Com/Palm), the principles he outlined are common to companies that are enduring today (Apple: desktop to MP3 player to mobile phone/tablet to watch to … [insert Apple’s next category DOMINATOR here]).
When looking at products, according to Moore, the marketer should focus on one group of customers at a time, using each group as a base for marketing to the next group to create a bandwagon effect with momentum that spreads to others in the next marketing segment. The focus on each segment is intense and an “all hands on deck” blitz approach to include marketing, software engineering, product, customer service, sales, and others.
Similarly when it comes to what is going on inside of organizations, it is important to ensure your people cross the chasm of change required for your products to remain viable and enduring. Successful companies know that either their people have to make the transition to new skill sets/mindsets or they will need to be transitioned out of the company. Either way it’s important to inject new people with the experience needed into the organization. At AKF, we refer to this as Seed, Feed, and Weed.
What we see in successful companies is an early focus on standardization, but with freedom for exploration. Allowing each team to use its own communication tools (Slack, Hive, Spark) and Agile methods is something that does not scale well. But seeking input from team members and having each team follow a standard software development cycle with a similar Agile methodology does scale well and allows teams to interoperate without administrative and communicative friction.
Successful companies endure because individuals are allowed autonomy to reach shared outcomes. Tools are provided to help individuals succeed, fail forward fast, learn and share their learning, automate mundane tasks, and are not a bureaucratic bottleneck. To remain successful, companies must constantly focus on how to take their team members with them through the chasms of growth into new and emerging markets by continually upgrading their skills and contribution to the company desired outcomes.
Measuring success is not just the stock price (many failing companies – e.g., Palm – had a good stock price while internal decay had been going on for several quarters); it must be a thorough measurement of all aspects of the company’s technical abilities – architecture, process, organization, and security.
We’d love your feedback and an opportunity to see how AKF Partners can help your organization maximize outcomes. Contact us now.
June 5, 2018 | Posted By: Marty Abbott
Enough is enough already – stop using the term “Cloud”. Somewhere during the last 20 years, the term “Cloud” started to mean to product offerings what Sriracha and Tabasco mean to food: everything’s better if we can just find a way to incorporate it. Just as Sriracha makes barely edible items palatable and further enhances the flavor of delicacies, so evidently does “Cloud” confuse the unsophisticated buyer or investor and enhance the value for more sophisticated buyers and investors. That’s a nice analogy, but it’s also bullshit.
The term cloud just means too many things – some of which are shown below:
The real world of cloud offerings can be roughly separated into two groups:
The first group of companies – the pretenders – know, at some level, that they haven’t turned the corner and started truly offering “services”. They support heavy customization, are addicted to maintenance revenue streams, and offer low levels of tenancy. These companies simply can’t escape the sins of their past. Instead, they slap the term “cloud” on their product in the hopes of being seen as relevant. At worst, it’s an outright lie. At best, it’s slightly misleading relative to the intended meaning of the term. Unless, of course, anything that’s accessible through a browser is “Cloud”. These companies should stop using the term because deep down, when they are alone with a glass of bourbon, they know they aren’t a “cloud company”.
The second group of companies – the contenders – either blazed a path for the move to service offerings (think rentable instead of purchasable products) or quickly recognized the services revolution; they were “born cloud” or are truly embracing the cloud model. They prefer configuration over customization, and stick to the notion of a small number of releases (countable on one hand) in production across their entire customer base. They embrace physical and logical multi-tenancy both to increase margins and decrease customer costs. These are the companies that pay the tax for the term “cloud” – a tax that funds the welfare checks for the “pretenders”.
The graph below plots Cloud Pretenders, Contenders and Not Cloud products along the axes of gross margin and operating margin:
Consider one type of “Pretender” – the case of a company hosting a single-tenant, client-customized software release for each of their many customers. This is an ASP (Application Service Provider) model. But there is a reason the provider of the service won’t call themselves an ASP: the margins of an ASP stink relative to those of a true “SaaS” company. The term ASP is old and antiquated. The fix? Just pour a bit of “cloud sauce” on it and everything will be fine.
Contrast the above case with that of a “Contender”: physical and logical multi-tenancy at every layer of the architecture and a small number of production releases (one to three) across the entire customer base. Both operating and gross margins increase as maintenance costs and hosting costs decrease when allocated across the entire customer base.
Confused? So are we. Here are a few key takeaways:
- “Cloud” should mean more than just being accessed through the internet via a browser. Unfortunately, it no longer does as anyone who can figure out how to replace their clients with a browser and host their product will call themselves a “Cloud” provider.
- Contenders should stop using the term “Cloud” because it invites comparison with companies to which they are clearly superior: Superior in terms of margins, market needs, architecture and strategic advantage.
- Pretenders should stop using the term “Cloud” for both ethical reasons and reasons related to survivability. Ethically the term is somewhere between an outright lie and an ethically contentious quibble or half-truth. Survivability comes into play when the company believes its own lie and stops seeing a reason to change to become more competitive.
AKF Partners helps companies create “Cloud” (XaaS) transition plans to transform their business. We help with financial models, product approach, market fit, product and technology architecture, business strategy and help companies ensure they organize properly to maximize their opportunity in XaaS.
May 20, 2018 | Posted By: Dave Berardi
Physical Bare Metal, Virtualization, Cloud Compute, Containers, and now Serverless in your SaaS? We are starting to hear more and more about Serverless computing, sometimes called Function as a Service. In this next iteration of Infrastructure-as-a-Service, users can execute a task or function without having to provision a server, virtual machine, or any other underlying resource. The word Serverless is a misnomer: provisioning of the underlying resources is abstracted away from the user, but they still exist under the covers – it’s just that Amazon, Microsoft, and Google manage them for you with their code. AWS Lambda, Azure Functions, and Google Cloud Functions are becoming more common in the architecture of a SaaS product. As technology leaders responsible for architectural decisions for scale and availability, we must understand the pros and cons of serverless and take the right actions in applying it.
Several advantages of serverless computing include:
• Software engineers can deploy and run code without having to manage any underlying infrastructure, effectively creating a No-Ops environment.
• Auto-scaling is easier and requires less orchestration as compared to a containerized environment running services.
• True On-Demand capacity – no orphaned containers or other resources that might be idling.
• They are cost effective IF we are running right-sized workloads.
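For illustration, here is a minimal AWS Lambda-style function in Python. The `handler(event, context)` signature follows Lambda’s Python convention; the event shape and response format here are our own assumptions for the example.

```python
# A minimal AWS Lambda-style handler: the platform provisions the
# runtime, invokes the function per event, and scales it on demand.
def handler(event, context=None):
    """Pure function of its input - no server or long-lived state
    for the engineer to manage."""
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}!"}

# Locally we can call it directly; in AWS the event would arrive from
# a trigger such as API Gateway, S3, or SQS.
print(handler({"name": "AKF"}))
```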
Disadvantages and potential landmines to watch out for:
• Landmine #1 - No control over the execution environment, meaning you are unable to isolate your operational environment. Compute and networking resources are virtualized with no visibility into either of them. Availability is in the hands of your cloud provider, and uptime is not guaranteed.
• Landmine #2 - SLAs cannot guarantee uptime. Start-up (cold start) time can take a second or more, causing latency that might not be acceptable.
• Landmine #3 - It’s going to become much easier for engineers to create code, host it rapidly, and forget about it, leading to unnecessary compute costs and additional attack vectors that create security risk.
• Landmine #4 - You will create vendor lock-in with your cloud provider as you set up your event driven functions to trigger from other AWS or Azure Services or your own services running on compute instances.
AKF is often asked about our position on serverless computing. Considering the advantages and the landmines outlined above, we offer four key rules:
1) Gradually introduce it into your architecture and use it for the right use cases
2) Establish architectural principles that guide its use in your organization and minimize availability impact – remember that you are tying your availability to your cloud provider’s FaaS platform.
3) Watch out for a false sense of security among your engineering teams. Understand how serverless works before you use it so that you can monitor it for performance and availability.
4) Manage how and what it’s used for - monitor it (eg. AWS Cloud Watch) to avoid neglect and misuse along with cost inefficiencies.
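Rule 4 can start as simply as instrumenting every function. The timing decorator below is an illustrative stand-in for CloudWatch-style metrics, not a replacement for them; the function and event fields are hypothetical.

```python
import functools
import time

def monitored(fn):
    """Wrap a function with timing output so usage and cost drivers stay
    visible (a local stand-in for CloudWatch-style metrics)."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            print(f"{fn.__name__} took {elapsed_ms:.1f} ms")
    return wrapper

@monitored
def resize_image(event):
    # Hypothetical serverless task body.
    return {"resized": True, "key": event["key"]}

result = resize_image({"key": "photos/1.jpg"})
```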
AWS, Azure, or Google Cloud Serverless platforms can provide an effective computing abstraction in your architecture if they are used for the right use cases, good monitoring is in place, and architectural principles are established.
AKF Partners has helped many companies create highly available and scalable systems that are designed to be monitored. Contact us for a free consultation.
May 13, 2018 | Posted By: Dave Swenson
Meetings, meetings, meetings. How many times have we said that? Visiting dozens and dozens of clients per year, we see a number of customers whose culture seems to be extremely meeting-centric, as if the only way any decision can be made or information communicated is via a meeting.
Paul Graham, co-founder of Y Combinator and Hacker News, wrote back in 2009 of the impact of meetings upon engineers. Coding typically is best performed in multi-hour solid chunks of time, with no interruptions. It takes a while to get into the ‘zone’, and any context switch will disrupt that zone – in Graham’s words, “like throwing an exception”. He even suggests that the impact of a meeting goes far beyond the actual time spent in the meeting: simply knowing you are going to be interrupted prevents you from reaching that zone – something like when you know you have to get up early in the morning, say for a flight, and you toss and turn all night long, unable to get into that deep REM state.
Many companies recognize the disruptive impact of meetings and put rules in place designating ‘no meeting’ afternoons, or perhaps a full day. Pinterest’s recent blog post recounts their somewhat extreme move along these lines – putting a three-day no-meeting block in place for engineers: engineers were not to be invited to meetings three days a week. The blog post is worth a read, covering some of the challenges and objections to eliminating engineer-attended meetings three days a week, but overall it touts the success of the approach, citing a 92% positive response rate to a survey question asking “Are you more productive…?”.
Really, Pinterest? Really??
I’m all for the reduction of meetings, though I do wonder if three days a week with no meetings is a bit overboard. What I’m disappointed by is that Pinterest has no (or at least did not cite any) quantifiable evidence that their engineers were actually more productive. Now, I’m not suggesting they should have a before and after count of, say, lines of code. But, assuming that Pinterest is at least something of an Agile shop, did they not see an increase in velocity, in story points being delivered?
In our visits to our clients, just as we see a wide variation in the dependency upon meetings to get anything done, we see some clients living and breathing by their team-by-team velocity numbers, while other clients totally disregard that key productivity metric. To you technology leaders out there, how better can you measure your teams’ efficiencies?
And, even more so, do you know why your teams’ current velocity is what it is? Are you actively seeking out the context switches, delays, and disruptions that are throwing exceptions in your engineers’ brains?
We’ve been pulled in many times to analyze a team’s efficiency (or lack thereof), only to find out that, yes, meetings are a negative influence, but beyond that:
- Interviews (worthwhile, but hiring should be a highly optimized process)
- Environmental issues (are you measuring your dev environments’ availability?)
- Waiting for pull request approvals (do you have an SLA around this?)
- Long build times due to weak hardware or poor dependency management (compare the cost of faster build machines or build optimization vs. the value of your engineers’ wait or down-time)
- Waiting to receive clarification from a product owner on a feature (again, do you have an SLA around this? Is your team colocated, so a question can be asked/answered quickly?)
- Other surprising items ranging from having to feed a parking meter to miserable network latency for those remote engineers.
Yet again, the mantra of “If you can’t measure it, you can’t improve it” applies. We view metrics such as actual hours spent coding vs. expected hours spent coding as not only a measurement of your teams’ productivity, but as a management effectiveness gauge. Are you as a manager effectively protecting your engineers?
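One way to quantify the meeting tax Graham describes is to count how many multi-hour focus blocks survive a day’s calendar. The sketch below assumes a simple list of meeting intervals and a two-hour threshold for deep work; both are illustrative choices, not a standard.

```python
from datetime import datetime, timedelta

def focus_blocks(day_start, day_end, meetings, min_block=timedelta(hours=2)):
    """Return the gaps between meetings long enough for deep work.
    Coding needs multi-hour uninterrupted chunks; short gaps are lost."""
    blocks, cursor = [], day_start
    for start, end in sorted(meetings):
        if start - cursor >= min_block:
            blocks.append((cursor, start))
        cursor = max(cursor, end)
    if day_end - cursor >= min_block:
        blocks.append((cursor, day_end))
    return blocks

day = datetime(2018, 5, 14)
meetings = [(day.replace(hour=10), day.replace(hour=11)),
            (day.replace(hour=14), day.replace(hour=15))]
blocks = focus_blocks(day.replace(hour=9), day.replace(hour=18), meetings)
for s, e in blocks:
    print(s.strftime("%H:%M"), "-", e.strftime("%H:%M"))
```

Two one-hour meetings here leave only two usable blocks out of a nine-hour day; aggregating this per engineer per week makes the cost of calendar fragmentation visible.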
Are you able to see the impact of ‘no-meeting’ days, or the factors today that negatively affect your developers’ coding efficiencies?
If not, AKF can more than help. We have run productivity surveys at many clients, and always enjoy the look on technology leaders’ faces when we present the results. Let us help you.