Are you compromised?
September 14, 2018 | Posted By: Larry Steinberg
It’s important to acknowledge that a core competency of hackers is hiding their tracks and remaining dormant for long periods after they’ve infiltrated an environment. They may also be using exploits you have not protected against. Given all of this, how do you know that you are not currently compromised? Hackers are skilled hidden operators with many ‘customers’ to prey on. They will focus on a victim or two at a time, then shut down activities and move on to another unsuspecting target. It’s in their best interest to keep a low profile, and you might not know that they are operating (or waiting) in your environment with access to your key resources.
Most international hackers are well organized, well educated, and have development skills that most engineering managers would admire if not for the malevolent subject matter. Rarely are these hacks performed by bots; most are carried out by humans who set up a chain of software elements across unsuspecting entities to enable inbound and outbound access.
What can you do? To start, don’t get complacent with your security. Even if you have never been compromised - or have been and eradicated everything you know about - you’ll never know for sure that you aren’t compromised right now. As a practice, it’s best to assume that you are: look for evidence of intrusion while also identifying ways to keep attackers out. Hacking is dynamic and threats are constantly evolving.
There are standard frameworks of good security practice to follow - the NIST Cybersecurity Framework and the OWASP Top 10. Further, for your highest-value environments, here are some questions you should consider: Would you know if these systems had configuration changes? Would you be aware of unexpected processes running? If you have interesting information in your operating or IT environment and the bad guys get in, it’s of no value to them unless they get that information back out of the environment - so where is your traffic going? Can you model expected outbound traffic and monitor it? The answer should be yes. Then you can look for abnormalities and even correlate this traffic with other activities in your environment.
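To make the outbound-traffic question concrete, here is a minimal sketch of flagging abnormalities against a modeled baseline. The function name, sample byte counts, and the three-sigma threshold are illustrative assumptions; a real deployment would work from flow logs and richer models:

```python
from statistics import mean, stdev

def find_outbound_anomalies(baseline_bytes, observed_bytes, threshold=3.0):
    """Flag observations deviating from the baseline by more than
    `threshold` standard deviations (a simple z-score test)."""
    mu = mean(baseline_bytes)
    sigma = stdev(baseline_bytes)
    return [
        (i, b) for i, b in enumerate(observed_bytes)
        if sigma > 0 and abs(b - mu) / sigma > threshold
    ]

# Hourly outbound bytes from a host: a stable baseline week...
baseline = [980, 1010, 995, 1005, 990, 1000, 1015, 985]
# ...then a sudden large transfer that merits investigation.
observed = [1002, 998, 250000, 1003]

print(find_outbound_anomalies(baseline, observed))  # [(2, 250000)]
```

Anomalies flagged this way can then be correlated with other activity (deploys, configuration changes, new processes) before deciding whether they represent exfiltration.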
Just as you and your business are constantly evolving to serve your customers and attract new ones, the bad guys are evolving their practices too. Some of their approaches are rudimentary because we allow them to be; when we buckle down, they have to get more innovative. Ensure that you are constantly identifying all the entry points into your environment and closing them. Then remain vigilant to new approaches they might take.
Don’t forget the most common attack vector - humans. Continue evolving your training and keep the awareness high within your staff - technical and non-technical alike.
Your default mental model should be that you don’t know what you don’t know. Follow best practices for security and continue to evolve them. Build internal security expertise or engage external experts, and ensure that those skills remain dynamic and expanding. Use recurring testing practices to identify vulnerabilities in your environment and to prepare against emerging attack patterns.
September 10, 2018 | Posted By: Robin McGlothin
The Scalability Cube – Your Guide to Evaluating Scalability
Perhaps the most common question we get at AKF Partners when performing technical due diligence on a company is, “Will this thing scale?” After all, investors want to see a return on their investment in a company, and a common way to achieve that is to grow the number of users on an application or platform. How do they ensure that the technology can support that growth? By evaluating scalability.
Let’s start by defining scalability from the technical perspective. The Wikipedia definition of “scalability” is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth. That definition is accurate when applied to common investment objectives. The question is, what are the key attributes of software that allow it to scale, along with the anti-patterns that prevent scaling? Or, in other words, what do we look for at AKF Partners when determining scalability?
While an exhaustive list is beyond the scope of this blog post, we can use the Scalability Cube and its analytical methodology to quickly determine where an application will experience issues.
AKF Partners introduced the Scale Cube, a scale design model for building resilient application architectures using patterns and practices that apply broadly to any application. It is a best-practices model describing all of the scale dimensions from the book “The Art of Scalability” (AKF Partners - Abbott, Keeven & Fisher).
The “Scale Cube” is composed of an X-axis, Y-axis, and Z-axis:
1. Horizontal Duplication (X-Axis) – Clone services and data so there are no single points of failure. Duplicate everything.
2. Functional Decomposition / Segmentation (Y-Axis) – Componentization into modules and microservices. Split functions such as Report, Message, Locate, Forms, and Calendar into fault-isolated swim lanes.
3. Horizontal Data Partitioning – Shards (Z-Axis) – Partition (“pod”) users or data, beginning for example with pilot users, for scalability and availability.
The Scale Cube helps teams keep critical dimensions of system scale in mind when solutions are designed. Scalability is all about the capability of a design to support ever-growing client traffic without compromising performance. It is important to understand that there are no “silver bullets” in designing scalable solutions.
An architecture is scalable if each layer in the multi-layered architecture is scalable. For example, a well-designed application should be able to scale seamlessly as demand increases and decreases and be resilient enough to withstand the loss of one or more computer resources.
Let’s start by looking at the typical monolithic application: a large system that must be deployed holistically and is difficult to scale. If your application was designed to be stateless, scale is possible by adding more machines, virtual or physical. However, scaling a monolith typically requires ever more powerful machines, which is not cost-effective. Additionally, you carry the added risk of extensive regression testing because you cannot update small components on their own. Instead, we recommend a microservices-based architecture using containers (e.g. Docker) that allows for independent deployment of small pieces and the scaling of individual services instead of one big application.
Monolithic applications have other negative effects, such as development complexity. What is “development complexity”? As more developers are added to the team, beware the effects of Brooks’ Law, which states that adding more software developers to a late project makes the project even later. For example, one large solution loaded in the development environment can slow down a developer, and it gets worse as more developers add components. This causes slower and slower load times on development machines, and developers stomping on each other’s changes (or creating complex merges) as they modify the same files.
Another example of a development-complexity issue is a large, outdated piece of the architecture or database where one person is the expert. That person becomes a bottleneck to changes in a specific part of the system. Worse, they are now a SPOF (single point of failure) if they are the only resource who understands the monolithic beast. The monolith’s complexity and the rate of code change make it hard for any developer to know all the idiosyncrasies of the system, so more defects are introduced. A decoupled system with small components helps prevent this problem.
When validating database design for appropriate scale, there are some key anti-patterns to check. For example:
• Do synchronous database accesses block other connections to the database when retrieving or writing data? This design can end up blocking queries and holding up the application.
• Are queries written efficiently? Large data footprints, with significant locking, can quickly slow database performance to a crawl.
• Is there a heavy reporting function in the application that relies on a single transactional database? Report generation can severely hamper the performance of critical user scenarios. Separating read-only data from read-write data can improve scale.
• Can the data be partitioned across different databases and/or database servers (sharding)? For example, customers in different geographies may be partitioned to servers closer to their locations. In turn, separating out the data allows for enhanced scale since requests can be split out.
• Is the right database technology being used for the problem? Storing BLOBs in a relational database has negative effects – instead, use the right technology for the job, such as a NoSQL document store. Forcing less structured data into a relational database can also lead to waste and performance issues, and here, a NoSQL solution may be more suitable.
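As a sketch of the geography-based sharding idea above - the region names, shard-id ranges, and routing function here are hypothetical illustrations, not a prescribed scheme:

```python
def shard_for(customer_id: int, region: str, shards_per_region: int = 4) -> int:
    """Route a customer to a database shard: geography picks the pool of
    shards, a deterministic modulus on the id picks the shard within it."""
    region_base = {"na": 0, "eu": 100, "apac": 200}  # hypothetical shard-id ranges
    return region_base[region] + customer_id % shards_per_region

print(shard_for(31, "eu"))  # 31 % 4 = 3 -> shard 103
print(shard_for(8, "na"))   # 8 % 4 = 0 -> shard 0
```

Because the routing is deterministic, any application server can compute the target shard without a lookup service, and each regional pool can grow independently as its traffic grows.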
We also look for mixed presentation and business logic. A software anti-pattern prevalent in legacy code is failing to separate the UI code from the underlying logic. This practice makes it impossible to scale individual layers of the application and takes away the ability to easily do A/B testing to validate UI changes. Layer separation allows you to put just enough hardware against each layer, minimizing resource usage and improving overall cost efficiency. Separating business logic from SPROCs (stored procedures) also improves the maintainability and scalability of the system.
Another key area we dig for is stateful application servers. Designing an application that stores state on an individual server is problematic for scalability. For example, if some business logic runs on one server and stores user session information (or other data) in a cache on only one server, all user requests must use that same server instead of a generic machine in a cluster. This prevents adding new machine instances that can field any request that a load balancer passes its way. Caching is a great practice for performance, but it cannot interfere with horizontal scale.
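A minimal sketch of the stateless alternative: session state is externalized to a shared store so any server behind the load balancer can handle any request. The class and handler names are illustrative, and the in-memory dict merely stands in for a real shared cache such as Redis or Memcached:

```python
class SharedSessionStore:
    """Stand-in for an external cache: every app server in the cluster
    can read and write any user's session."""
    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session

def handle_request(server_name, session_id, store):
    """A stateless handler: all session state lives in the shared store,
    so no request is pinned to a particular server."""
    session = store.get(session_id)
    session["hits"] = session.get("hits", 0) + 1
    session["last_server"] = server_name
    store.put(session_id, session)
    return session["hits"]

store = SharedSessionStore()
handle_request("app-1", "sess-42", store)
print(handle_request("app-2", "sess-42", store))  # 2: state survived the server switch
```

Contrast this with caching the session in a single server's memory, which forces "sticky" routing and prevents new instances from fielding arbitrary requests.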
Finally, long-running jobs and/or synchronous dependencies are key areas for scalability issues. Actions on the system that trigger processing times of minutes or more can affect scalability (e.g. execution of a report that requires large amounts of data to generate). Continuing to add machines doesn’t help, as the system can never keep up in the presence of many requests, and blocking operations exacerbate the problem. Look for solutions that queue up long-running requests, execute them in the background, send events when they are complete (asynchronous communication), and do not tie up key application and database servers. Communicating synchronously with dependent systems for long-running requests also affects performance, scale, and reliability. Common solutions for intersystem communication and asynchronous messaging include RabbitMQ and Kafka.
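The queue-and-background-worker pattern described above can be sketched with Python’s standard library alone; the job id, payload, and result format are illustrative assumptions (a production system would use a durable broker such as RabbitMQ or Kafka rather than an in-process queue):

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def worker():
    """Background worker: pulls long-running report jobs off the queue so
    the request handlers can return immediately instead of blocking."""
    while True:
        job_id, params = jobs.get()
        if job_id is None:  # sentinel for shutdown
            break
        # Stand-in for slow work (e.g. generating a large report).
        results[job_id] = f"report for {params} complete"
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# The request handler just enqueues and returns a job id ("202 Accepted" style).
jobs.put(("job-1", "2018-Q3"))
jobs.join()  # in production the client polls or receives a completion event instead
print(results["job-1"])  # report for 2018-Q3 complete
```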
Again, the list above is not exhaustive but outlines some key areas that AKF Partners look for when evaluating an architecture for scalability. If you’re looking for a checklist to help you perform your own diligence, feel free to use ours. If you’re wondering more about our diligence practice, you may be interested in our thoughts on best practices, or our beliefs around diligence and how to get it right. We’ve performed technical diligence for seed rounds, A-series and beyond, carve-outs, strategic investments and taking public companies private. From $5 million invested to over $1 billion. No matter the size of company or size of the investment, we can help.
Open Source Software as a malware “on ramp”
August 21, 2018 | Posted By: Larry Steinberg
Open Source Software (OSS) is an efficient means of rapidly building out solutions with high quality. You utilize crowdsourced design, development, and validation to conveniently speed your engineering. OSS also fosters a sense of sharing and building together - across company boundaries or in one’s free time.
So just pull a library down off the web, build your project, and your company is ready to go. Or is that the best approach? What could be in this library you’re including in your solution which might not be wanted? This code will be running in critical environments - like your SaaS servers, internal systems, or on customer systems. Convenience comes at a price and there are some well known situations of hacks embedded in popular open source libraries.
What is the best approach to getting the benefits of OSS and maintaining the integrity of your solution?
Good practices are a necessity to ensure a high level of security. Just as you test OSS for functionality, scale, and load, you should validate it against vulnerabilities. Pre-production vulnerability and penetration testing is a good start. Also, apply good internal process and reviews. Keep the process simple to maintain speed, but establish internal accountability and vigilance over code entering your environment. You already practice good coding techniques with reviews and/or pair programming - build an equivalent for OSS.
Always use known good repositories and validate the project sponsors. Perform diligence on the committers just as you would for your own employees. You likely perform some type of background check on employees before making an offer - whether through a third party or simply looking them up on LinkedIn and asking around. OSS committers pose the same risk to your company - why not do the same for them? Understandably, you probably wouldn’t do this for a purchased third-party solution, but there your contract or expectation is that the vendor is already doing this and abiding by minimum security standards. That may not be true for your OSS solutions, and as such your responsibility for validation is at least slightly higher. There are plenty of projects from reputable sources that you can rely on.
Ensure that your path to production only includes artifacts built from internal sources that were either developed or reviewed by your team. Also, be intentional about OSS library upgrades: they should be planned and part of the process.
OSS is highly leveraged in today’s software solutions and provides many benefits. Be diligent in your approach to ensure you only see the upside of open source.
The Top 20 Technology Blunders
July 20, 2018 | Posted By: Pete Ferguson
One of the most common questions we get is “What are the most common failures you see tech and product teams make?” To answer that question we queried our database consisting of 11 years of anonymous client recommendations. Here are the top 20 most repeated failures and recommendations:
1) Failing to Design for Rollback
If you are developing a SaaS platform and you can only make one change to your current process, make it so that you can always roll back any of your code changes. Yes, we know that it takes additional engineering work and additional testing to make nearly any change backwards compatible, but in our experience that work has the greatest ROI of any work you can do. It only takes one really bad release - in which your site performance is significantly degraded for several hours or even days while you attempt to “fix forward” - for you to agree this is of the utmost importance. The thing most likely to give you an opportunity to find other work (i.e. “get fired”) is to roll out a product that destroys your business. In other words, if you are new to your job, DO THIS BEFORE ANYTHING ELSE; if you have been in your job for a while and have not done this, DO THIS TOMORROW. (Related Content: Monitoring for Improved Fault Detection)
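One common way to make rollback a configuration change rather than an emergency redeploy is to ship the new code path dark behind a feature flag. A minimal sketch - the function names, flag name, and pricing logic are hypothetical:

```python
def legacy_checkout(cart):
    """The existing, known-good path."""
    return sum(cart)

def new_checkout(cart):
    """New logic shipped dark; its writes stay compatible with the old path."""
    return round(sum(cart) * 0.95, 2)

FLAGS = {"new_checkout_flow": False}  # flip True at release; back to False to "roll back"

def checkout(cart, flags=FLAGS):
    """Both code paths are deployed; the flag, not a redeploy, selects one."""
    if flags["new_checkout_flow"]:
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout([10, 20]))                               # 30 (old path)
print(checkout([10, 20], {"new_checkout_flow": True}))  # 28.5 (new path)
```

Because both paths ship together and remain backwards compatible, a bad release becomes a flag flip instead of hours of “fixing forward.”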
2) Confusing Product Release with Product Success
Do you have “release” parties? Stop it! You are sending your team the wrong message! A release has nothing to do with creating shareholder value, and very often it is not even the end of your work with a specific product offering or set of features. Align your celebrations with achieving specific business objectives: a release increasing signups by 10%, increasing checkouts by 15%, increasing the average sale price of all checkouts by 12%, or increasing click-through rates by 22%. See #10 below on incenting a culture of excellence. Don’t celebrate the cessation of work – celebrate achieving the success that makes shareholders wealthy! (Related Content: Agile and the Cone of Uncertainty)
3) Insular Product Development / Engineering
How often does one of your engineering teams complain about not “being in the loop” or “being surprised” by a change? Does your operations team get surprised about some new feature and its associated load on a database? Does engineering get surprised by some new firewall or routing infrastructure resulting in dropped connections? Do not let your teams design in a vacuum and “throw things over the wall” to another group. Organize around your outcomes and “what you produce” in cross functional teams rather than around activities and “how you work.” (Related Content: The No Surprises Rule)
4) Over Engineering the Solution
One of our favorite company mottos is “simple solutions to complex problems”. The simpler the solution, the lower the cost and the faster the time to market. If you get blank stares from peers or within your organization when you explain a design do not assume that you have a team of idiots – assume that you have made the solution overly complex and ask for assistance in resolving the complexity.
5) Allowing History to Repeat itself
Organizations do not spend enough time looking at past failures. In the engineering world, a failure to look back and find the most commonly repeated mistakes is a failure to maximize the value of the team. In the operations world, a failure to correlate past site incidents and find thematically related root causes is a guarantee that you will fight the same fires over and over. The best and easiest way to improve future performance is to track past failures, group them by causation, and treat the root cause rather than the symptoms. Keep incident logs and review them monthly and quarterly for repeating issues to improve your performance. Perform postmortems of projects and site incidents and review them quarterly for themes.
6) Vendor Lock
Every vendor has a quick fix for your scale issues. If you are a hyper growth SaaS site, however, you do not want to be locked into a vendor for your future business viability; rather you want to make sure that the scalability of your site is a core competency and that it is built into your architecture. This is not to say that after you design your system to scale horizontally that you will not rely upon some technology to help you; rather, once you define how you can horizontally scale you want to be able to use any of a number of different commodity systems to meet your needs. As an example, most popular databases (and NoSQL solutions) provide for multiple types of native replication to keep hosts in synch.
7) Relying on QA to Find Your Mistakes
You cannot test quality into a system and it is mathematically impossible to test all possibilities within complex systems to guarantee the correctness of a platform or feature. QA is a risk mitigation function and it should be treated as such. Defects are an engineering problem and that is where the problem should be treated. If you are finding a large number of bugs in QA, do not reward QA – figure out how to fix the problem in engineering! Consider implementing test driven design as part of your PDLC. If you find problems in production, do not punish QA; figure out how you created them in engineering. All of this is not to say that QA should not be held responsible for helping to mitigate risk – they should – but your quality problems are an engineering issue and should be treated within engineering.
8) Revolutionary or “Big Bang” Fixes
In our experience, complete re-writes or re-architecture efforts end up somewhere on the spectrum of not returning the desired ROI to complete and disastrous failures. The best projects we have seen with the greatest returns have been evolutionary rather than revolutionary in design. That is not to say that your end vision should not be to end up in a place significantly different from where you are now, but rather that the path to get there should not include “and then we turn off version 1.0 and completely cutover to version 2.0”. Go ahead and paint that vivid description of the ideal future, but approach it as a series of small (but potentially rapid) steps to get to that future. And if you do not have architects who can help paint that roadmap from here to there, go find some new architects.
9) The Multiplicative Effect of Failure – Eliminate Synchronous Calls
Every time you have one service call another service synchronously, you are lowering your theoretical availability. If each of your services is designed to be 99.999% available - where a service is a database, application server, application, webserver, etc. - then the product of all of the service availabilities is your theoretical availability. Five synchronous calls yield (0.99999)^5, or 99.995% availability. Eliminate synchronous calls wherever possible and create fault-isolative architectures to help you identify problems quickly.
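The multiplicative math above is easy to verify directly; a small sketch:

```python
def composite_availability(availabilities):
    """Theoretical availability of a chain of synchronous calls is the
    product of each service's individual availability."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

# Five services in series, each 99.999% ("five nines") available:
chain = composite_availability([0.99999] * 5)
print(f"{chain:.3%}")  # 99.995%
```

Each additional synchronous hop multiplies in another factor below 1.0, which is why the fix is to remove hops (or make them asynchronous and fault-isolated), not to add hardware.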
10) Failing to Create and Incentivize a Culture of Excellence
Bring in the right people and hold them to high standards. You will never know what your team can do unless you find out how far they can go. Set aggressive yet achievable goals and motivate them with your vision. Understand that people make mistakes and that we will all ultimately fail somewhere, but expect that no failure will happen twice. If you do not expect excellence and lead by example, you will get less than excellence and you will fail in your mission of maximizing shareholder wealth. (Related Content: Three Reasons Your Software Engineers May Not Be Successful)
11) Under-Engineer for Scale
The time to think about scale is when you are first developing your platform. If you did not do it then, the time to think about scaling for the future is right now! That is not to say that you have to implement everything on the day you launch, but that you should have thought about how it is that you are going to scale your application services and your database services. You should have made conscious decisions about tradeoffs between speed to market and scalability and you should have ensured that the code will not preclude any of the concepts we have discussed in our scalability postings. Hold quarterly scalability meetings where you discuss what you need to do to scale to 10x your current volume and create projects out of the action items. Approach your scale needs in evolutionary rather than revolutionary fashion as in #8 above.
12) “Not Built Here” Culture
We see this all the time. You may even have agreed with point (6) above because you have a “we are the smartest people in the world and we must build it ourselves” culture. The point of relying upon third parties to scale was not meant as an excuse to build everything yourselves. The real point to be made is that you have to focus on your core competencies and not dilute your engineering efforts with things that other companies or open source providers can do better than you. Unless you are building databases as a business, you are probably not the best database builder. And if you are not the best database builder, you have no business building your own databases for your SaaS platform. Focus on what you should be the best at: building functionality that maximizes your shareholder wealth and scaling your platform. Let other companies focus on the other things you need like routers, operating systems, application servers, databases, firewalls, load balancers and the like.
13) A New PDLC will Fix My Problems
Too often CTOs see repeated problems in their product development life cycle - such as missed dates or dissatisfied customers - and blame the PDLC itself.
The real problem, regardless of the lifecycle you use, is likely one of commitment and measurement. For instance, in most Agile lifecycles there needs to be consistent involvement from the business or product owner. A lack of involvement leads to misunderstandings and delayed products. Another very common problem is an incomplete understanding or training on the existing PDLC. Everyone in the organization should have a working knowledge of the entire process and how their roles fit within it. Most often, the biggest problem within a PDLC is the lack of progress measurement to help understand likely dates and the lack of an appropriate “product discovery” phase to meet customer needs. (Related Content: The Top Five Most Common PDLC Failures)
14) Inability to Hire Great People Quickly
Often when growing an engineering team quickly, engineering managers will push back on hiring plans and state that they cannot possibly find, interview, and hire engineers who meet their high standards. We agree that hiring great people takes time, and hiring decisions are some of the most important decisions managers can make. A poor hiring decision takes a lot of energy and time to fix. However, there are many ways to streamline the hiring process in order to recruit, interview, and make offers very quickly. A useful idea that we have seen work well in the past is interview days, where potential candidates are all invited on the same day. This should be no more than 2-3 weeks out from the initial phone screen, so holding one interview day per month is a great way to get most of your interviewing done in a single day. Because you optimize the interview process, people are much more efficient, and it is much less disruptive to the daily work that needs to get done the rest of the month. Post-interview discussions and hiring decisions should all be made that same day so that candidates get offers or letters of regret quickly; this will increase the likelihood of offers being accepted and make a professional impression on those not getting offers. The key is to start with the right answer - “there is a way to hire great people quickly” - and the myriad ways to make it happen will be generated by a motivated leadership team.
15) Diminishing or Ignoring SPOFs (Single Point of Failure)
A SPOF is a SPOF, and even if the customer impact is low, a failure still pulls time away from other work because it must be fixed right away. And there will be a failure… because that is what hardware and software do: they work for a long time and then eventually fail! As you should know by now, it will fail at the most inconvenient time. It will fail when you have just repurposed the host you were saving for it, or while you are releasing code. Plan for the worst case and run it on two hosts (we actually recommend always deploying in pools of three or more hosts) so that when it does fail, you can fix it when it is most convenient for you.
16) No Business Continuity Plan
No one expects a disaster, but they happen, and if you cannot keep up normal operations of the business you will lose revenue and customers that you might never get back. Disasters can be huge, like Hurricane Katrina, where it takes weeks or months to relocate and restart the business in a new location. Disasters can also be small, like a winter snowstorm that keeps everyone at home for two days or a HAZMAT spill near your office that keeps employees from coming to work. A solid business continuity plan is thought through ahead of time, before you need it, and explains to everyone how they will operate in the event of an emergency. Perhaps your satellite office will pick up customer questions, or your tech team will open up an IRC channel to centralize communication for everyone capable of working remotely. Do you have enough remote connections through your VPN server to allow for remote work? Spend the time now to think through what and how you will operate in the event of a major or minor disruption of your business operations, and document the steps necessary for recovery.
17) No Disaster Recovery Plan
Even worse, in our opinion, than not having a BC plan is not having a disaster recovery plan. If your company is a SaaS company, the site and services provided are the company’s sole source of revenue! Moreover, as a SaaS company, you hold all the data that allows your customers to operate. When you are down, they are more than likely seriously impaired in conducting their own business. When your colocation facility has a power outage that takes you completely down - think of the 365 Main datacenter in San Francisco - how many of your customers will leave and never return? Our preference is to provide your own disaster recovery through multiple colocation facilities, but if that is not yet technically feasible or in the budget, at a minimum you need your code, executables, configurations, loads, and data offsite, plus an agreement in place for both colocation services and hosts. Many vendors offer such packages, and they should be thought of as necessary business insurance.
If you are cloud hosted, this still applies to you! We often find in technical due diligence reviews that small companies who are rapidly growing haven’t yet initiated a second active tech stack in a different availability zone or with a second cloud provider. Just because AWS, Azure and others have a fairly reliable track record doesn’t mean they always will. You can outsource services, but you still own the liability!
18) No Product Management Team or Person
In a similar vein to #13 above, there needs to be someone or a team of people in the organization who have responsibility for the product lines. They need to have authority to make decisions about what features get added, which get delayed, and which get deprecated (yes, we know, nothing ever gets deprecated but we can always hope!). Ideally these people have ownership of business goals (see #10) so they feel the pressure to make great business decisions.
19) Failing to Implement Continuously
Just because you call it scheduled maintenance does not mean that it does not count against your uptime. While some of your customers might be willing to endure the frustration of having the site down when they want to access it in order to get some new features, most care much more about the site being available when they want it. They are on the site because the existing features serve some purpose for them; they are not there in the hopes that you will rollout a certain feature that they have been waiting on. They might want new features, but they rely on existing features. There are ways to roll code, even with database changes, without bringing the site down (back to #17 - multiple active sites also allows for continuous implementation and the ability to roll back). It is important to put these techniques and processes in place so that you plan for 100% availability instead of planning for much less because of planned down time.
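One technique for rolling code with database changes and no downtime is the expand/contract (dual-write) pattern: write to both the old and new structures, read from the old until the backfill and validation are done, then cut reads over. A minimal sketch, with in-memory dicts standing in for the old and new schemas and a hypothetical transform standing in for the real schema change:

```python
class DualWriteRepo:
    """Transitional repository for a zero-downtime schema change: writes go
    to both stores; a switch decides which store serves reads."""
    def __init__(self, old_store, new_store, read_from_new=False):
        self.old, self.new = old_store, new_store
        self.read_from_new = read_from_new

    def save(self, key, value):
        self.old[key] = value           # legacy schema keeps working...
        self.new[key] = value.upper()   # ...while the new schema fills in (hypothetical transform)

    def load(self, key):
        return self.new[key] if self.read_from_new else self.old[key]

old, new = {}, {}
repo = DualWriteRepo(old, new)
repo.save("greeting", "hello")
print(repo.load("greeting"))   # "hello" - reads still served by the old schema
repo.read_from_new = True      # flip after backfill and validation
print(repo.load("greeting"))   # "HELLO" - cutover with no downtime
```

Every step is independently reversible: the read switch can flip back, and the old structure is only dropped once nothing reads or writes it.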
20) Firewalls, Firewalls, Everywhere!
We often see technology teams that have put all public facing services behind firewalls while many go so far as to put firewalls between every tier of the application. Security is important because there are always people trying to do malicious things to your site, whether through directed attacks or random scripts port scanning your site. However, security needs to be balanced with the increased cost as well as the degradation in performance. It has been our experience that too often tech teams throw up firewalls instead of doing the real analysis to determine how they can mitigate risk in other ways such as through the use of ACLs and LAN segmentation. You as the CTO ultimately have to make the decision about what are the best risks and benefits for your site.
Whatever you do, don’t make the mistakes above! AKF Partners helps companies avoid costly product and technology mistakes - and we’ve seen most of them. Give us a call or shoot us an email. We’d love to help you achieve the success you desire.
People Due Diligence
July 12, 2018 | Posted By: Robin McGlothin
Most companies do a thorough job of financial due diligence when they acquire other companies. But all too often, dealmakers simply miss or underestimate the significance of people issues. The consequences can be severe, from talent loss after a deal’s announcement, to friction or paralysis caused by differences in decision-making styles.
When acquirers do their people homework, they can uncover skills & capability gaps, points of friction, and differences in decision making. They can also make the critical people decisions - who stays, who goes, who runs the various lines of business, what to do with the rank and file at the time the deal is announced or shortly thereafter. Making such decisions within the first 90 days is critical to the success of a deal.
Take, for example, Charles Schwab’s 2000 acquisition of US Trust. Schwab and the nation’s oldest trust company set out to sign up the newly minted millionaires created by a soaring bull market. But the cultures could not have been farther apart - a discount do-it-yourself stock brokerage and a full-service provider devoted to pampering multimillionaires make for a difficult integration. Six years after the merger, Chuck Schwab came out of retirement to fix the issues related to culture clash. The acquisition is a textbook example of a common business problem: the dealmakers simply ignored or underestimated the significance of people and cultural issues.
Another example can be found in the 2002 acquisition of PayPal by eBay. The fact that many on the PayPal side referred to it as a merger set the stage for conflicting cultures. eBay was often embarrassed that PayPal invoice emails for a won auction arrived before the eBay end-of-auction email - PayPal made eBay look bad in this instance, and the technology teams were not eager to combine. PayPal titles also turned out to be one level higher than comparable eBay titles given the scope of responsibilities. Combining the technology teams did not go well and was ultimately scrapped in favor of dual teams - not the most efficient organizational model.
People due diligence lays the groundwork for a smooth integration. Done early enough, it also helps acquirers decide whether to embrace or kill a deal and determine the price they are willing to pay. There’s a certain amount of people due diligence that companies can and must do to reduce the inevitable fallout from the acquisition process and smooth the integration.
Ultimately, the success or failure of any deal has to do with people. Empowering people and putting them in a position where they will be successful is part of our diligence evaluation at AKF Partners. In our experience with clients, an acquiring company must start with some fundamental questions:
1. What is the purpose of the deal?
2. Whose culture will the new organization adopt?
3. Will the two cultures mesh?
4. What organizational structure should be adopted?
5. How will rank-and-file employees react to the deal?
Once those questions are answered, people due diligence can focus on determining how well the target’s current structure and culture will mesh with those of the proposed new company, who should be retained and by what means, and how to manage the reaction of the employee base.
In public, deal-making executives routinely speak of acquisitions as “mergers of equals.” That’s diplomatic, politically correct speak and usually not true. In most deals, there is not only a financial acquirer, there is also a cultural acquirer, who will set the tone for the new organization after the deal is done. Often, they are one and the same, but they don’t have to be.
During our Technology Due Diligence process at AKF Partners, we evaluate the product, technology and support organizations with a focus on culture and think through how the two companies and teams are going to come together. Who the cultural acquirer is depends on the fundamental goal of the acquisition. If the objective is to strengthen the existing product lines by gaining customers and achieving economies of scale, then the financial acquirer normally assumes the role of the cultural acquirer.
People due diligence, therefore, will be to verify that the target’s culture is compatible enough with the acquirer’s to allow for the building of necessary bridges between the two organizations. Key steps that are often missed in the process:
• Decide how the two companies will operate after the acquisition — combined either as a fully integrated operating company or as autonomous operating companies.
• Determine the new organizational structure and identify areas that will need to be integrated.
• Decide on the new executive leadership team and other key management positions.
• Develop the process for making employment-related decisions.
With regard to the last bullet point, some turnover is to be expected in any company merger; sometimes shedding employees is even planned. It is important to execute the Weed, Seed & Feed methodology on an ongoing basis, not just at acquisition time. Unplanned, significant levels of turnover negatively impact a merger’s success.
AKF Partners brings decades of hands-on executive operational experience, years of primary research, and over a decade of successful consulting experience to the realm of product organization structure, due diligence and technology evaluation. We can help your company successfully navigate the people due diligence process.
Tech Due Diligence - 5 Common Security Mistakes
June 28, 2018 | Posted By: Pete Ferguson
In the technical due diligence reviews we conduct for investment firms, we see five common mistakes in young startups and well-seasoned companies alike:
Lack of Security Mindset During Development: Security gets a bad rap for being overkill, costly, and a hindrance to fast growth when security teams fall into the “No It All” philosophy of saying “No” - especially when their focus on security overshadows any consideration of the revenue they halt or hinder in saying it. This tends to be true when organizations do not share the same risk-related goals, or when the security organization feels it is responsible only for risk and not for maximizing revenue at an appropriate level of risk. Good security requires a team effort and is difficult to add to a well-oiled machine as an afterthought. It is much easier at the onset of writing code, and including security checks in automated testing and QA will ensure security is baked in. Hold developers responsible for security, and hold security responsible for enabling revenue while protecting the company with a common-sense approach.
Failing to Separate Duties: As small companies grow larger, everyone usually continues to have access to everything, and many original employees wear multiple hats. Making sure no single individual controls the path from development to production also pays dividends in areas like business continuity. Separation of duties does not have to exist only between two employees - the separation can also be created by automation, as is the case in many successful continuous deployment/delivery shops deploying directly into production. Automated testing will additionally help with code compliance and quality assurance. Automate access control by role wherever possible, and regularly have business owners review and sign off on access control lists (at least monthly). My colleague James Fritz goes into greater detail in a separate article.
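Automated, role-based access control of the kind described above can start as simply as a permission map plus an audit trail that business owners can review. The roles, resources, and names below are illustrative only, not a prescription for any particular product.

```python
# Minimal sketch of role-based access control with a reviewable audit
# trail; role and resource names are hypothetical.
ROLE_PERMISSIONS = {
    "developer": {"read:staging", "write:staging"},
    "release_engineer": {"read:staging", "deploy:production"},
    "auditor": {"read:production"},
}

access_log = []  # in practice, an append-only log reviewed monthly

def is_allowed(user: str, role: str, action: str) -> bool:
    """Check a role's permissions and record the decision for audit."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    access_log.append((user, role, action, allowed))
    return allowed

# A developer cannot push straight to production: the separation of
# duties is enforced by the role model, not by trust.
assert is_allowed("ana", "developer", "write:staging")
assert not is_allowed("ana", "developer", "deploy:production")
```

The point is less the data structure than the discipline: every decision is derived from the role model and every decision leaves a record.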
Not Segregating and Encrypting Sensitive Data At Rest: Encrypting all data at rest may not make sense, but segregating all personally identifiable information (PII), financial, medical, and any other sensitive or confidential information into a separate, encrypted database is a better plan of attack. Even if you are not required to comply with PCI, HIPAA, or other regulations, limiting exposure of your customer and company confidential information is a best practice. You can add additional protection by tokenizing the information wherever possible. When there is a security breach (in today’s climate it is probably safe to say “when,” not “if”), it is really hard to explain to your customers why you didn’t encrypt their sensitive data at all times. Given recent headlines, this is now considered entry-level security table stakes and a safeguard required by your customers - no longer a nice-to-have optional item.
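Tokenization can be sketched in a few lines. In this illustration the token vault is an in-memory dict standing in for what would, in a real system, be a separate encrypted datastore with its own access controls; application tables then store only the opaque token, never the raw value.

```python
import secrets

# Hypothetical token vault: in production this mapping would live in a
# separate, encrypted datastore -- the dict here is for illustration only.
_vault: dict[str, str] = {}

def tokenize(pii_value: str) -> str:
    """Replace a sensitive value with an opaque, random token."""
    token = "tok_" + secrets.token_hex(16)
    _vault[token] = pii_value
    return token

def detokenize(token: str) -> str:
    """Recover the real value -- only via the separately secured vault."""
    return _vault[token]

# Application code and logs only ever see the token.
card = "4111-1111-1111-1111"
token = tokenize(card)
assert token != card
assert detokenize(token) == card
```

A breach of the application database then yields tokens with no intrinsic value, which is exactly the exposure-limiting property the paragraph above argues for.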
Checklist Only Mentality: In our experience, many auditors have until recently been focused primarily on checklist compliance - but times are changing, and the true test of compliance is moving from a checklist and certification to explaining your most recent data breach to your customers and stakeholders. Constantly working to safeguard and serve your customers will likely mean you easily fall within current and future security requirements, or can get there quickly. It is much easier to design security into your products now than to be relegated to doing it later because of a misstep, and it will do a lot more for customer adoption and retention.
This is just a summary of five common findings – there are certainly many others. The common denominator we find with successful companies is that they think holistically about their customers by building security into their products automatically, and they are able to scale and expand into new market segments more readily. Building in security as part of a holistic approach will also address areas such as business continuity, disaster recovery, resiliency, and the ability to roll back code.
Under the Hood - Our Security Questions for Technical Due Diligence
In our assessments, we cover each of the areas below, using these questions as guidelines for conversation - not a point-by-point Q&A. These are not a yes/no checklist; we rank our target against other similarly sized clients and industry averages. Each question receives a ranking from 1 to 4, with 4 being the highest score, and we then graph our findings against similar and competing companies within the market segment.
- Is there a set of approved and published information security policies used by the organization?
- Has an individual who has final responsibility for information security been designated?
- Are security responsibilities clearly defined across teams (i.e., distributed vs completely centralized)?
- Are the organization’s security objectives and goals shared and aligned across the organization?
- Has an ongoing security awareness and training program for all employees been implemented?
- Is a complete inventory of all data assets maintained with owners designated?
- Has a data categorization system been established and classified in terms of legal/regulatory requirements (PCI, HIPAA, SOX, etc.), value, sensitivity, etc.?
- Has an access control policy been established which allows users access only to network and network services required to perform their job duties?
- Are the access rights of all employees and external party users to information and information processing facilities removed upon termination of their employment, contract or agreement?
- Is multi-factor authentication used for access to systems where the confidentiality, integrity or availability of data stored has been deemed critical or essential?
- Is access to source code restricted to only those who require access to perform their job duties?
- Are the development and testing environments separate from the production/operational environment (i.e., they don’t share servers, are on separate network segments, etc.)?
- Are network vulnerability scans run frequently (at least quarterly) and vulnerabilities assessed and addressed based on risk to the business?
- Are application vulnerability scans (penetration tests) run frequently (at least annually or after significant code changes) and vulnerabilities assessed and addressed based on risk to the business?
- Are all data classified as sensitive, confidential or required by law/regulation (i.e., PCI, PHI, PII, etc.) encrypted in transit?
- Is testing of security functionality carried out during development?
- Are rules regarding information security included and documented in code development standards?
- Has an incident response plan been documented and tested at least annually?
- Are encryption controls being used in compliance with all relevant agreements, legislation and regulations? (i.e., data in use, in transit and at rest)
- Do you have a process for ranking and prioritizing security risks?
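The 1-to-4 rankings described above can be rolled up into an overall score and compared against a peer benchmark. In the sketch below, the question names, scores, and benchmark figure are illustrative placeholders, not AKF's actual scoring model.

```python
# Hypothetical per-question rankings on the 1-4 scale described above.
scores = {
    "published security policies": 3,
    "designated security owner": 4,
    "security awareness training": 2,
    "multi-factor authentication": 3,
}

target_avg = sum(scores.values()) / len(scores)
industry_benchmark = 2.8  # hypothetical peer average for this segment

# Flag individual questions scoring below the peer average; these are
# the gaps worth calling out in a diligence report.
gaps = sorted(q for q, s in scores.items() if s < industry_benchmark)

assert target_avg == 3.0
assert gaps == ["security awareness training"]
```

A simple roll-up like this makes the graphing step straightforward: each area's average can be plotted against the same benchmark across competing companies.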
Technology Due Diligence Checklists
June 6, 2018 | Posted By: Greg Fennewald
What should one examine during a technical due diligence or a technology due diligence? Are there industry best practices that should be used as a guide? Are there any particular items to request in advance? The answers to these questions are both simple and complex. Yes, there are industry best practices that can be used as a guide. Where to find them, and their inherent quality, is a more complex question.
AKF Partners has conducted countless diligence sessions spanning more than a decade, and we’ve developed lists of information to request in advance and items to evaluate during the due diligence session. These engagements range from very small pre-product seed-round companies to very large investments in companies with revenue greater than $2Bn ARR. We share those lists with you below with a note of caution: a technical due diligence is far more than a list of “yes” or “no” answers; evaluating the responses into a comprehensive measure of technical merit and risk is the secret sauce. The following presents a high-level list of technology due diligence questions and requests that can be used as a technology due diligence checklist or template.
10 Things to Request in Advance
1. Architectural overview of the platform and software.
2. Organizational chart of the technology team, including headcount, responsibilities, and any outsourced development or support.
3. Technology team budget and actuals for the previous 3 years. Ideally these should be separated into Gross Margin and Operating Margin activities.
4. 3rd party audit reports – vulnerability scans, penetration tests, PCI, HIPAA, and ISO certification reports.
5. 3rd party software in use, including open source software.
6. Current product and technology roadmaps, previous 2 years of major enhancements and releases.
7. Hosting plan – data centers, colocation, and cloud plus disaster recovery and business continuity plans.
8. Previous 3 years of material technology failures.
9. Previous 5 years of security breaches/incidents/investigations.
10. Description of the software development and delivery process.
Areas to Cover During the Due Diligence
The below represent a necessary, but often insufficient, set of technical due diligence questions. These questions are intended to jump-start a conversation leading to deeper levels of understanding of a service offering and the maintenance/operations of that service offering. Feel free to use them to start your own diligence program or to augment your existing program. But be careful – having a checklist does not make one a successful pilot.
Are incoming requests load balanced? If so, how? Is any persistence required for the purposes of session affinity? If so, why?
Are the sessions being stored? If yes, where? How? Why?
Are services separated across different servers? For what purpose?
Are services sized appropriately? What are the bounds and constraints?
Are databases dedicated to a subset of services/data?
Are users segmented into independent pods?
Do services communicate asynchronously?
Do services have dedicated databases? Is any database shared between services?
Are single points of failure eliminated?
Is infrastructure designed for graceful failure? Is software designed for graceful failure? What is the evidence of both?
Is N+1 an architectural requirement at all levels?
Are active-active or multiple AZs utilized? Tested?
Are data centers located in low risk areas?
To what does the company design in terms of both RPO and RTO?
Session and State
Are the solutions completely stateless?
Where, if anywhere, is session stored and why?
Is session required to be replicated between services?
Is session stored in browsers?
Is session stored in databases?
Is auto-scaling enabled?
Is reliance on costly 3rd party software mitigated?
Are stored procedures eliminated in the architecture?
Are servers distributed across different physical or virtual tiers?
Can cloud providers be easily switched?
Is the amount of technical debt quantified?
Is only necessary traffic routed through the firewall?
Are data centers located in low cost areas?
Can the product management team make decisions to alter features?
Are business goals owned by those who enact them?
Are success metrics used to determine when a goal is reached?
Is velocity measured?
Are coding standards documented and applied?
Are unit tests required?
Are feature flags standard?
Is continuous integration used?
Is continuous deployment utilized?
Are release payloads small and frequent vs. large and infrequent?
Can the product be rolled back if issues arise?
Is automated testing coverage greater than 75%?
Are changes being logged and made readily available to engineers?
Is load and performance testing being conducted prior to release?
Are incidents logged with enough detail to ascertain potential problems?
Are alerts sent real time?
Are systems designed for monitoring?
Are user behaviors (logins, downloads, checkouts) used to create business metric monitors?
Is remaining infrastructure headroom known?
Are post mortems conducted and fed back into the system?
Are teams seeded, fed and weeded?
Are teams aligned to the services or products they create?
Are teams cross-functional?
Are team goals aligned to top-level business goals?
Do teams sit together?
Are there approved and published security policies?
Are security responsibilities clearly defined?
Does the organization adhere to legal/regulatory requirements as necessary (PCI, HIPAA, SOX, etc)?
Has an inventory of all data assets been conducted and maintained?
Is multi-factor authentication in place?
Are vulnerability scans conducted?
Is a security risk matrix maintained?
Is production system access role based and logged?
AKF has conducted countless due diligence engagements over the last decade. We can take a checklist and add to it context and real world experience to create a finished product that captures the technical risks and merits of a company, improving the quality of your diligence process.
Technical Due Diligence: Did We Get It Right
April 4, 2018 | Posted By: Geoffrey Weber
AKF Partners has performed hundreds of technical due diligence engagements on behalf of strategic investors, private equity, venture capitalists and others. As such, we have amassed a significant repository of data, of patterns and anti-patterns, and of the personal characteristics of successful and unsuccessful executive teams. Moreover, our teams have been on the “other side” of the table countless times as executives for companies being acquired or looking to acquire.
It’s not rare, however, for a potential client to ask: how accurate is your technical due diligence? How can I trust that 8 hours with a company is enough to evaluate their technology stack? We love this question!
Due diligence, whether financial or technical is about assessing the risk the investor or acquirer is taking on by investing money into a company. It’s less about trying to identify and measure every gap and more about understanding where significant gaps exist relative to the investment thesis, calculating the risk to which those gaps expose the investor, and estimating the cost of closing the most critical gaps.
Due diligence is remarkably similar to playing poker: not any one player at the table has complete information about the cards remaining in the deck and the cards held by each player. Great poker players combine an ability to calculate probability nearly instantly with respect to the possible combination of cards as well as an ability to read the other players; looking for “tells” that inform the player as to whether their opponent is bluffing or playing with a pair of Aces heading into the “flop.” Great poker players seamlessly combine science and art.
At AKF, we’ve developed a formal checklist of questions around which we build our due diligence practice. In the big picture, this includes evaluating People, Technology, and Process. Our experience suggests that companies cannot build the right technology without the right people in charge, with the right individual contributors displaying the right behaviors to maximize success. Further, building reliable performance and predictable scalability (process, in other words) is of little value if they’re built on top of a weak technology stack.
People >> Technology >> Process
Read more about technical due diligence and AKF Partners here.
So how do we know we’re right?
First and foremost we need to understand the backgrounds, responsibilities and behaviors of the team present in the room with us. A dominant CEO who answers even the most technical questions, shutting out the rest of the team, is a red flag. We might ask specific questions of others and ask the CEO to let us listen to what the rest of the table has to say. If we’re still road-blocked, then our assessment will be very specific about the behavior of the CEO and we might ask for additional time without the CEO present. Another red flag: a CTO answering a technical question while the VP of Engineering’s face turns purple; an engineering manager choking a piece of paper to death while listening to the CTO review architectural principles; a senior leader refusing eye contact... the list of cues is nearly endless. We’ve seen all of them.
Seeing red flags early in an engagement is wonderful, and it allows us time to adjust the agenda on the fly. If we’re hearing something that sounds implausible, we will focus on the area in question and fire off repeated questions. Nothing helps like asking the same question three times in three different ways at three different times during an engagement. We can then judge the quality of the answers by the variety of answers.
Everything is Perfect
The biggest red flag of all: when things are too good. No downtime; the architecture scales across all three axes of the AKF Scalability Cube without human interaction; obscure or bleeding-edge technologies implemented without significant issues; and, most of all, always on time and under budget. Each of these red flags highlights an area that begs for further digging. This is a good time to start asking for real data. Let’s see the logs in real time; demonstrate that 2-factor works as described; let’s watch a build. This is also a good time to get familiar with Benford’s Law, which states that “in many naturally occurring collections of numbers, the leading significant digit is likely to be small.” Math is useful here, just as it was for Bernie Madoff and just as it is for the poker player seeing a 5th Ace being dealt.
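Benford's Law is straightforward to check in practice. The sketch below compares a dataset's leading-digit distribution against the expected frequencies, log10(1 + 1/d) for digits 1 through 9; fabricated or uniformly distributed figures tend to deviate far more than naturally occurring, multiplicative data.

```python
import math
from collections import Counter

def benford_deviation(values) -> float:
    """Return the maximum absolute deviation of the leading-digit
    distribution of `values` from Benford's Law, P(d) = log10(1 + 1/d)."""
    digits = [int(str(abs(v)).lstrip("0.")[0]) for v in values if v]
    if not digits:
        return 0.0
    counts = Counter(digits)
    n = len(digits)
    return max(
        abs(counts.get(d, 0) / n - math.log10(1 + 1 / d))
        for d in range(1, 10)
    )

# Uniformly spread figures (as in naive fabrication) deviate sharply,
# while exponential growth -- typical of real financial series -- fits.
fabricated = list(range(100, 1000))            # leading digits uniform
natural = [1.05 ** i for i in range(1, 500)]   # multiplicative growth
assert benford_deviation(natural) < benford_deviation(fabricated)
```

A single metric like this doesn't prove fraud, of course; like the poker player's read, it only tells you where to dig deeper.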
Assessments with significantly dysfunctional teams are a little easier and our assessments reflect this by candidly reporting that we could not validate certain facts or statements due to these dysfunctions. Either we come back and dig more deeply or we meet with a different group of people with the investor.
The Truth about Broken Things