AKF Partners

Abbott, Keeven & Fisher Partners – Partners In Hyper Growth

AKF is launching an Information Security Service for Clients

In today’s digital world, cyber security is one of the issues most likely to keep CEOs and business executives awake at night. Further, regulation and compliance are continually changing, as both governments and customers modify the rules in an attempt to keep pace with rapid changes in technology. Keeping up with these changes in regulation, and particularly how they apply to your business, is a monumental challenge. As a recent example, if you operate in both the U.S. and Europe, you’re probably familiar with Safe Harbor (https://en.wikipedia.org/wiki/International_Safe_Harbor_Privacy_Principles). Until late 2015, Safe Harbor permitted U.S. companies to store EU customer data on U.S. soil, so long as the infrastructure and usage met the privacy standards of the Safe Harbor framework. However, in October 2015, the European Court of Justice struck down the US-EU Safe Harbor agreement, leaving companies again wondering what approach they should take. Then, on Feb 29, 2016, the European Commission introduced the EU-US Privacy Shield (http://europa.eu/rapid/press-release_IP-16-433_en.htm), which is similar to Safe Harbor but, in conjunction with the Judicial Redress Act signed by President Obama, extends the ability of European citizens to challenge U.S. companies that store their sensitive personal information. With changes coming this quickly, what is your company to do?


In addition to those challenges, customers place ever greater demands on companies: they expect you not only to protect sensitive customer data and IP, but also to store and process that sensitive data in a highly available and scalable manner. Balancing the need to maintain data on behalf of customers with regulation and compliance is one of the biggest challenges for any modern company, whether or not you store or process sensitive data for your customers. Sometimes just operating in a regulated industry subjects your company to the long reach of compliance standards.

Finally, who is in charge of security in your company? Is it spread across teams? Is there a single CSO/CISO and security organization? How well do the security organizations work with your technology and business teams?

Our clients ask us frequently if we can assess their security programs and help develop plans for them, similar to the work we do with system architecture and product development. At AKF, we look at security the same way we look at availability and scaling. It is about managing risk. You can build a system that has nearly 100% availability, and you can build a system that scales nearly infinitely… but you don’t always need to, as it’s not always the most cost-effective way to run your business.


You want to build a system that achieves very high availability, up to the point where the cost of “near perfect” availability exceeds the value it brings to the business. Similarly, you may never be able to completely eliminate every risk to your information security. Some industries, like health care and credit card processing, need to be “near perfect.” Others may not. Some may just require ample controls, tightly monitored systems, secure coding practices, and a top-notch Security Incident Response plan.

Finally, there is plenty of overlap among regulations; controls applied once can address several of your security compliance concerns. For example, PCI, SOX, and HIPAA all have requirements around auditability and access control, although they apply to different types of data. If you’re subject to more than one of them, why not put controls in place that satisfy all of them?


We have begun a new program to help our clients with their security needs. Our approach is to help you understand your regulation and compliance landscape, look at the security measures you’ve put in place, and work with you to design a program and projects to close the gaps, focusing on the highest value projects first. We’ll also help you understand where security fits into your organization, and guide you on how to break down barriers that prevent organizations from working cohesively to manage security across the enterprise.

If you are interested in our security program services, please contact us.


Paying Down Your Technical Debt

During the course of our client engagements, there are a few common topics or themes that are almost always discussed, and the clients themselves usually introduce them. One such area is technical debt. Every team has it, almost every team believes they have too much of it, and no team has an easy time explaining to the business why it’s important to address. We’re all familiar with the concept of technical debt, and we’ve shared a client horror story in a previous blog post that highlights the dangers of ignoring it: Technical Debt


The words “technical debt” carry a negative connotation. However, the judicious use of tech debt is a valuable part of your product development process. Tech debt is analogous to financial debt. Companies can raise capital to grow their business by either issuing equity or issuing debt. Issuing equity means giving up a percentage of ownership in the company and dilutes current shareholder value. Issuing debt requires the payment of interest, but does not give up ownership or dilute shareholder value. Issuing debt is good, until you can’t service it. Once you have too much debt and cannot pay the interest, you are in trouble.

Tech debt operates in the same manner. Companies use tech debt to defer performing work on a product. As we develop our minimum viable product, we build a prototype, gather feedback from the market, and iterate. The parts of the product that didn’t meet the definition of minimum or the decisions/shortcuts made during development represent the tech debt that was taken on to get to the MVP. This is the debt that we must service in later iterations. Taking on tech debt early can pay big dividends by getting your product to market faster. However, like financial debt, you must service the interest. If you don’t, you will begin to see scalability and availability issues. At that point, refactoring the debt becomes more difficult and time critical. It begins to affect your customers’ experience.


Many development teams have a hard time convincing leadership that technical debt is a worthy use of their time. Why spend time refactoring something that already “works” when you could use that time to build new features customers and markets are demanding now? The danger with this philosophy is that by the time technical debt manifests itself as a noticeable customer problem, it’s often too late to address it without a major undertaking. It’s akin to not having a disaster recovery plan when a major availability outage strikes. To get the business on board, you must make the case in language business leaders understand – which is often financial in nature. Be clear about the cost of such efforts and quantify the business value they will bring by calculating their ROI. Demonstrate the cost avoidance achieved by addressing critical debt sooner rather than later – calculate how much more it will cost in the future if the debt is not addressed now. The best practice is to get leadership to agree and commit to a certain percentage of development time allocated to addressing technical debt on an ongoing basis. If they do, it’s important not to abuse that commitment. Do not let engineers alone determine which technical debt should be paid down and at what rate – the work must have true business value that is greater than or equal to spending that time on other activities.
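
As a rough, hedged illustration of framing that conversation in financial terms (the figures, function, and parameter names below are invented for the sketch, not taken from an AKF model), a back-of-the-envelope ROI calculation might look like this:

    # Back-of-the-envelope ROI for paying down a piece of tech debt.
    # All figures are hypothetical placeholders -- substitute your own estimates.

    def tech_debt_roi(refactor_cost, annual_drag_cost,
                      incident_cost, incident_probability):
        """First-year ROI of paying the debt down now instead of deferring it."""
        # "Interest" avoided: ongoing drag plus the expected cost of an incident.
        avoided_cost = annual_drag_cost + incident_cost * incident_probability
        return (avoided_cost - refactor_cost) / refactor_cost

    # Example: a $60k refactor avoids ~$50k/year of drag and a 30% chance of a $200k incident.
    roi = tech_debt_roi(refactor_cost=60_000, annual_drag_cost=50_000,
                        incident_cost=200_000, incident_probability=0.3)
    print(f"First-year ROI: {roi:.0%}")  # ~83% -- an argument leadership can evaluate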


Additionally, be clear about how you define technical debt so time spent paying it down is not commingled with other activities. Generally, bugs in your code are not technical debt. Refactoring your code base to make it more scalable, however, would be. A good test is to ask whether the path you chose was a conscious or unconscious decision: did you decide to go in one direction knowing that you would later need to refactor? This, too, is analogous to financial debt; technical debt needs to be a conscious choice. You are making a specific decision to do or not do something, knowing that you will need to address it later. Bugs found in sloppy code are not tech debt; they are just bad code.

So how do you decide which tech debt should be addressed, and how do you prioritize? If you have been tracking work with Agile storyboards and product backlogs, you should have an idea of where to begin. Also, if you track your problems and incidents as we recommend, they will show the elements of tech debt that have already begun to manifest themselves as scalability and availability concerns. We recommend budgeting 12-25% of your development effort for servicing tech debt. Set a budget and begin paying down the debt. If you are spending less than the lower end of that range, you are not investing enough effort. If you are spending more than 25%, you are probably fixing issues that have already manifested themselves and are trying to catch up. Setting an appropriate budget and maintaining it over the course of your development efforts will pay down the interest and help prevent issues from arising.

Taking on technical debt to fund your product development efforts is an effective way to get your product to market more quickly. But, just like financial debt, you need to take on an amount of tech debt that you can service by making the necessary interest and principal payments to reduce the outstanding balance. Failing to set an appropriate budget will result in a technical “bankruptcy” that will be much harder to dig yourself out of later.


The Biggest Problem with Big Data

How to Reduce Crime in Big Cities

I’m a huge Malcolm Gladwell fan. Gladwell’s ability to convey complex concepts and virtually incomprehensible academic research in easily understood prose is second to none within his field of journalism. A perfect example of his skill is on display in The Tipping Point, where Gladwell wrestles the topic of Complexity Theory (aka Chaos Theory) into submission, making it accessible to all of us. In The Tipping Point, Gladwell also introduces us to the Broken Windows Theory.

The Broken Windows Theory gets its name from a 1982 article in The Atlantic Monthly. The article asked the reader to imagine a building with a few broken windows. The authors claimed that the existence of these broken windows invites vandals to break still more windows. A continuous cycle of expanding vandalism ensues, with squatters moving in, nearby buildings getting vandalized, etc. Subsequent authors expanded upon the theory, claiming that the presence of vandalism invites other crimes and that crime rates soar in communities where unaddressed vandalism is present. A corollary to the Broken Windows Theory is that cities can reduce crime rates by focusing law enforcement on petty crimes. Several high profile examples seem to illustrate the power and correctness of this theory, such as New York Mayor Giuliani’s “Zero Tolerance Program.” The program focused on vandalism, public drinking, public urination, and subway fare evasion. Crime rates dropped over the following 10-year period, corresponding with the initiation of the program. Several other cities and other experiments showed similar effects. Proof, it would seem, that the hypothesis underpinning the theory is correct.

Not So Fast…

Enter the self-described “Rogue Economist” Steven Levitt and his co-author Stephen Dubner – both of Freakonomics fame. While the two authors don’t deny that the Broken Windows theory may explain some drop in crime, they cast significant doubt on the approach as the primary explanation for falling crime rates. Crime rates dropped nationally during the same 10-year period in which New York pursued its Zero Tolerance Program. This national drop in crime occurred both in cities that practiced Broken Windows policing and in cities that did not. Further, crime rates dropped irrespective of increases or decreases in police spending. The explanation, therefore, argue the authors, cannot primarily be Broken Windows. The most likely explanation, and the most highly correlated variable, is a reduction in the pool of potential criminals. Roe v. Wade legalized abortion, and as a result there was a significant decrease in the number of unwanted children, a disproportionately high percentage of whom would have grown up to become criminals.

Gladwell isn’t necessarily incorrect, then, in proffering Broken Windows as an explanation for the reduction in crime. But it is not the best explanation available, and as a result it sits somewhere between misleading (worst case) and incomplete (best case).

What Happened?

To be fair, it’s hard to hold Gladwell accountable for this oversight.  Gladwell is not a scientist and therefore not trained in how to scientifically evaluate the research he reported.  Furthermore, his is an oft repeated mistake even among highly trained researchers.  And what exactly is that mistake?  The mistake made here is illustrated by the difference in approach between the Broken Windows researchers and the Freakonomics authors.  The Broken Windows researchers started with something like the following question “Does the presence of vandalism invite additional vandalism and escalating crime?”  Levitt and Dubner first asked the question “What variables appear to explain the rate of crime?”

Broken Windows started with a question focused on deductive analysis. Deduction starts with a hypothesis – “Evidence of vandalism and/or other petty crimes invites similar and more egregious crimes” – and proceeds to attempt to confirm or disprove that hypothesis. Deduction starts with a broad and abstract view of the data – a generalization or hypothesis as to relationships – and attempts to show specific relationships between data elements. The Broken Windows researchers started with a hypothesis, developed a series of experiments to test it, and ultimately evaluated time series data in cities with various Broken Windows approaches to policing. What they lacked was a broad question that might have surfaced a range of other possible causes.

The Freakonomics authors started with an inductive question. Induction is the process of moving from specific observations about data to generalizations. These generalizations are often in the form of hypotheses or models of how the data interact. Induction helps to inform what questions should be asked of the data. Induction asks, “What change in which independent variables seems to correspond with a resulting change in some dependent variable?” Whereas deduction works from independent variable to dependent variable, induction attempts to work backwards from the dependent variable to identify independent variable relationships.

So What?

The jump to deduction, without forming the right questions and hypotheses through induction, is the biggest mistake we see in developing Big Data programs and implementing Big Data solutions. We all approach problems with unique experiences and unique biases. The combination of these often causes us to race to hypotheses and want to test them. The issue here is two-fold. The best case is that we develop an incomplete (and as a result partially or mostly incorrect) answer, similar to that of the Broken Windows researchers. The worst case is that we suffer what statisticians call a Type 1 error – confirming an incorrect answer. The probability of Type 1 errors increases when we don’t look for alternative or better explanations for outcomes within our data sets. Induction helps to uncover those alternative or supporting explanations. Exploring the data to discover potential relationships helps us ask the right questions and form better hypotheses and better models. Skipping induction makes it highly probable that we will get an incorrect, misleading, or substandard answer.
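
To make the two modes concrete, here is a minimal sketch on a purely synthetic data set (the variable names, the pandas/SciPy tooling, and the numbers are illustrative assumptions, not the crime research itself): an inductive pass that scans many candidate variables for relationships with the outcome, followed by a deductive test of the one hypothesis the scan surfaced.

    # Induction vs. deduction on a small synthetic data set (not real crime data).
    import numpy as np
    import pandas as pd
    from scipy import stats

    rng = np.random.default_rng(42)
    n = 200  # hypothetical city-year observations

    df = pd.DataFrame({
        "petty_crime_enforcement": rng.normal(size=n),
        "police_spending":         rng.normal(size=n),
        "unemployment_rate":       rng.normal(size=n),
        "at_risk_population":      rng.normal(size=n),
    })
    # The synthetic outcome is driven mostly by one variable, plus noise.
    df["crime_rate"] = 0.8 * df["at_risk_population"] + rng.normal(scale=0.5, size=n)

    # Induction: cast a wide net -- which variables even appear related to the outcome?
    correlations = df.corr()["crime_rate"].drop("crime_rate")
    print(correlations.sort_values(key=abs, ascending=False))

    # Deduction: test one specific hypothesis suggested by the inductive pass.
    r, p_value = stats.pearsonr(df["at_risk_population"], df["crime_rate"])
    print(f"at_risk_population vs crime_rate: r={r:.2f}, p={p_value:.3g}")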

But it is not enough to simply ensure that we practice both induction and deduction. We must also recognize that the solutions that support induction are different from those that support deduction. Further, we must understand that the two processes, while complementary, can actually interfere with each other when performed on the same system. Induction is necessarily a very broad and, as a result, slow and tedious process. Deduction, on the other hand, needs significantly less data and “prefers” to be faster in implementation. Inductive work is best supported by solutions that impose very little structure or relationship on the data we observe. Systems that support deduction, in order to allow for faster response times, impose more structure relative to inductive systems. While the two phases of discovery (induction and deduction) support each other, their differences suggest that they should be performed on solutions purpose-built for their specific needs.

Similarly, not everyone is equally qualified to perform both induction and deduction.  Our experience is that the folks who tend to be good at determining how to prove relationships between variables are often not as good at identifying patterns and vice versa.

These two observations, that the systems that support induction and deduction should be separated and that the people performing these tasks may need to be different, have ramifications to how we develop our analytics systems and organize our Big Data teams.  We’ll discuss these ramifications and more in our next post, “10 Anti-Patterns within Big Data”.


Data Driven Product Development

When we bring up the idea of data driven product development, almost every product manager or engineering director will quickly say that of course they do this. Probing further, we usually hear about A/B testing, but of course they only do it when they aren’t sure what the customer wants. To all of this we say “hogwash!” Testing one landing page against another is a good idea, but it’s a far cry from true data driven product development. The bottom line is that if your agile teams don’t have business metrics (e.g. increase the conversion rate by 5% or decrease the shopping cart abandonment rate by 2%) that they measure after every release to see whether they are getting closer to achieving them, then you are not data driven. Most likely you are building products based on the HiPPO: the Highest Paid Person’s Opinion. In case that person isn’t always around, there’s this service.

Let’s break this down. Traditional agile processes have product owners seeking input from multiple sources (customers, stakeholders, the marketplace). All these diverse inputs are considered and eventually combined into a product vision that helps prioritize the backlog. In the most advanced agile processes, the backlog is ranked based on effort and expected benefit. This is almost never done. And when it is done, the effort and benefit are almost never validated afterwards to see how accurate the estimates were. We even have an agile measurement for the effort side: velocity. Most agile teams don’t bother with the velocity measurement and have no hope of measuring the expected benefit.
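
For contrast, here is a minimal sketch of what ranking by expected benefit per unit of effort, and then validating the estimate after release, could look like. The stories, story points, and dollar figures are invented purely for illustration.

    # Rank a backlog by expected benefit per story point, then check the estimate
    # after release. Stories, points, and benefits are invented for illustration.
    backlog = [
        # (story, effort in points, expected benefit, e.g. weekly revenue lift in $)
        ("one-click reorder", 8, 12_000),
        ("new landing page",  3,  4_000),
        ("faster search",    13, 10_000),
    ]

    ranked = sorted(backlog, key=lambda s: s[2] / s[1], reverse=True)
    for story, effort, benefit in ranked:
        print(f"{story:20s} expected benefit per point: {benefit / effort:,.0f}")

    # After the release, compare what the metric actually did against the estimate.
    measured = {"one-click reorder": 7_500}  # hypothetical measured weekly lift
    for story, _, expected in backlog:
        if story in measured:
            print(f"{story}: achieved {measured[story] / expected:.0%} of expected benefit")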

Would we approach anything else like this? Would we find it acceptable if other disciplines worked this way? What if doctors practiced medicine with input from patients and pharmaceutical companies but without monitoring how their actions impacted the patients’ health? We would probably think that they got off to a good start, understanding the symptoms from the patient and the possible benefits of certain drugs from the pharmaceutical representatives. But, they failed to take into account the most important part of the process. How their treatment was actually performing against their ultimate goal, the improved health of the patient.

We argue that you’re wasting time and effort doing anything that isn’t measured against a success factor. Why code a single line if you don’t have a hypothesis that this feature, enhancement, story, etc. will benefit a customer and in turn provide greater revenue for the business? Without this feedback mechanism, your effort would be better spent taking out the trash, watering the plants, and cleaning up the office. Here’s what we know about most people’s gut instinct with regard to products: it stinks. Read Lund’s very first Usability Maxim: “Know thy user, and YOU are not thy user.” What does that mean? It means that you know more about your product, industry, service, etc. than any user ever will. Your perspective is skewed because you know too much. Real users don’t know where everything is on the page; they have to search for it. Real users lie when you ask them what they want. Real users have bosses, spouses, and children distracting them while they use your product. As Henry Ford supposedly said, “If I had asked customers what they wanted, they would have said a faster horse.” You can’t trust yourself and you can’t trust users. Trust the data. Form a hypothesis: “If I add a checkout button here, shoppers will convert more.” Then add the button and watch the data.
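
As a minimal, hedged sketch of “watching the data” (the visitor and conversion counts are hypothetical, and statsmodels is just one of several libraries offering this test), comparing conversion between the control and the variant with a two-proportion z-test might look like this:

    # Did the new checkout button actually lift conversion? Counts are hypothetical.
    import numpy as np
    from statsmodels.stats.proportion import proportions_ztest

    conversions = np.array([410, 480])      # control, variant (with the new button)
    visitors    = np.array([10_000, 10_000])

    stat, p_value = proportions_ztest(conversions, visitors)
    lift = conversions[1] / visitors[1] - conversions[0] / visitors[0]

    print(f"absolute lift: {lift:.2%}, p-value: {p_value:.3f}")
    if p_value < 0.05 and lift > 0:
        print("Hypothesis supported -- keep the feature and celebrate the business result.")
    else:
        print("No measurable benefit -- iterate next sprint or deprecate the code.")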

This is not A/B or even multivariate testing. It is much more than that. It’s changing the way you develop products and services. Remember, Agile is a business process, not just a development methodology. Nothing gets shipped without an expected benefit that is measured. If you didn’t achieve the business benefit that was expected, you either try again next sprint or you deprecate the code and start over. Why leave code in a product that isn’t achieving business results? It only serves to bloat the code base and make maintenance harder. Data driven product development means we don’t celebrate shipping code. That’s supposed to happen every two weeks, or even daily if we’re practicing continuous delivery. You haven’t achieved anything by pushing code to production. When the customer interacts with that code and a business result is achieved, then we celebrate.

Data driven product development is how you explain the value of what the product development team is delivering. It’s also how you ensure your agile teams are empowered to make meaningful impact for the business. When the team is measured against real business KPIs that directly drive revenue, it raises the bar on product development. Don’t get caught in the trap of measuring success with feature releases or sprint completions. The real measurement of success is when the business metric shows the predicted improvement and the agile team becomes better informed of what the customers really want and what makes the business better.


Guest Post: Three Mistakes in Scaling Non-Relational Databases

The following guest post comes from another long time friend and business associate Chris LaLonde.  Chris was with us through our eBay/PayPal journey and joined us again at both Quigo and Bullhorn.  After Bullhorn, Chris and co-founders Erik Beebe and Kenny Gorman (also eBay and Quigo alums) started Object Rocket to solve what they saw as a serious deficiency in cloud infrastructure – the database.  Specifically they created the first high speed Mongo instance offered as a service.  Without further ado, here’s Chris’ post – thanks Chris!

Three Mistakes in Scaling Non-Relational Databases

In the last few years I’ve been lucky enough to work with a number of high profile customers who use and abuse non-relational databases (mostly MongoDB) and I’ve seen the same problems repeated frequently enough to identify them as patterns. I thought it might be helpful to highlight those issues at a high level and talk about some of the solutions I’ve seen be successful.

Scale Direction:

At first everyone tries to scale things up instead of out.  Sadly that almost always stops working at some point. So generally speaking you have two alternatives:

  • Split your database – yes, it’s a pain, but everyone gets there eventually. You probably already know some of the data you can split out: users, billing, etc. Starting early will make your life much simpler and your application more scalable down the road. However, the number of people who never consider that they’ll need a 2nd or a 3rd database is frightening. Oh, and one other thing: put your analytics someplace else; the fewer things placing load on your production database from the beginning, the better. Copying data off of a secondary is cheap.
  • Scale out – Can you offload heavy reads or writes? Most non-relational databases have horizontal scaling functionality built in (e.g. sharding in MongoDB). Don’t let all the negative articles fool you; these solutions do work. However, you’ll want an expert on your side to help pick the right variable or key to scale against (e.g. shard keys in MongoDB). Seek advice early, as these decisions will have a huge impact on future scalability; a minimal sketch of the MongoDB case follows this list.
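
Here is that sketch, assuming the MongoDB case Chris mentions. The database, collection, and shard-key names are placeholders, and the commands are issued through pymongo against a mongos router; choosing the key itself is the part that deserves expert advice.

    # Rough sketch: enable sharding for a hypothetical collection via a mongos router.
    # Database, collection, and shard-key names are placeholders.
    from pymongo import MongoClient

    client = MongoClient("mongodb://mongos.example.internal:27017")

    # Enable sharding on the database, then shard the collection on a hashed key
    # chosen for high cardinality and even write distribution.
    client.admin.command("enableSharding", "appdb")
    client.admin.command("shardCollection", "appdb.events", key={"user_id": "hashed"})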

Pick the right tool:

Non-relational databases are very different by design, and most “suffer” from some of the “flaws” of the eventually consistent model. My grandpa always said “use the right tool for the right job”; in this case that means if you’re writing an application or product that requires your data to be highly consistent, you probably shouldn’t use a non-relational database. You can make most modern non-relational databases more consistent with configuration and management changes, but almost all lack any form of ACID compliance. Luckily, in the modern world databases are cheap; pick several, get someone to manage them for you, and always use the right tool for the right job. When you need consistency, use an ACID-compliant database; when you need raw speed, use an in-memory data store; and when you need flexibility, use a non-relational database.

Write and choose good code:

Unfortunately, not all database drivers are created equal. While I understand that you should write code in the language you’re strongest in, sometimes it pays to be flexible. There’s a reason why Facebook writes code in PHP, transpiles it to C++, and then compiles it into a binary for production. In the case of your database, the driver is _the_ interface to your database, so if the interface is unreliable, difficult to deal with, or rarely updated, you’re going to have a lot of bad days. If the interface is stable, well documented, and frequently updated, you’ll have a lot more time to work on features instead of fixes. Make sure to take a look at the communities around each driver; look at the number of bugs reported and how quickly those bugs are being fixed. Another thing about connecting to your database: please remember nothing is perfect, so write code as if it’s going to fail. At some point, some component (the network, a NIC, a load balancer, a switch, or the database itself) is going to fail and cause your application to _not_ be able to talk to your database. I can’t tell you how many times I’ve talked to or heard of a developer assuming that “the database is always up, it’s not my responsibility,” and that’s exactly the wrong assumption. Until the application knows that the data is written, assume that it isn’t. Always assume that there’s going to be a problem until you get an “I’ve written the data” confirmation from the database; to assume otherwise is asking for data loss.
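
A minimal pymongo sketch of that mindset, assuming a hypothetical orders collection: require a majority-acknowledged write and treat anything short of an acknowledgment as a failure to retry or surface. The retry policy here is deliberately simplistic.

    # Assume the write failed until the database says otherwise.
    # Collection name, document, and retry policy are illustrative only.
    import time
    from pymongo import MongoClient
    from pymongo.errors import PyMongoError
    from pymongo.write_concern import WriteConcern

    client = MongoClient("mongodb://db.example.internal:27017")
    orders = client.appdb.get_collection(
        "orders", write_concern=WriteConcern(w="majority", wtimeout=5000)
    )

    def save_order(doc, attempts=3):
        """Return True only once the database has acknowledged the write."""
        for attempt in range(1, attempts + 1):
            try:
                result = orders.insert_one(doc)
                if result.acknowledged:      # the "I've written the data" confirmation
                    return True
            except PyMongoError:
                time.sleep(0.5 * attempt)    # crude backoff before retrying
        return False                         # caller must treat this as data NOT written

    if not save_order({"order_id": 123, "total": 49.95}):
        raise RuntimeError("Order not durably written; do not tell the user it succeeded")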

These are just a few quick pointers to help guide folks in the right direction. As always the right answer is to get advice from an actual expert about your specific situation.


When Should You Split Services?

The Y axis of the AKF Scale Cube indicates that growing companies should consider splitting their products along service (verb) or resource (noun) oriented boundaries. A common question we receive is “how granular should a services split be?” A similar question is “how many swim lanes should our application be split into?” To help answer these questions, we’ve put together a list of considerations based on developer throughput, availability, scalability, and cost. By considering these, you can decide whether your application should be grouped into a large, monolithic codebase or split up into smaller, individual services and swim lanes. You must also keep in mind that splitting too aggressively can be overly costly and return little for the effort involved. Companies with little to no growth will be better served by focusing their resources on developing a marketable product than by fine-tuning their service sizes using the considerations below.

 

Developer Throughput:

Frequency of Change – Services with a high rate of change in a monolithic codebase cause competition for code resources and can create a number of time-to-market-impacting conflicts between teams, including product merge conflicts. Such high-change services should be split off into small, granular services and ideally placed in their own fault-isolative swim lane so that frequent updates don’t impact other services. Services with low rates of change can be grouped together, as there is little value created from disaggregation and a lower risk of being impacted by updates.

The diagram below illustrates the relationship we recommend between functionality, frequency of updates, and relative percentage of the codebase. Your high risk, business critical services should reside in the upper right portion being frequently updated by small, dedicated teams. The lower risk functions that rarely change can be grouped together into larger, monolithic services as shown in the bottom left.

[Diagram: frequency of updates vs. functionality as a percentage of the codebase]

Degree of Reuse – If libraries or services have a high level of reuse throughout the product, consider separating and maintaining them apart from code that is specialized for individual features or services. A service in this regard may be something that is linked at compile time, deployed as a shared, dynamically loadable library, or operated as an independent runtime service.

Team Size – Small, dedicated teams can handle microservices with limited functionality and high rates of change, or large functionality (monolithic solutions) with low rates of change. This gives them a better sense of ownership, increases specialization, and allows them to work autonomously. Team size also has an impact on whether a service should be split: the larger the team, the higher the coordination overhead inherent to the team and the greater the need to consider splitting the team to reduce codebase conflict. In this scenario, we are splitting the product primarily to reduce the size of the team and thereby reduce product conflicts. Ideally, splits would be made based on evaluating the availability increases they allow, the scalability they enable, or how they decrease the time to market of development.

Specialized Skills – Some services may need development skills that are distinct from the remainder of the team’s. You may, for instance, need some portion of your product to run very fast. That portion may require a compiled language and a great depth of knowledge in algorithms and asymptotic analysis. These engineers may have a completely different skillset than those working on the remainder of your code base, which may in turn be interpreted and mostly focused on user interaction and experience. In other cases, you may have code that requires deep domain experience in a very specific area, like payments. Each of these is an example of a consideration that may indicate a need to split into a service and may inform the size of that service.

 

Availability and Fault Tolerance Considerations:

Desired Reliability – If other functions can afford to be impacted when the service fails, then you may be fine grouping them together into a larger service. Indeed, sometimes certain functions should NOT work if another function fails (e.g. one should not be able to trade in an equity trading platform if the solution that understands how many equities are available to trade is not available). However, if you require each function to be available independent of the others, then split them into individual services.

Criticality to the Business – Determine how important the service is to business value creation while also taking into account the service’s visibility. One way to view this is to measure the cost of one hour of downtime against a day’s total revenue. If the business can’t afford for the service to fail, split it up until the impact is more acceptable.
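
As a trivial, hedged sketch of that calculation (the revenue figures are invented), note that a real version would weight the hour by traffic, since revenue is rarely spread evenly across the day:

    # Rough cost of one hour of downtime, with invented figures.
    daily_revenue = 240_000      # hypothetical revenue for the day
    peak_hour_share = 0.09       # fraction of the day's revenue earned in the affected hour

    even_spread_estimate = daily_revenue / 24
    peak_hour_estimate = daily_revenue * peak_hour_share

    print(f"even-spread estimate: ${even_spread_estimate:,.0f} per hour of downtime")
    print(f"peak-hour estimate:   ${peak_hour_estimate:,.0f} per hour of downtime")
    # If the business can't absorb numbers like these, keep splitting the service.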

Risk of Failure – Determine the different failure modes for the service (e.g. a billing service charging the wrong amount), the likelihood and severity of each failure mode, and how likely you are to detect the failure should it happen. The higher the risk, the greater the segmentation should be.

 

Scalability Considerations:

Scalability of Data – A service may already be a small percentage of the codebase, but as the data that the service needs to operate on scales up, it may make sense to split again.

Scalability of Services – What is the volume of usage relative to the rest of the services? For example, one service may need to support short bursts during peak hours while another has steady, gradual growth. If you separate them, you can address their needs independently without having to over engineer a solution to satisfy both.

Dependency on Other Service’s Data – If the dependency on another service’s data can’t be removed or handled with an asynchronous call, the benefits of disaggregating the service probably won’t outweigh the effort required to make the split.

 

Cost Considerations:

Effort to Split the Code – If the services are so tightly bound that it will take months to split them, you’ll have to decide whether the value created is worth the time spent. You’ll also need to take into account the effort required to develop the deployment scripts for the new service.

Shared Persistent Storage Tier – If you split off the new service, but it still relies on a shared database, you may not fully realize the benefits of disaggregation. Placing a read-only DB replica in the new service’s swim lane will increase performance and availability, but it can also raise the effort and cost required.

Network Configuration – Does the service need its own subdomain? Will you need to make changes to load balancer routing or firewall rules? Depending on the team’s expertise, some network changes require more effort than others. Ensure you include these changes in the total cost of the split.

 

 

The illustration below can be used to quickly determine whether a service or function should be segmented into smaller microservices, be grouped together with similar or dependent services, or remain in a multifunctional, infrequently changing monolith.

[Diagram: decision matrix for segmenting, grouping, or keeping services monolithic]


The Downside of Stored Procedures

I started my engineering career as a developer at Sybase, the company whose relational database went head-to-head with Oracle’s. This was in the late ’80s, and Sybase could lay claim to a very fast RDBMS with innovations like one of the earliest client/server architectures, procedural language extensions to SQL in the form of Transact-SQL, and, yes, stored procedures. Fast forward a career, and now, as an Associate Partner with AKF, I find myself encountering a number of companies that are unable to scale their data tier, primarily because of stored procedures. I guess it is true – you reap what you sow.

Why does the use of stored procedures (sprocs) inhibit scalability? To start with, a database filled with sprocs is burdened by the business logic contained in those sprocs. Rather than relatively straightforward CRUD statements, sprocs typically become the main application ‘tier’, containing hundreds of lines of SQL within a single sproc. So, in addition to persisting and retrieving your precious data, maintaining referential integrity, and providing transactional consistency, you are now asking your database to run the bulk of your application as well. Too many eggs in one basket.

And that basket will remain a single basket, as it is quite difficult to horizontally scale a database filled with sprocs. Of the 3 axes described in the AKF Scale Cube, the only axis along which you can readily achieve horizontal scalability is the Z axis, where you divide or partition your data across multiple shards. The X axis, the classic horizontal approach, works only if you are able to replicate your database across multiple read-only replicas and segregate read-only activity out of your application, which is difficult to do if many of those reads come from sprocs that also write. Additionally, most applications built on top of a sproc-heavy system have no DAL, no data access layer that might control or route read access to a different location than writes. The Y axis, dividing your architecture up into services, is also difficult in a sproc-heavy environment, as these environments are often extremely monolithic, making it very difficult to separate sprocs by service.
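
As a minimal sketch of the data access layer such systems usually lack (connection targets, table, and SQL are placeholders, with sqlite3 standing in for a real client/server driver), the point is simply that once reads and writes are issued from application code rather than sprocs, read traffic can be routed to replicas:

    # Minimal data access layer (DAL) sketch: writes go to the primary, reads can be
    # routed to a read replica. Connection targets, table, and SQL are placeholders;
    # sqlite3 merely stands in for your real database driver.
    import sqlite3

    class OrderDAL:
        def __init__(self, primary_dsn, replica_dsn):
            self.primary = sqlite3.connect(primary_dsn)   # read/write connection
            self.replica = sqlite3.connect(replica_dsn)   # read-only connection

        def create_order(self, customer_id, total):
            # Business logic lives in the application tier, not in a stored procedure.
            with self.primary:
                self.primary.execute(
                    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                    (customer_id, total),
                )

        def orders_for_customer(self, customer_id):
            # Read-only traffic can be pointed at any replica (X-axis scale).
            cursor = self.replica.execute(
                "SELECT id, total FROM orders WHERE customer_id = ?", (customer_id,)
            )
            return cursor.fetchall()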

Your database hardware is typically the beefiest, most expensive box you will have. Running business logic on it, rather than on the smaller, commodity boxes typically used for an application tier, makes the cost of computation inordinately high. Also, many companies deploy object caching (e.g. memcached) to offload their database. This works fine when the application logic is separate from the database, but what about when the reads are primarily done in sprocs? I’d love to see someone attempt to insert memcached into their database.
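
Here is a sketch of the cache-aside pattern that paragraph alludes to, with a plain dict standing in for memcached and a stub in place of the real query. The point is that the check happens in the application tier, in front of the database, which is not possible when the read lives inside a sproc.

    # Cache-aside sketch: the application checks the cache before the database.
    # A dict stands in for memcached; db_lookup is a stub for the real query.
    import time

    cache = {}            # key -> (timestamp, value)
    TTL_SECONDS = 300

    def db_lookup(user_id):
        # Placeholder for the real (expensive) database read.
        return {"id": user_id, "name": "example"}

    def get_user(user_id):
        key = f"user:{user_id}"
        hit = cache.get(key)
        if hit and time.time() - hit[0] < TTL_SECONDS:
            return hit[1]                     # served from cache; database untouched
        user = db_lookup(user_id)             # cache miss; go to the database once
        cache[key] = (time.time(), user)
        return user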

Sprocs seem to be something of a gateway drug, where once you have tasted the flavor of application logic living in the database, the next step is to utilize abilities such as CLR (common language runtime). Sure, why not have the database invoke C# routines? Why not ask your database machine to also perform functions such as faxing? If Microsoft allows it, it must be good, right?

Many of the companies we visit that are suffering from stored procedure bloat started out building a simple application that was deployed in a turnkey fashion for a single customer with a small number of users. In that world, there were certainly benefits to using sprocs, but that world was the 20th century; it is no longer viable in today’s world of cloud and SaaS models.

Just say no to stored procedures!


Definition of MVP

We often use the term minimum viable product, or MVP, but do we all agree on what it means? In the Scrum spirit of the Definition of Done, I believe a Definition of MVP is worth stating explicitly within your tech team. A quick search revealed these three similar yet different definitions:

  • A minimum viable product (MVP) is a development technique in which a new product or website is developed with sufficient features to satisfy early adopters. The final, complete set of features is only designed and developed after considering feedback from the product’s initial users. Source: Techopedia
  • In product development, the minimum viable product (MVP) is the product with the highest return on investment versus risk…A minimum viable product has just those core features that allow the product to be deployed, and no more. Source: Wikipedia
  • When Eric Ries used the term for the first time he described it as: A Minimum Viable Product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort. Source: Leanstack

I personally like a combination of these definitions. I would choose something along the lines of:

A minimum viable product (MVP) has sufficient features to solve a problem that exists for customers and provides the greatest return on investment while reducing the risk of what we don’t know about our customers.

Just like no two teams implement Agile the same way, we don’t all have to agree on one definition of MVP, but all your team members should agree on yours. Otherwise, what is an MVP to one person is a full-featured product to another. Take a few minutes to discuss it with your cross-functional agile team and come to a decision on your Definition of MVP.


Guest Post – Effective CTOs

The following guest post comes from long time friend and former colleague Michael “Skippy” Wilson.  Michael was one of eBay’s first employees, and its first CTO.  He was responsible first for taking “eBay 1.0” (Pierre Omidyar’s PERL scripts) and converting them into the modern site written in C with which most people are familiar.  He and his architecture team subsequently redesigned the site with multiple service oriented splits (circa 2000) to enable it to scale to what was then one of the largest transaction sites on the internet.  There was no “playbook” back then for how to scale something as large as eBay.  Michael was one of the early CTOs who had to earn his internet PhD from the school of hard knocks.

Effective CTOs

The excellent blog post This is How Effective CTOs Embrace Change talks about how Twilio CTO Evan Cooke views the evolution of a CTO through a company’s growth.

While this article does an excellent job of identifying the root causes and problems a CTO can run into – especially when working with his fellow C-level colleagues – I think there is another problem it does not address.

Ironically, one of a CTO’s biggest challenges is actually technology itself and how the CTO manages the company’s relationship to it. These challenges manifest themselves in the following ways:

Keeping appropriate pace with technological change

When a company is young, it’s agile enough to adopt the latest technologies quickly; in fact many companies change technologies several times over as they grow.

Later, when (as the article says) the CTO starts saying “No” to their business partners, they may also react to the company’s need for stability and predictability by saying “No” too often to technological change. It only takes one good outage or release slippage triggered by some new technology for the newly minted CTO to go from being the “Yes” man to the “No” man, sacrificing agility at the altar of stability and predictability.

The same CTO who advocated changing languages, database back ends, and even operating systems and hardware platforms while the company was growing finds himself saying “No” to even the most innocuous of upgrades. While the initial result may be more stability and predictability, the end result is far from that: the company’s platform ossifies, becomes brittle and is eventually unable to adapt to the changing needs of the organization.

For example, years ago, before the domination of open-source Unix variants like OpenBSD and Linux, many companies “grew up” on proprietary systems like Solaris, Windows, and AIX, alongside database counterparts like Oracle and Sybase. While those companies were “growing up,” open source systems matured, offering cost, technology, and even stability benefits over the proprietary solutions. But in many cases these companies couldn’t take advantage of this because of the perceived “instability” of those crazy “open source” operating systems. The end result was that competitors who remained agile gained a competitive advantage because they could advance their technology.

Another (glaring) example was the advent of mobile. Most established companies completely failed to get on the mobile bandwagon soon enough and often ended up ceding their positions to younger, more agile competitors because they failed to recognize and keep up with the shift in technology.

The problem is a lot like Real Life ™. How do you continue to have the fun you had in your 20s and 30s later on, when things like career and family take center stage? Ironically, the answer is the same: Moderation and Separation.

Moderation means just what it says: use common-sense release and deployment planning to introduce change at a predictable – and recoverable – rate. Changing to a new database, a new hardware platform, or even a new back-end OS isn’t a big change at all, as long as you find a way to do it in an incremental and recoverable manner. In other words, you can get out of it before anyone notices something went wrong.

Separation means you build into the organization the ability to experiment and advance all the time. While your front-end production systems may advance at a mature pace, you still need to maintain alpha and beta systems where new technologies can be advanced, experimented with, and exposed to (willing) end users. By building this into your organization as a “first class” citizen, the CTO keeps the spirit of agility and technological advancement alive while still meeting the needs of stability and predictability.

Making sure Technology keeps pace with the Organization

The best way to describe this problem is through example: Lots of today’s start-ups are built on what are called “NoSQL” database platforms. NoSQL databases have the advantages of being very performant, in some cases very scalable, and, depending on who you ask, very developer friendly. It’s very clear that lots of companies wouldn’t be where they are without the likes of MongoDB and Couchbase.

But, as many companies have found out, not all NoSQL solutions are created equal, and the solution selected in the company’s early years may not be appropriate as the company grows and its needs change.

For example, as the organization matures, parts of the organization will start asking for reports, and the company may find that while its NoSQL solution worked well as an OLTP solution, it doesn’t work as well for OLAP or data warehousing needs. Or a NoSQL solution that worked well for warehousing data doesn’t work as well when you give your customers the ability to query it online, in an OLTP-like fashion.

This can occur when the CTO isn’t able to help guide the company’s technology forward fast enough to keep pace with the changing organizational needs. Unfortunately, if the problem reaches the “critical” stage because, for example, the organization can’t get reports, the solution may become a politically charged hack job instead of a well-planned solution.

Every organization – engineering, database administration, operations, etc. – will want to “solve” the problem as quickly as possible, so while the end solution may solve the immediate problem, it probably won’t be the one you’d have chosen if you’d planned ahead.

The solution here is for the CTO to continuously engage with the Company to understand and anticipate its upcoming technology needs, and engage with Engineering organizations to plan for them. It’s actually better for the CTO to arrange to engage the stakeholders and engineering organizations together rather than serially: it encourages the stakeholders to work together instead of through the CTO.

Making sure Technology keeps pace with Security requirements

Today’s engineers have an amazing community of Open Source Developers and Tools to draw on to get their work done.

And when a company is growing, tools like NPM, grunt, brew, and Hadoop are amazing. In many cases, they enable development to move exponentially more quickly than it otherwise could.

Unfortunately, many of these tools are gigantic security risks. When an engineer types “grunt install” (a Node-related command) or “brew update,” do they really know exactly what it is doing? Do they know where it’s getting the update? Do they know whether the updates can be trusted?

They don’t. And let’s not even go into the horrors they’re inadvertently inviting behind your firewall.
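
One minimal, hedged mitigation sketch (the file path and digest below are placeholders): verify the checksum of anything pulled from outside the firewall against a digest published by the upstream project before you use it, rather than trusting the pipe.

    # Verify a downloaded artifact against a known-good digest before using it.
    # The file path and expected digest are placeholders.
    import hashlib
    import sys

    EXPECTED_SHA256 = "replace-with-the-digest-published-by-the-upstream-project"

    def sha256_of(path):
        digest = hashlib.sha256()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(65536), b""):
                digest.update(chunk)
        return digest.hexdigest()

    if sha256_of("vendor/some-tool-1.2.3.tar.gz") != EXPECTED_SHA256:
        sys.exit("Checksum mismatch: refusing to install an untrusted artifact")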

The problem for the CTO is that they probably had a hand in introducing these tools and behaviors to the organization, and will now be faced with reining them in. “Fixing” them will probably make them harder for engineers to use, and feel like a hassle instead of a common-sense security measure.

The CTO can expect the same situation when charged with guiding the organization towards things like always-on HTTPS, better encryption-in-place of customer data, etc. Not only will they have to convince their engineering organizations it’s important, they may also have to convince the rest of the organization that these measures deserve a place in line along with other product features.

One solution to this is a backward-looking one: security should be part of the product and the company’s culture from day one. Not just protecting the company’s assets from unauthorized access, but protecting the customer’s assets (data) just as vigilantly. A culture like that would recognize the two examples above and either have solved them already or be willing to solve them quickly, without politics getting in the way.

If the CTO’s organization wasn’t built that way, their job becomes much harder, since they now need to change the company’s culture. Fortunately, recent events like the Sony hack and Snowden’s revelations make this much easier than it was two years, or even a year, ago.

Conclusion

The effective CTO faces at least two challenges: becoming an effective part of the company’s executive team (not an “outside advisor”), and an effective part of the company’s technology and product teams. The CTO can’t forget that their job is to keep the company on the appropriate forefront of changing technology – far enough ahead to keep pace, but not so far ahead as to be “bleeding edge.” Almost universally, the way to do that is to make technological change part of the company’s culture – not a special or one-off event.

–Michael Wilson


Just for Fun – The Many Meanings of “I Don’t Disagree”

OK.  Who was the knucklehead who thought up the ridiculous corporatism of “I don’t disagree”?

Seriously.  What does this mouth turd even mean?

Does it mean that you simply don’t want to commit to a position?  Do you need more time to determine a position?  Is your ego too large to admit that someone else is right?  Do you really disagree but are too spineless to say it?  Were you not listening and needed to spout some statement to save face?

We owe it to each other to speak clearly, to engage in meaningful dialogue about important topics, and to convey our thoughts and positions.  When political correctness overrides fiduciary responsibility, we destroy stakeholder value.  Get rid of this piece of mouth trash from your repertoire.
