AKF Partners

Abbott, Keeven & Fisher Partners – Partners In Hyper Growth


AKF Turns 10 – And It’s Still Not About the Tech

The caller ID was blocked, but Marty had been expecting the call. Three “highly connected” people – donors, political advisers, and “inner circle” members – had suggested AKF could help. It was October 2013, and Healthcare.gov had launched only to crash when users tried to sign up. President Obama appointed Jeffrey Zients to mop up the post-launch mess. Once the crisis was over, the Government Accountability Office (GAO) released its postmortem citing inadequate capacity planning, software coding errors, and lack of functionality as root causes. AKF’s analysis was completely different – largely because we think differently than most technologists. While our findings indicated the bottlenecks that kept the site from scaling, we also identified failures in leadership and a dysfunctional organization structure. These latter, and more important, problems prevented the team from identifying and preventing recurring issues.

We haven’t always thought differently. Our early focus in 2007 was helping companies overcome architectural problems related to scale and availability. We’ve helped our clients solve some of the largest and most challenging problems ever encountered – Cyber Monday ecommerce purchasing, Christmas Day gift card redemption, and April 15th tax filings. But shortly after starting our firm, we realized there was something common to our early engagements that created and sometimes turbocharged the technology failures. This realization – that people and processes, NOT TECHNOLOGY, are the causes of most failures – led us to think differently. Too often we see technology leaders focusing too much on the technology and not enough on leading, growing, and scaling their teams.

We challenge the notion that technology leaders should be selected and promoted based on their technical acumen. We don’t accept that a technical leader should spend most of her time making the biggest technical decisions. We believe that technical executives, to be successful, must first be business executives with great technical and business acumen. We teach teams how to analyze and successfully choose the appropriate architecture, organization, and processes to achieve a business outcome. Product effort is meaningless without a measurable and meaningful business outcome, and we always put outcomes, not technical “religion,” first.

If we can teach a team the “AKF way,” the chance of project and business success increases dramatically. This may sound like marketing crap (did we mention we are also irreverent?), but our clients attest to it. This is what Terry Chabrowe, CEO of eMarketer, said about us:

AKF served as our CTO for about 8 months and helped us make huge improvements in virtually every area related to IT and engineering. Just as important, they helped us identify the people on our team who could move into leadership positions. The entire AKF team was terrific. We’d never have been able to grow our user base tenfold without them.

A recent post claimed that 93% of successful companies abandon their original strategy.  This is certainly true for AKF. Over the past 10 years we’ve massively changed our strategy of how we “help” companies. We’ve also quadrupled our team size, worked with over 350 companies, written three books, and most importantly made some great friendships. Whether you’ve read our books, engaged with our company, or connected with us on social media, thanks for an amazing 10 years. We look forward to the next 10 years, learning, teaching, and changing strategies with you.



AKF Interim Leadership Case Study

Read about one of our success stories in which we filled an interim CTO role at a marketing subscription company in New York.

AKF Long-term Case Study



Big Data Anti-Patterns

Recent research by Gartner indicates that while Big Data investment continues to grow, the intent to invest in Big Data projects is finally showing signs of tapering. While this is a natural part of the hype cycle, poor ROI on Big Data projects continues to impact the industry. Gartner cites an effective “lack of business leadership” around Big Data initiatives as a primary cause. We’ve kept a watchful eye on these Big Data trends to better serve our clients.

In a recent presentation, Marty Abbott addressed the struggles of getting Big Data analytics right. He identified seven anti-patterns that hamper the proper implementation of Big Data initiatives and provided suggested remedies for each.

Anti-Pattern #1 — Assuming you know the answers.

By setting out to confirm a particular hypothesis, you end up building an analytics system hard-wired to answer a narrow set of questions. This lack of extensibility blinds you to deeper insights.

Gain greater insights by focusing on correlations without presumption and adopting an analytics process that follows a continuous cycle of induction, hypothesis, deduction, and data creation (see the hermeneutic cycle).

Anti-Pattern #2 — No QA in analytics.

Big Data analytics systems are just as prone to bugs and defects as production systems, if not more so. All too often, analytics systems are designed as if they will execute perfectly.

Improve QA by building a pre-processing, cleaning stage into your data flow while retaining raw, immutable data elsewhere. Maintain prior results and implement the ability to run analytics over your entire dataset to reverify results.
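
As a concrete illustration, here is a minimal Python sketch of such a cleaning stage; the directory layout, field names, and validation rules are assumptions made up for this example. The important property is that the raw landing zone is never modified, so the whole pass can be re-run at any time to reverify earlier results.

```python
# Hedged sketch: a pre-processing/cleaning stage that never mutates raw data.
# Directory layout, field names, and validation rules are illustrative.
import json
from pathlib import Path

RAW_DIR = Path("data/raw")      # immutable, append-only landing zone
CLEAN_DIR = Path("data/clean")  # derived data, always re-creatable from RAW_DIR


def clean_record(record: dict):
    """Reject records missing required fields; normalize the rest."""
    if "user_id" not in record or "event" not in record:
        return None  # rejected here, but the raw copy stays on disk untouched
    return {"user_id": str(record["user_id"]), "event": record["event"].lower()}


def run_cleaning_pass() -> None:
    CLEAN_DIR.mkdir(parents=True, exist_ok=True)
    for raw_file in RAW_DIR.glob("*.jsonl"):
        cleaned = []
        for line in raw_file.read_text().splitlines():
            result = clean_record(json.loads(line))
            if result is not None:
                cleaned.append(result)
        # Raw input is left untouched, so this entire pass can be re-run later
        # to reverify results produced by earlier analytics runs.
        (CLEAN_DIR / raw_file.name).write_text(
            "\n".join(json.dumps(r) for r in cleaned)
        )
```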

Anti-Pattern #3 — Using production engineers to run your analytics.

Big Data involves different skill sets, different technologies, and, perhaps most importantly, a different mindset from production operations. Data scientists and analytics experts are explorers who theorize correlations, while production developers are operationalists, making pragmatic decisions to keep systems running.

Improve effectiveness by creating a separate team focused on implementing analytics architecture.

Anti-Pattern #4 — Using OLTP Datastore for Analytics

Using your transactional datastore for analytics will impact the performance of your operational systems. Furthermore, transactional and analytics data have different requirements for normalization and response time latency.

Separate analytics from transactional systems by using a separate Operational Datastore and ingesting data from your application in an asynchronous fashion.
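
Below is a minimal Python sketch of that pattern. The in-process queue and the stand-in list are purely illustrative; in a real deployment the queue would be a durable broker (Kafka, SQS, or similar) and the consumer would write into your separate operational data store.

```python
# Hedged sketch: the transactional path enqueues an event and returns; a
# separate consumer writes it to the analytics store at its own pace. An
# in-process queue and a plain list stand in for a durable broker and the
# operational data store, purely for illustration.
import queue
import threading

event_queue: "queue.Queue[dict]" = queue.Queue()
analytics_store: list = []  # stand-in for the separate operational data store


def record_order(order: dict) -> None:
    """OLTP path: write to the transactional database (not shown), then enqueue."""
    event_queue.put(order)  # non-blocking; OLTP latency is unaffected


def ingest_worker() -> None:
    """Analytics path: drain events into the separate store asynchronously."""
    while True:
        event = event_queue.get()
        analytics_store.append(event)
        event_queue.task_done()


threading.Thread(target=ingest_worker, daemon=True).start()
record_order({"order_id": 1, "amount": 42.50})
event_queue.join()  # for the demo only: wait until the event is ingested
print(analytics_store)
```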

Anti-Pattern #5 — Ignoring Performance Requirements

Performance matters. The longer you take to produce results, the more out of date they become. Furthermore, no one can leverage results if they can’t quickly and easily access them.

Improve analytics performance by using cluster-computing technologies such as Apache Spark and Hadoop. Store frequently queried results in a data mart for quick access.
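
As an example of that flow, here is a hedged PySpark sketch; the S3 paths, column names, and the daily-revenue aggregation are invented for illustration. The expensive scan runs on the cluster, while the small result set lands in a data-mart location that dashboards can query cheaply.

```python
# Hedged sketch: cluster-scale aggregation in PySpark with results written to a
# small, fast "data mart" table. The S3 paths and column names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Heavy lifting runs on the cluster over the full history of raw data.
orders = spark.read.parquet("s3://analytics-raw/orders/")
daily = orders.groupBy("order_date").agg(
    F.sum("amount").alias("revenue"),
    F.count(F.lit(1)).alias("order_count"),
)

# The small, frequently queried result lands in the data mart location so
# dashboards hit a compact table instead of re-scanning the raw dataset.
daily.write.mode("overwrite").parquet("s3://analytics-mart/daily_revenue/")
```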

Anti-Pattern #6 — OLTP is the only input to your Big Data system

If transactional data is the only input to your analytics you’re missing out on the much larger picture.

Customer interactions, social media scrapes, stock performance, Google analytics data, and transaction logs should be brought together to form a comprehensive picture.

Anti-Pattern #7 — Anyone can run their own Analytics

A well-built analytics system can become the victim of its own success. As it becomes more popular among stakeholders, overall performance may tank.

To preserve performance, move aggregated data out of analytics processing into OLAP/Datamarts and create a chargeback system allocating resources to users/departments.



Why Airline On-Time Reliability Is Not Likely To Improve

As a frequent traveler who can exceed 250K air miles per year, I experience flight delays almost every week. Waiting for a delayed flight recently allowed me time to think about how airlines experience the multiplicative effect of failure just like our systems do. If you’re not familiar with this phenomenon, think back to the holiday lights your parents put up. If any one bulb burnt out, none of the lights worked, because all 50 lights in the string were wired in series – either every bulb worked or none did. Our systems are often designed the same way: they require many servers, databases, network devices, etc. to work for every single request. If any device or system in that request-response chain is unavailable, the entire process fails. The problem is that if five devices each have 99.99% availability, the total availability of the overall system is that availability raised to the power of the number of devices, i.e. 99.99%^5 = 99.95%. This has become even more of a problem with the advent of microservices (see our paper on how to fix this problem, http://reports.akfpartners.com/2016/04/20/the-multiplicative-effect-of-failure-of-microservices/). Now, let’s get back to airlines.
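
The arithmetic is easy to check; this short snippet simply restates the example above.

```python
# The arithmetic from the paragraph above: five components in series, each
# with 99.99% availability.
availability_per_device = 0.9999
devices_in_series = 5
system_availability = availability_per_device ** devices_in_series
print(f"{system_availability:.4%}")  # ~99.95% -- every hop in the chain costs you
```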

Airlines spend a lot on airplanes – the average Boeing 737-900 costs over $100M (http://www.boeing.com/company/about-bca/index.page%23/prices) – and they want to make use of those assets by keeping them in the air, not on the ground. To accomplish this, airlines schedule airplanes on back-to-back trips. For example, a recent United flight departing SFO for SAN was being serviced by Airbus A320 #4246. This aircraft had the following schedule for the day.

Depart          Arrive
SFO  8:05am     SAN  9:44am
SAN 10:35am     SFO 12:15pm
SFO  1:20pm     SEA  3:28pm
SEA  4:18pm     SFO  6:29pm
SFO  7:30pm     SAN  9:05pm

If I’m on the last flight of the day (SFO->SAN), each previous flight has to be “available” (on-time) for me to have any chance of my request (flight) being serviced properly (on-time). As you can undoubtedly see, this aircraft schedule (SFO->SAN->SFO->SEA->SFO->SAN) is very similar to a request-response chain for a website (browser->router->load balancer->firewall->web server->app server->DB). The availability of the entire system is the multiplied availability of each component. Is it any wonder that airlines have such dismal availability? See the chart below for August 2016, with El Al coming in at a dismal 36.81%, my preferred airline, United, at 77.88%, and the very best, KLM, at only 91.07%.

[Chart: Airline on-time performance, August 2016] (Source: http://www.flightstats.com/company/monthly-performance-reports/airlines/)

So, how do you fix this? In our systems, we scale by splitting them by either common attributes such as customers (Z-axis) or different attributes such as services (Y-axis). In order to maintain high availability, we ensure that we have fault isolation zones around the splits. Think of this as taking the string of 50 lights, breaking it into five sets of 10 lights, and ensuring that no set affects another. This allows us to scale up by adding sets of 10 lights while maintaining or even improving our availability.
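
A quick back-of-the-envelope comparison shows why the isolation matters; the per-bulb availability below is just the same illustrative 99.99% figure used earlier.

```python
# One 50-light string wired in series vs. five fault-isolated sets of 10.
bulb_availability = 0.9999

one_string_of_50 = bulb_availability ** 50        # any single failure darkens all 50
one_isolated_set_of_10 = bulb_availability ** 10  # a failure darkens only its own set

print(f"single 50-light string fully lit: {one_string_of_50:.4%}")
print(f"each isolated 10-light set lit:   {one_isolated_set_of_10:.4%}")
# With five isolated sets, a single failure affects at most 20% of the lights
# (read: customers) instead of 100% of them.
```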

Airlines could achieve improved customer-perceived on-time reliability by essentially splitting their flights by customers (Z-axis). Instead of a large airplane such as the Boeing 737-900 with 167 passengers, they could fly two smaller planes such as the Embraer ERJ170-200, each with only 88 passengers. When one plane is delayed, only half the passengers are affected. Airlines are not likely to do this because larger airplanes generally have a lower cost per seat than smaller ones. The other way to increase on-time reliability would be to fault isolate the flight segments. Instead of stacking flights back-to-back with less than one hour between arrival and the next scheduled departure, increase this window enough to make up for minor delays. Alternatively, at hub airports, introduce a swap aircraft into the schedule to break the dependency chain. Both of these options are unlikely to appeal to airlines because they would also increase the cost per seat.

The bottom line is that there are clearly ways in which airlines could increase their on-time performance, but they all increase costs. We advise our clients that they need the amount of performance and availability that improves customer acquisition or retention, but no more. Building more scalability or availability into the product than is needed takes resources away from features that serve the customer and the business better. Airlines are in a similar situation: they need the amount of on-time reliability that doesn’t drive customers to competitors, but no more than that. Otherwise the airlines would have to either increase the price of tickets or reduce their profits. Given the dismal on-time reliability, I just might be willing to pay more for better on-time reliability.



Achieving Maximum Availability in the Cloud

We often hear clients confidently tell us that Amazon’s SLA of 99.95% for EC2 instances provides them with plenty of availability. They believe that the combination of auto scaling compute instances with a persistent data store such as RDS provides all the scalability and availability that they will ever need. This unfortunately isn’t always the case. In this article we will focus on AWS services but this applies to any IaaS or PaaS provider.

The SLA of 99.95% is not the guaranteed availability of your services. It’s the availability of EC2 for an entire region in AWS. From the AWS EC2 SLA agreement:

“Monthly Uptime Percentage” is calculated by subtracting from 100% the percentage of minutes during the month in which Amazon EC2 or Amazon EBS, as applicable, was in the state of “Region Unavailable.”

Even Amazon’s guaranteed limit of 21.56 minutes of downtime per month is not always achieved. Over the course of the last few years there have been several outages that lasted much longer. Additionally, your service’s availability can be impacted by other third-party services used in your product and, even more likely, by simple human error on the part of your engineering team.

To combat and reduce the likelihood of a customer-impacting event or performance degradation, we recommend several deployment patterns that will help:

  • Calls made across regions should be made asynchronously. This reduces latency and the likelihood of a failure.
  • A given service should be deployed completely within multiple Availability Zones.
  • Use abstraction layers with AWS managed services so that your product is not tied to those services (see the sketch after this list). Today these managed services are at different stages of maturity, and an abstraction layer will allow for a migration to more robust solutions later.
  • Services, or microservices depending on your architecture, should each have a dedicated data store. Allowing app server pools to communicate with multiple data stores reduces availability.
  • To protect against region failures, deploy all services needed for a product into a second region and run active/active, designating a home region for each user.
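
To illustrate the abstraction-layer point above, here is a hedged Python sketch that hides SQS behind a small queue interface. The class names and queue URL are made up for this example; only boto3’s standard SQS calls are assumed. Swapping providers later then means writing one new implementation rather than touching product code.

```python
# Hedged sketch of an abstraction layer over a managed queue service.
# The queue URL and class names are hypothetical; only boto3's standard
# SQS calls (send_message / receive_message / delete_message) are assumed.
from abc import ABC, abstractmethod

import boto3


class MessageQueue(ABC):
    """Product code depends on this interface, not on any one provider."""

    @abstractmethod
    def publish(self, body: str) -> None: ...

    @abstractmethod
    def poll(self, max_messages: int = 10) -> list: ...


class SqsQueue(MessageQueue):
    """AWS-backed implementation; swappable for RabbitMQ, Pub/Sub, etc."""

    def __init__(self, queue_url: str):
        self._sqs = boto3.client("sqs")
        self._url = queue_url

    def publish(self, body: str) -> None:
        self._sqs.send_message(QueueUrl=self._url, MessageBody=body)

    def poll(self, max_messages: int = 10) -> list:
        resp = self._sqs.receive_message(
            QueueUrl=self._url, MaxNumberOfMessages=max_messages
        )
        messages = resp.get("Messages", [])
        for m in messages:
            self._sqs.delete_message(
                QueueUrl=self._url, ReceiptHandle=m["ReceiptHandle"]
            )
        return [m["Body"] for m in messages]
```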

We have come across clients who have invested in migrating from one cloud provider to another because of what seemed like infrastructure outages in a particular region. Before making such an investment of time and money, make sure you are not the root cause of these outages because of the way your product is architected. If you are concerned about your architecture or about deploying your product in the cloud, contact us and we can provide you with some options.



Three Strategies for Migrating to the Cloud

Cloud deployments are becoming increasingly popular across all industries. Despite initial concerns around vendor lock-in and the need for increased security in multi-tenant environments, many companies have found a strong business case for migration. Overall, the economics aren’t surprising. Cloud providers can leverage huge economies of scale, discounts on power consumption, and demand diversification to drive down costs, which are then passed on to a fast-growing customer base.

As a technology leader, you’ve done your own research, and you’re confident that a cloud deployment can both save money and promote faster innovation amongst your development team. The next challenge is to develop a clear migration plan; however, you’ll quickly learn that no two tech companies share the same technical and business realities, and no single cloud migration strategy fits every circumstance.

We’ve laid out three common scenarios that you might consider for your product. Each describes a different business case and an accompanying cloud migration pattern.

Service-by-Service Migration

In the first scenario you already have a well-architected solution running in a private (company-owned or managed-host) datacenter. Your main production application is separated into different services and unburdened by large amounts of technical debt. Here a service-by-service transition into the cloud is the clear choice for a smooth transition: it carries nowhere near the risk of a “big-bang” approach and doesn’t trigger a request for proposal (RFP) event for customers. If you make your customers live through a massive migration, they will often take the opportunity to look around for alternative services – thus the RFP.

You’ll want to start with the smallest or least heavily coupled service first. The goal is to build your team’s confidence and experience with cloud deployments and tools before tackling larger challenges.

In most cases your engineers will likely need to get their hands dirty with a little re-architecting. (Depending on how closely you’ve followed our recommendations regarding fault isolation zones or “swim lanes”, many of these steps may have already been completed.) Your team’s first goal should be to rewrite (or eliminate) any cross-service calls to be asynchronous. Once this is complete, except in rare cases, virtualizing the web & app tiers should be a straightforward process. Often, the greater challenge is disentangling the persistent data-tier. If you already have a separate data store for each service, it should be easy to complete service migration by virtualizing or importing the data to a PaaS solution (RDS, Azure’s SQL, etc.). If that’s not the case, you’ll want to either:

(1) Separate the service’s data (choosing an appropriate SQL or NoSQL technology). Sharding the persistent data-tier has the added benefit of reducing load on the primary datastore, reducing size and I/OPS requirements, thereby further easing the transition to the cloud.

or

(2) Eliminate the service’s need for data storage entirely by passing data back to another service. (Less desirable, but suitable for smaller services.)

Once you’ve deployed your first cloud service, redirect a small portion of the customer base to it and monitor for correct functionality. As monitoring builds confidence in the service’s operation, you can continue a slow rollout over your entire customer base, minimizing both risk and operational downtime. Repeat this process for each service to complete the migration.
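
One simple way to implement that gradual redirect is deterministic, percentage-based routing keyed on the customer ID, as in the sketch below.

```python
# Hedged sketch: deterministic, percentage-based routing keyed on customer ID.
# URLs and the rollout percentage are assumptions; in practice this logic often
# lives in a load balancer or feature-flag system rather than application code.
import hashlib

CLOUD_ROLLOUT_PERCENT = 5  # start small, raise it as monitoring builds confidence
LEGACY_URL = "https://legacy.example.internal/service"
CLOUD_URL = "https://cloud.example.internal/service"


def endpoint_for(customer_id: str) -> str:
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket 0-99 for each customer
    return CLOUD_URL if bucket < CLOUD_ROLLOUT_PERCENT else LEGACY_URL


# The same customer always lands in the same bucket, so their experience stays
# consistent while the rollout percentage is gradually increased.
print(endpoint_for("customer-1234"))
```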

Down Market MVP

Another common scenario is that your company runs an older mature product, but one that is loaded with technical debt. Perhaps it’s poorly architected, written using older frameworks or languages, or contains undocumented or “black box” code that even your seasoned engineers are afraid to touch. The sheer complexity screams re-write, but at the same time there’s a pressing business appetite to move into the cloud quickly.

In this case, a possible strategy is to enter the cloud with a down-market, minimum viable product (MVP). Have your team start from scratch and develop an MVP in the cloud that encompasses the core features of your main offering. By targeting the down-market, you’re not only entering quickly (with a minimal feature set) but you’re also blocking other upstart competitors from overtaking you.

During development, you’ll need to balance the business demands of time-to-market against the technical risk of vendor lock-in. Using PaaS offerings such as Amazon’s SQS or Azure’s DocumentDB can help get your new product up and running quickly, but at the expense of greater vendor lock-in. There’s no silver bullet with these decisions; just be sure to make these choices consciously and recognize that you’re accumulating what could become tech debt should you decide to change cloud providers down the road.

Once you’ve established this MVP with a small customer base, continue to articulate it with feature additions, possibly in a tiered pricing model (e.g. standard, professional, premium). As the new cloud product matures, development efforts should slowly pivot: developers move from the older legacy product to help further articulate features on the new one. At the same time, as the new cloud product matures and gains more articulated features, customers will be increasingly incentivized to migrate to the new platform.

Private to Hybrid to Public

In this scenario your company has large capital infrastructure investments that you can’t immediately walk away from. Perhaps you have one or more privately owned data centers, a long-term contract with a managed host provider, or expensive new servers purchased during a recent tech refresh. Cloud migration is a long-term goal for eventual cost savings, but you need to realize the value of infrastructure that has already been purchased.

The pattern in this case is a crawl-walk-run strategy from private, to hybrid, to public cloud. The initial focus will be on changing your internal processes to run your datacenter(s) as your own mini-cloud. By adopting technology and process changes such as virtualization, resource pooling, automated provisioning (and possibly even chargeback accounting), you’re readying your organization for an eventual transition to the public cloud.

Once you’ve established your own private cloud, you’ll want to leverage a public cloud provider for your data backups and DR/business continuity plan. This is where you’re likely to see your first reduction in CAPEX. Quarterly DR drills will help you test the viability and compatibility of your systems with your cloud provider.

As demand from your customer base increases, you’ll avoid making further infrastructure investments. Instead, you’ll want to adopt a hybrid cloud strategy, moving a handful of customers to the cloud or bursting into the cloud to accommodate usage spikes.

Finally, as servers are progressively retired or when your managed provider contracts run out, you’ll be ready to complete a smooth migration fully into the public cloud.

Conclusion

While each of these patterns is vastly different, they all share one thing in common — there are no “big bang” moments, no high-risk events that’ll have your stakeholders nail-biting on the edge of their seats. The best technologists are always looking to meet business objectives while reducing risk.

The business and technical realities for every company are different and it’s likely you’ll need to combine one or more of these migration patterns in your own strategy. While we’re confident that many of our customers will benefit from migrating to the cloud, it’s important to tailor your strategy to the needs of your business. We continue to develop new migration patterns that minimize risk and generate business value for our clients.



Guest Post: Three Mistakes in Scaling Non-Relational Databases

The following guest post comes from another long-time friend and business associate, Chris LaLonde. Chris was with us through our eBay/PayPal journey and joined us again at both Quigo and Bullhorn. After Bullhorn, Chris and co-founders Erik Beebe and Kenny Gorman (also eBay and Quigo alums) started Object Rocket to solve what they saw as a serious deficiency in cloud infrastructure – the database. Specifically, they created the first high-speed MongoDB instance offered as a service. Without further ado, here’s Chris’ post – thanks, Chris!

Three Mistakes in Scaling Non-Relational Databases

In the last few years I’ve been lucky enough to work with a number of high profile customers who use and abuse non-relational databases (mostly MongoDB) and I’ve seen the same problems repeated frequently enough to identify them as patterns. I thought it might be helpful to highlight those issues at a high level and talk about some of the solutions I’ve seen be successful.

Scale Direction:

At first everyone tries to scale things up instead of out. Sadly, that almost always stops working at some point, so generally speaking you have two alternatives:

  • Split your database – yes, it’s a pain, but everyone gets there eventually. You probably already know some of the data you can split out: users, billing, etc. Starting early will make your life much simpler and your application more scalable down the road. However, the number of people who don’t consider that they’ll ever need a 2nd or a 3rd database is frightening. Oh, and one other thing: put your analytics someplace else; the fewer things placing load on your production database from the beginning, the better. Copying data off of a secondary is cheap.
  • Scale out – Can you offload heavy reads or writes? Most non-relational databases have horizontal scaling functionality built in (e.g. sharding in MongoDB; see the sketch after this list). Don’t let all the negative articles fool you, these solutions do work. However, you’ll want an expert on your side to help in picking the right variable or key to scale against (e.g. shard keys in MongoDB). Seek advice early, as these decisions will have a huge impact on future scalability.
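
For the MongoDB example above, turning on sharding is only a couple of administrative commands; picking the shard key is the part that deserves expert review. Here is a hedged pymongo sketch, with the database, collection, and key names invented for illustration and a hashed key used simply to spread writes evenly.

```python
# Hedged sketch: enable horizontal scaling (sharding) in MongoDB via pymongo,
# run against the cluster's mongos router. Names below are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-router:27017")

# Allow the database to be sharded, then shard one collection on a hashed key
# so that writes are distributed evenly across shards.
client.admin.command("enableSharding", "appdb")
client.admin.command("shardCollection", "appdb.users", key={"user_id": "hashed"})
```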

Pick the right tool:

Non-relational databases are very different by design, and most “suffer” from some of the “flaws” of the eventually consistent model. My grandpa always said “use the right tool for the right job”; in this case that means if you’re writing an application or product that requires your data to be highly consistent, you probably shouldn’t use a non-relational database. You can make most modern non-relational databases more consistent with configuration and management changes, but almost all lack any form of ACID compliance. Luckily, in the modern world databases are cheap; pick several, get someone to manage them for you, and always use the right tool for the right job. When you need consistency, use an ACID-compliant database; when you need raw speed, use an in-memory data store; and when you need flexibility, use a non-relational database.

Write and choose good code:

Unfortunately, not all database drivers are created equal. While I understand that you should write code in the language you’re strongest in, sometimes it pays to be flexible. There’s a reason why Facebook writes code in PHP, transpiles it to C++, and then compiles it into a binary for production. In the case of your database, the driver is _the_ interface to your database, so if the interface is unreliable, difficult to deal with, or not updated frequently, you’re going to have a lot of bad days. If the interface is stable, well documented, and frequently updated, you’ll have a lot more time to work on features instead of fixes. Make sure to take a look at the communities around each driver; look at the number of bugs reported and how quickly those bugs are being fixed. Another thing about connecting to your database: please remember nothing is perfect, so write code as if it’s going to fail. At some point in time some component – the network, the NIC, a load balancer, a switch, or the database itself crashing – is going to cause your application to _not_ be able to talk to your database. I can’t tell you how many times I’ve talked to or heard of a developer assuming that “the database is always up, it’s not my responsibility,” and that’s exactly the wrong assumption. Until the application knows that the data is written, assume that it isn’t. Always assume that there’s going to be a problem until you get an “I’ve written the data” confirmation from the database; to assume otherwise is asking for data loss.
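
To make the “assume it isn’t written” advice concrete, here is a hedged pymongo sketch combining a majority write concern with a bounded retry; the connection string, database, and collection names are placeholders.

```python
# Hedged sketch: a majority write concern plus a bounded retry, so the
# application only treats data as written once the database has acknowledged
# it. Connection string, database, and collection names are placeholders.
from pymongo import MongoClient
from pymongo.errors import PyMongoError
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://db-host:27017", retryWrites=True)
orders = client.appdb.get_collection(
    "orders", write_concern=WriteConcern(w="majority", wtimeout=5000)
)


def save_order(order: dict, attempts: int = 3) -> bool:
    """Return True only after the database acknowledges the write."""
    for _ in range(attempts):
        try:
            result = orders.insert_one(order)
            return result.acknowledged
        except PyMongoError:
            continue  # network blip, failover, etc. -- try again
    # In a real system, supply a client-generated _id (idempotency key) so a
    # retry can never double-insert, and surface the failure to the caller.
    return False  # caller must treat the data as NOT written
```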

These are just a few quick pointers to help guide folks in the right direction. As always the right answer is to get advice from an actual expert about your specific situation.



Definition of MVP

We often use the term minimum viable product, or MVP, but do we all agree on what it means? In the Scrum spirit of the Definition of Done, I believe the Definition of MVP is worth stating explicitly within your tech team. A quick search revealed these three similar yet different definitions:

  • A minimum viable product (MVP) is a development technique in which a new product or website is developed with sufficient features to satisfy early adopters. The final, complete set of features is only designed and developed after considering feedback from the product’s initial users. Source: Techopedia
  • In product development, the minimum viable product (MVP) is the product with the highest return on investment versus risk…A minimum viable product has just those core features that allow the product to be deployed, and no more. Source: Wikipedia
  • When Eric Ries used the term for the first time he described it as: A Minimum Viable Product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort. Source: Leanstack

I personally like a combination of these definitions. I would choose something along the lines of:

A minimum viable product (MVP) has sufficient features to solve a problem that exists for customers and provides the greatest return on investment by reducing the risk of what we don’t know about our customers.

Just like no two teams implement Agile the same way, we don’t all have to agree on the definition of MVP, but all of your team members should. Otherwise, what is an MVP to one person is a full-featured product to another. Take a few minutes to discuss this with your cross-functional agile team and come to a decision on your Definition of MVP.



Guest Post – Effective CTOs

The following guest post comes from long-time friend and former colleague Michael “Skippy” Wilson. Michael was one of eBay’s first employees and its first CTO. He was responsible first for taking “eBay 1.0” (Pierre Omidyar’s Perl scripts) and converting them into the modern site written in C with which most people are familiar. He and his architecture team subsequently redesigned the site with multiple service-oriented splits (circa 2000) to enable it to scale to what was then one of the largest transaction sites on the internet. There was no “playbook” back then for how to scale something as large as eBay. Michael was one of the early CTOs who had to earn his internet PhD from the school of hard knocks.

Effective CTOs

The excellent blog post This is How Effective CTOs Embrace Change talks about how Twilio CTO Evan Cooke views the evolution of a CTO through a company’s growth.

While this article does an excellent job of identifying the root causes and problems a CTO can run into – especially working with his fellow C-level colleagues – I think there is another problem it does not address.

Ironically, one of a CTO’s biggest challenges is actually technology itself and how the CTO manages the company’s relationship to it. These challenges manifest themselves in the following ways:

Keeping appropriate pace with technological change

When a company is young, it’s agile enough to adopt the latest technologies quickly; in fact many companies change technologies several times over as they grow.

Later, when (as the article says) the CTO starts saying “No” to their business partners, they may also react to the company’s need for stability and predictability by saying “No” too often to technological change. It only takes one good outage or release slippage, triggered by some new technology, for the newly minted CTO to go from being the “Yes” man to the “No” man, sacrificing agility at the altar of stability and predictability.

The same CTO who advocated changing languages, database back ends, and even operating systems and hardware platforms while the company was growing finds himself saying “No” to even the most innocuous of upgrades. While the initial result may be more stability and predictability, the end result is far from that: the company’s platform ossifies, becomes brittle and is eventually unable to adapt to the changing needs of the organization.

For example, years ago, before the domination of open-source Unix variants like OpenBSD and Linux, many companies “grew up” on proprietary systems like Solaris, Microsoft, and AIX, alongside database counterparts like Oracle and Sybase. While they were “growing up,” open-source systems matured, offering cost, technology, and even stability benefits over the proprietary solutions. But in many cases these companies couldn’t take advantage of this because of the perceived “instability” of those crazy “Open Source” operating systems. The end result was that competitors who could remain agile gained a competitive advantage because they could advance their technology.

Another (glaring) example was the advent of mobile. Most established companies completely failed to get on the mobile bandwagon soon enough and often ended up ceding their positions to younger, more agile competitors because they failed to recognize and keep up with the shift in technology.

The problem is a lot like Real Life ™. How do you continue to have the fun you had in your 20s and 30s later on, when things like career and family take center stage? Ironically, the answer is the same: Moderation and Separation.

Moderation means just what it says: use common-sense release and deployment planning to introduce change at a predictable – and recoverable – rate. Changing to a new database, a new hardware platform, or even a new back-end OS isn’t a big change at all, as long as you find a way to do it in an incremental and recoverable manner. In other words, you can get out of it before anyone notices something went wrong.

Separation means you build into the organization the ability to experiment and advance all the time. While your front-end production systems may advance at a mature pace, you still need to maintain Alpha and Beta systems where new technologies can be advanced, experimented with, and exposed to (willing) end users. By building this into your organization as a “first class” citizen, the CTO keeps the spirit of agility and technological advancement alive, while still meeting the needs of stability and predictability.

Making sure Technology keeps pace with the Organization

The best way to describe this problem is through example: Lots of today’s start-ups are built on what are called “NoSQL” database platforms. NoSQL databases have the advantages of being very performant, in some cases very scalable, and, depending on who you ask, very developer friendly. It’s very clear that lots of companies wouldn’t be where they are without the likes of MongoDB and Couchbase.

But, as many companies have found out, not all NoSQL solutions are created equal, and the solution selected in the company’s early years may not be appropriate as the company grows and its needs change.

For example, as the organization matures and parts of it start asking for reports, they may find that while their NoSQL solution worked well for OLTP, it doesn’t work as well for OLAP or data warehousing needs. Or a NoSQL solution that worked well for warehousing data doesn’t work as well when you give your customers the ability to query it online, in an OLTP-like fashion.

This can occur when the CTO isn’t able to help guide the company’s technology forward fast enough to keep pace with the changing organizational needs. Unfortunately, if the problem reaches the “critical” stage because, for example, the organization can’t get reports, the solution may become a politically charged hack job instead of a well-planned solution.

Every organization – engineering, database administration, operations, etc. – will want to “solve” the problem as quickly as possible, so while the end solution may solve the immediate problem, it probably won’t be the one you’d choose if you’d planned ahead.

The solution here is for the CTO to continuously engage with the Company to understand and anticipate its upcoming technology needs, and engage with Engineering organizations to plan for them. It’s actually better for the CTO to arrange to engage the stakeholders and engineering organizations together rather than serially: it encourages the stakeholders to work together instead of through the CTO.

Making sure Technology keeps pace with Security requirements

Today’s engineers have an amazing community of Open Source Developers and Tools to draw on to get their work done.

And when a company is growing, tools like NPM, grunt, brew, and Hadoop are amazing. In many cases, they enabled development to move exponentially more quickly than it could have otherwise.

Unfortunately, many of these tools are gigantic security risks. When an engineer types “grunt install” (a node-related command) or “brew update,” do they really know exactly what it is doing? Do they know where it’s getting the update? Do they know if the updates can be trusted?

They don’t. Let’s not even go into the horrors they’re inadvertently inviting behind your firewall.

The problem for the CTO is that they probably had a hand in introducing these tools and behaviors to the organization, and will now be faced with reining them in. “Fixing” them will probably make them harder for engineers to use, and feel like a hassle instead of a common-sense security measure.

The CTO can expect the same situation when charged with guiding the organization towards things like always-on HTTPS, better encryption-in-place of customer data, etc. Not only will they have to convince their engineering organizations it’s important, they may also have to convince the rest of the organization that these measures deserve a place in line along with other product features.

One solution to this is a backward-looking one: security should be part of the product and company culture from Day 1 – not just protecting the company’s assets from unauthorized access, but protecting the customer’s assets (data) just as vigilantly. A culture like that would recognize the two examples above and would have either solved them already or be willing to solve them quickly and easily, without politics getting in the way.

If the CTO’s organization wasn’t built that way, their job becomes much harder, since they now need to change the company’s culture. Fortunately, recent events like the Sony hack and Snowden’s revelations make this much easier than it was 2 years, or even a year ago.

Conclusion

The effective CTO faces at least two challenges: becoming an effective part of the company’s executive team (not an “outside advisor”), and an effective part of the company’s technology and product teams. The CTO can’t forget that their job is to keep the company on the appropriate forefront of changing technology – far enough ahead to keep pace, but not so far ahead as to be “bleeding edge.” Almost universally, the way to do that is to make technological change part of the company’s culture – not a special or one-off event.

–Michael Wilson



Just for Fun – The Many Meanings of “I Don’t Disagree”

OK.  Who was the knucklehead who thought up the ridiculous corporatism of “I don’t disagree”?

Seriously.  What does this mouth turd even mean?

Does it mean that you simply don’t want to commit to a position?  Do you need more time to determine a position?  Is your ego too large to admit that someone else is right?  Do you really disagree but are too spineless to say it?  Were you not listening and needed to spout some statement to save face?

We owe it to each other to speak clearly, to engage in meaningful dialogue about important topics, and to convey our thoughts and positions.  When political correctness overrides fiduciary responsibility, we destroy stakeholder value.  Get rid of this piece of mouth trash from your repertoire.

