December 14, 2017 | Posted By: Marty Abbott
The Law that Almost Wasn’t
Conway’s law had a rather precarious beginning. Harvard Business Review rejected Conway’s thesis, buried as it was in the 43d paragraph of a 45-paragraph paper, on the grounds that he had not proven it.
But Mel had a PhD in Mathematics (from Case Western Reserve University – Go Spartans!), and like most PhDs he was accustomed to journal rejections. Mel resubmitted the paper to Datamation, a well-respected IT journal of the time, and his paper “How Do Committees Invent” was published in 1968.
It wasn’t until 1975, however, that the moniker “Conway’s Law” came to be. Fred Brooks both coined the term and popularized Conway’s thesis in his first edition of the Mythical Man Month. It has since been one of the most widely cited, important but nevertheless incorrectly understood and applied notions in the domain of product development.
Cliff’s Notes to “How Do Committees Invent” (the article in which the law resides)
Conway’s thesis, in his words:
… organizations which design systems (in the broad sense used here) are constrained to produce designs which are copies of the communication structures of these organizations.
Conway calls this self-similarity between organizations and designs homomorphism. Preamble to the thesis helps explain the breadth and depth:
… the very act of organizing a design team means that certain design decisions have already been made, explicitly or otherwise
Every time a delegation is made … the class of design alternatives which can be effectively pursued is also narrowed.
Because the design which occurs first is almost never the best possible, the prevailing system concept may need to change. Therefore, flexibility of organization is important to effective design.
Specifically, each individual must have at most one superior and at most approximately seven subordinates
Examples. A contract research organization had eight people who were to produce a COBOL and an ALGOL compiler. After some initial estimates of difficulty and time, five people were assigned to the COBOL job and three to the ALGOL job. The resulting COBOL compiler ran in five phases, the ALG0L compiler ran in three.
There are 4 very important points, and one very good example, in the quotes above:
1) Organizations and design/architecture and intrinsically linked. The organization affects and constrains the architecture - the opposite is not true.
2) Depth of an organization negatively effects design flexibility. The deeper the hierarchy of an organization, the less flexible (or alternatively more constrained) the resulting architecture.
3) We will make mistakes and must organize to quickly fix these.
4) Team size should always be small – which also has an implication to the size of the solution part a team can own (think Amazon’s re-branding of this point of the “2 Pizza Team” (author’s side note – read Scalability Rules for how this came about).
Important corollaries to Conway’s law suggest that if either an organization or a design change, without a corresponding change to the other, the product will be at risk.
Common Failures in Application of Conway’s Law and How to Fix Them
There are five very common failures in organization and architecture within our clients, the first four of which relate directly to Conway’s points above:
1) Organizations and architectures designed separately. Given the homomorphism that Conway describes, you simply CANNOT do this.
2) Deep, hierarchical organizations. Again – this will constrain design.
3) Lack of flexibility. Companies tend to plan for success. Instead, assume failure, learning, and adaptation (think “discovery” and “Agile” instead of “requirements” and “Waterfall”).
4) Large teams. Forget about these. Small teams, each owning a service or services that the team can support in isolation.
There is a fifth violation that is harder to see in Conway’s paper. Too often, our clients don’t build properly experienced teams around the solutions they deploy. Success in low-overhead organizations requires that teams be cross functional. Whatever a team needs to be successful should be within that team. If you deploy on your own hardware, you should have hardware experience. If you need DBA talent, the team should have direct access to that talent. QA folks should be embedded within the team, etc. Product managers or owners should also be embedded in the team. This creates our fifth failure:
5) Functional teams. Don’t build teams around “a skill” – build them around the breadth of skills necessary to accomplish the task handed to the team.
Conway’s Parting Shot and Food for Thought
Noodle on this: Conway identified a problem early in the life of a new domain. Yet what was true in Conway’s time as a contributor to the art is still true today, over 50 years after his first attempt to forewarn us:
Probably the greatest single common factor behind many poorly designed systems now in existence has been the availability of a design organization in need of work.
Like this article? Share it with friends here, and subscribe to the newsletter here.
AKF Partners helps companies ensure that their organizations and architectures are aligned to the outcomes they desire. We help companies develop better, more highly available and more highly scalable products with faster time to market and lower cost. Give us a call or shoot us an email. We’d love to help you achieve the success you desire.
Reach out to AKF
November 8, 2017 | Posted By: Bill Armelin
In 1965, psychologist Bruce Tuckman published his theory of group dynamics. This theory describes the stages (or phases) through which a team progresses enroute to optimal productivity. While generally useful for any organization, and prescriptive as to what leaders should do when to boost performance, it has profound impacts to Agile development practices and how we build organizations around these Agile practices.
The first stage is forming. This is where the team first comes together. Here, the individuals are trying to get to know each other. They tend to be polite and cordial, but they do not fully trust each other.
In this stage, the team productivity and team conflict are low. The team spends time agreeing to what the team is supposed to do. This lack of agreement of the team’s purpose can cause members to miss goals because they are individually targeting different things. Team members rely on patterned behavior and look to the team leader for guidance and direction. The team members want to be accepted by the group. Cautious behavior on the part of the team starts to depress overall team outcomes. Good leadership, emphasizing goals and outcomes is important to set the stage for future team behaviors and outcomes.
Once the team’s goals are clear, they move into the next stage, storming. Here, the team starts to develop a plan to achieve the goal and defines what to do and who does it. Friction starts to occur as members propose different ideas. Trust within the team remains low and affective conflict rises as people vie for control. Cliques can form. Productivity drops even lower than in the first stage.
Once the team agrees on the plan and the roles and responsibilities, it can move to the next stage. Without agreement, the team can get stuck. Symptoms include poor coordination, people doing the wrong things and missing deadlines, to name a few. Good leadership here focuses on fast affective conflict resolution, and serves to help reinforce team goals and outcomes in order to quickly move to more productive phases.
Once team members agree to the plan and understand their roles, they enter the norming phase. Affective conflict goes down, cognitive (beneficial) conflict and trust increase. The team focuses on how to get things done and productivity begins to increase. The team develops “norms” about how to work together and collaborate. A lack of these norms can cause issues such as low quality and missed deadlines.
Leadership within the team becomes clear and cliques dissolve. Members begin to identify with one another and the level of trust in their personal relationships contributes to the development of group cohesion. The team begins to experience a sense of group belonging and a feeling of relief from resolving interpersonal conflicts. Team identity starts to take hold and innovation and creativity within the team increases. The members feel an openness and cohesion on both a personal and task level. They feel good about being part of the team.
The final stage, preforming, is not achieved by all teams. This stage is marked by an interdependence in personal relations and problem solving within the realm of the team’s tasks. Team members share a common goal, understand the plan to achieve it, know their roles and how to work together. The team is firing on all cylinders. At this point, the team is highly productive and collaborates well. They are trusting of each other and “have each other’s back.” Healthy conflict is encouraged. There is unity: group identity is complete, group morale is high, and group loyalty is intense.
Not all teams get to this phase. They can get stuck in a previous phase or slide back into them from a higher phase. Leadership that focuses on affective conflict resolution, team identity creation, a compelling vision and goals to achieve that vision is critical to reaching the Performing phase. It is usually not easy for teams to quickly progress through these stages, and it often takes 6 months or more for a team to reach the Performing phase.
Impact to Agile Development
We often see companies make the mistake of coalescing teams around initiatives. Sometimes called “virtual teams” or “matrixed teams”, these teams suffer the underperforming phases of Tuckman’s curve repeatedly, especially when these initiatives are of durations shorter than 6 months. But even with durations of a year, six months of that time is spent getting the team to an optimum level of performance.
Tuckman’s analysis indicates that teams should be together for no less than a year (giving a 6 month return on a 6-month investment) and ideally for about 3 years. The upper limit being informed by the research on group think and its implications to creativity, performance and innovation within teams. Teams then should become semi-permanent and we should seek to move work to teams rather than form teams around work. To be successful here, we need multi-disciplinary teams capable of handling all the work they may get assigned. Further, the team needs to be familiar and “own” the outcomes associated with the solution (or architectural components) with which they work. More on that in future articles discussing Conway’s Law and Empathy Groups.
AKF Partners helps companies understand and apply the extant theory around organizational development in order to turbo-charge engineering performance. Wondering if your engineering productivity decreases as you grow your engineering and product teams? We can help you fix that and get your productivity back to the level it was as a startup!
Reach out to AKF
October 24, 2017 | Posted By: Marty Abbott
North Korea’s recent antics involving ballistic missiles and nuclear weapons are scary. While we seem to be edging ever closer to nuclear war – closer perhaps than any time since the Cuban Missile Crisis – the probability of such an occurrence remains relatively low. Even an apparently irrational head of state such as Kim Jong Un must understand that the use of a nuclear device will turn nearly the entire world against him. The use of a device against any nation would end his reign in relatively short order and end the People’s Republic of North Korea as we know it today. This then begs the question of why Kim Jong Un would participate in such brinkmanship? Many politicians and strategists seem to think it is a strategy to force other nations to recognize the PRNK and reduce the onerous sanctions currently levied against it by the United Nations. Perhaps, but maybe in addition to or instead, Jong Un is trying to take our eyes off the war he has been waging for many years: a cyber war against many nations.
Both cyber warfare on the part of a nation state and cyber terrorism waged by stateless entities aim to attack our economic infrastructure. Both North Korea and terrorists understand that attacking our economy, our businesses and our personal wealth are the most effective methods of causing harm to our nations and their citizens. North Korea is likely behind many recent attacks on financial institutions, has ties to the WannaCry ransomware outbreak, was behind the attack on Sony pictures and was involved in a heist of $81M from the Bangladesh Central Bank. Each of these were likely perpetrated by the formidable PRNK cyber warfare group “Unit 180”.
When not engaging in direct attacks to steal money from or otherwise harm business operations, both terrorists and nation states seek to use the products of a company for nefarious purposes. Recent examples include ISIS using eBay’s marketplace to funnel money to an operative in the US, and Russia purchasing advertising on Facebook in an attempt to influence the US election. Cyber warfare and terrorism are not just threats– they are daily occurrences. The foregoing examples illustrate how the game has changed. The question for you is - has your company changed enough to successfully protect itself against this growing and evolving threat?
The answer for most companies with which we work is “No”. Security organizations seem oblivious to the changing cyber threat. They continue to focus almost exclusively on barrier protection systems and cyber response processes. Few companies outside of the financial sector have developed analytics systems to help identify emerging threats and nefarious activity. Fewer still practice aggressive “patrolling” to identify threats outside of the perimeter of their digital operations. Here are a few questions to help you evaluate whether your company has the mindset necessary to be successful in the world of cyberwarfare and terrorism:
Who means you harm and how do they intend to perpetrate it?
Military veterans know that a successful defense requires more than just “Alamo’ing Up” behind a wall and hunkering down. You must patrol and reconnoiter the surrounding area to understand whence the enemy will come, in what numbers and with what capabilities. If your security team isn’t actively attempting to identify threats outside of your organization - and by this I mean beyond your walls - you are most certainly going to be surprised.
How do you find new and emerging behavior within your product and operations?
Given the threat of using your product for nefarious purposes, how do you identify when new and behaviors or trends emerge? What analytics systems do you have to identify that existing personas or users are acting in new or odd ways? How do you keep an eye on new patterns or trends of usage by both existing and new users? In very high transaction environments, how do you identify the less than 1 basis point of activity that may be nefarious in nature buried within 99.99% of valid transactions? These questions aren’t likely to be answered by a “traditional” security team – they require teams with deep analytic skills and systems dedicated to analytics and machine learning. Similarly, traditional analytics teams may not have the right mindset to seek out nefarious transactions.
Do you have the right people?
This is the most important question of all. You don’t need to fire your CSSP folks – you still have a need for them within your security team. But you also want folks with a proven record of being able to think like and use the tools of cyber criminals, terrorists, and warfare focused nation-states. These folks are unlikely to be willing to wear suits and ties to work, preferring instead to wear shorts and Birkenstocks. The traditional corporate mindset and tools will stand in the way of them being successful on your behalf. They need to use TOR browsers and have access to sites to which you are unlikely to want the remainder of your employees going. The biggest barrier to success here with most companies is fit with a company’s culture – but I can guarantee you that if you don’t have some of these folks on staff you are not going to be successful in this new era of cyber warfare.
How do you fare against the above questions? Are you properly set up to defend your company and your shareholders against the today’s cyber threat? If you are uncertain, reach out to AKF Partners – we’ll evaluate your security infrastructure and approach and help ensure that you can properly defend yourself against the growing threat.
October 15, 2017 | Posted By: Marty Abbott
I’m a huge Malcolm Gladwell fan. Gladwell’s ability to convey complex concepts and virtually incomprehensible academic research in easily understood prose is second to none within his field of journalism. A perfect example of his skill is on display in the Tipping Point, where Gladwell wrestles the topic of Complexity Theory (aka Chaos Theory) into submission, making it accessible to all of us. In the Tipping Point, Gladwell also introduces us to The Broken Windows Theory.
The Broken Windows Theory gets its name from a 1982 The Atlantic Monthly article. This article asked the reader to imagine a building with a few broken windows. The authors claim that the existence of these windows invite vandals to break still more windows. A continuous cycle of expanding vandalism ensues, with squatters moving in, nearby buildings getting vandalized, etc. Subsequent authors expanded upon the theory, claiming that the presence of vandalism invites other crimes and that crime rates soar in communities where unhandled vandalism is present. A corollary to the Broken Windows Theory is that cities can reduce crime rates by focusing law enforcement on petty crimes. Several high profile examples seem to illustrate the power and correctness of this theory, such as New York Mayor Giuliani’s “Zero Tolerance Program”. The program focused on vandalism, public drinking, public urination, and subway fare evasion. Crime rates dropped over a 10 year period, corresponding with the initiation of the program. Several other cities and other experiments showed similar effects. Proof that the hypothesis underpinning the theory is correct.
Not So Fast…
Enter the self-described “Rogue Economist” Stephen Levitt and his co-author Stephen Dubner - both of Freakonomics fame. While the two authors don’t deny that the Broken Windows theory may explain some drop in crime, they do cast significant doubt on the approach as the primary explanation for crime rates dropping. Crime rates dropped nationally during the same 10 year period in which New York pursued its Zero Tolerance Program. This national drop in crime occurred in cities that both practiced Broken Windows and those that did not. Further, crime rate dropped irrespective of either an increase or decrease in police spending. The explanation therefore, argue the authors, cannot primarily be Broken Windows. The most likely explanation and most highly correlated variable is a reduction in a pool of potential criminals. Roe v. Wade legalized abortion, and as a result there was a significant decrease in the number of unwanted children, a disproportionately high percentage of whom would grow up to be criminals.
Gladwell isn’t therefore incorrect in proffering Broken Windows as an explanation for reduction in crime. But the explanation is not the best one available and as a result it holds residence somewhere between misleading (worst case) and incomplete (best case).
To be fair, it’s hard to hold Gladwell accountable for this oversight. Gladwell is not a scientist and therefore not trained in how to scientifically evaluate the research he reported. Furthermore, his is an oft repeated mistake even among highly trained researchers. And what exactly is that mistake? The mistake made here is illustrated by the difference in approach between the Broken Windows researchers and the Freakonomics authors. The Broken Windows researchers started with something like the following question “Does the presence of vandalism invite additional vandalism and escalating crime?” Levitt and Dubner first asked the question “What variables appear to explain the rate of crime?”
Broken Windows started with a question focused on deductive analysis. Deduction starts with a hypothesis - “Evidence of vandalism and/or other petty crimes invites similar and more egregious crimes”. The process continues to attempt to confirm or disprove the hypothesis. Deduction starts with a broad and abstract view of the data – a generalization or hypothesis as to relationships – and attempts to move to show specific relationships between data elements. The Broken Windows folks started with a hypothesis, developed a series of experiments to test the hypothesis and then ultimately evaluated time series data in cities with various Broken Windows approaches to policing. What they lacked was a broad question that may have developed a range of options indicating possible causes.
The Freakonomics authors started with an inductive question. Induction is the process of moving from specific observations about data into generalizations. These generalizations are often in the form of hypothesis or models as to how data interacts. Induction helps to inform what questions should be asked of the data. Induction is the asking of “What change in what independent variables seem to correspond with a resulting change in some dependent variable?” Whereas deduction works from independent variable to dependent variable, induction attempts to work backwards from dependent variable to identify independent variable relationships.
The jump to deduction, without forming the right questions and hypotheses through induction, is the biggest mistake we see in developing Big Data programs and implementing Big Data solutions. We all approach problems with unique experiences and unique biases. The combination of these often cause us to race to hypotheses and want to test them. The issue here is two-fold. The best case is that we develop an incomplete (and as a result partially or mostly incorrect) answer similar to that of The Broken Windows researchers. The worst case is that we suffer what statisticians call a Type 1 error – confirming an incorrect answer. The probability of type 1 errors increases when we don’t look for alternative or better answers for outcomes within our data sets. Induction helps to uncover those alternative or supporting explanations. Exploring the data to discover potential relationships helps us to ask the right questions and form better hypotheses and better models. Skipping induction makes it highly probable that we will get an incorrect, misleading or substandard answer.
But it is not enough to simply ensure that we practice both induction and deduction. We must also recognize that the solutions that support induction are different from those that support deduction. Further, we must understand that the two processes while complimentary can actually interfere with each other when performed on the same system. Induction is necessarily a very broad and as a result slow and tedious process. Deduction, on the other hand, needs significantly less data and “prefers” to be faster in implementation. Inductive systems are best supported by solutions that impose very few relations or structure on the data we observe. Systems that support deduction, in order to allow for faster response times, impose increased structure relative to inductive systems. While the two phases of discovery (Induction and Deduction) support each other, their differences suggest that they should be performed on solutions purpose built to their specific needs.
Similarly, not everyone is equally qualified to perform both induction and deduction. Our experience is that the folks who tend to be good at determining how to prove relationships between variables are often not as good at identifying patterns and vice versa.
These two observations, that the systems that support induction and deduction should be separated and that the people performing these tasks may need to be different, have ramifications to how we develop our analytics systems and organize our Big Data teams. We’ll discuss these ramifications and more in our next post, “10 Anti-Patterns within Big Data”.
September 19, 2017 | Posted By: Greg Fennewald
Everyone was saddened to see the horrific destruction storms caused to Houston and Florida, including deaths and extensive property damage. It seems reasonable that the impact of these hurricanes was lessened by advanced notice and preparation – stockpiling supplies, evacuating the highest risk areas, and staging response resources to assist with recovery and rebuilding.
Data centers operate every day with a similar preparation mindset: diesel generators to provide power should the utility fail, batteries to keep servers running during a transition, potentially stored water or a well to replace municipal water service for cooling systems, and food and water for personnel unable to leave the location.
What happens when a “prepared” location such as a data center encounters a hurricane with strong winds, heavy rain, and extensive flooding? In some cases, the data center survives without impact, although there certainly will be outages and failures. Examples of data centers surviving Harvey in good shape can be seen here, while accounts of the service impacts caused by Hurricane Sandy can be seen here.
Data Center Points of Failure
Let’s examine what may enable a data center to survive without functional impact. Extensive risk investigation goes into site selection for data centers. Data centers are expensive to build with costs measured in the tens or even hundreds of millions of dollars. The potential business impact of a failure can be costly with liquidated damage clauses in hosting contracts. These factors lead to data centers being located outside of flood plains, away from hazardous material routes, and stoutly constructed to endure storm winds likely in the region.
Losing utility power is regarded as a “when” not an “if” in the data center industry (be that an outage or a planned maintenance activity), and diesel generators are a common solution, often with 24 hours or more of fuel on hand and multiple replenishment contracts. Data centers can survive for days/weeks without utility power, and in some cases for months. How could flooding impact power? The service entrance for a data center, where the utility power is routed, is often buried underground. Utility power is likely to be lost during flooding, either from damage due to flooding or intentional actions to prevent damage by shutting down the local grid. A data center would operate on generator if the data center itself is not flooded, although fuel replenishment is not likely. If there are two feet of water in the main electrical room(s), the data center is going dark.
Many large data centers rely on evaporating water to cool the servers it hosts. Evaporative cooling is generally more energy efficient than other options, but introduces an additional risk to operations – water supply. In many locations, municipal water pressure is lost during an extend power outage. Data centers can mitigate this risk by using water storage tanks or water wells onsite. Like diesel generators, the data centers can operate normally for hours or days without municipal water. A data center should be outside the flood plain, able to operate without utility power or municipal water for hours or days, is structurally strong enough to handle the winds of a major storm – is there any other risk to mitigate? Network connectivity and bandwidth.
Most data centers need to communicate with other data centers to fulfill their OLAP or OLTP purpose. Without connectivity, services are not available. Data should be fine, but it is becoming increasingly stale. Transactions and traffic are done. Like utility power, network connections are usually buried. With distance and geographic limitations involved, network pathways may get flooded as may the facilities that aggregate and transmit the data. Telecom facilities generally have generators and other availability measures, but can be forced into less advantageous locations and may have a shorter runtime standard than a data center.
Data centers that are serious about availability generally have carrier diversity and physical pathway diversity to mitigate carrier outages and “backhoe fades”. This may help in the event of widespread flooding as well. The reality is a data center without connectivity is generally useless. All the risk mitigation going into structural design, power and cooling redundancy, and fire protection is moot if connectivity fails.
Preparing for the Inevitable
The best way to mitigate these risks is to not rely on a single data center location. One is none and two is one. Owned, colo, managed hosting, or cloud – be able to survive the loss of a single location. The RTO and RPO of the business will guide the choice of active – active, hot – cold, or data backup with an elastic compute response plan. Hurricanes can cause regional impact, such as Irma disrupting most of Florida. In years past, many companies decided to have two data center within 20 miles of each other to support synchronous data base replication. A primary site in one borough of New York City, and the DR site in a different borough. Replication options and data base management techniques have advanced sufficiently to allow far greater dispersion today. Avoid a regionally impacting event by choosing data centers in diverse regions.
Operating from 3 locations can be cheaper than 2, and can also improve customer satisfaction with reduced response times produced by serving customers from the nearest location. See Rule 12 in Scalability Rules. The ability to operate from multiple locations also enables a choice to adjust the redundancy of those locations. A combination of Tier II and III locations may be a more economical choice than a pair of Tier IV locations.
Developing a hosting plan can be complicated and frustrating, particularly since the core competency of your business is likely not data centers. AKF Partners can help – not only with hosting strategy, but also the product architecture and operational processes needed to weld infrastructure, architecture, and process into a seamless vehicle that delivers services to your clients with availability the market demands.
Hurricanes aren’t the only disasters that can take down your data center. Solar flares, runaway SUVs, civil disruption, tornadoes and localized power outages have all caused data centers to fail. Natural disasters of all types trail equipment failures and human error as causes of service impacting events (source: 365DataCenters). According to FEMA, 40% of businesses that close due to a disaster don’t reopen, and of those that do only 29% are in business two years after the disaster (source: FEMA). Don’t be a statistic. AKF Partners can help you with the product architecture and data center planning necessary to survive nearly any disaster.
Reach out to AKF
September 5, 2017 | Posted By: Roger Andelin
Last month, a bot developed by OpenAI (co-founded by Elon Musk) beat the world’s best, pro Dota 2 players. This is another milestone accomplishment in the field of artificial intelligence and machine learning and more fuel for the fire of concerns surrounding the AI debate. However, before we jump into that debate, here is some background you should understand about the technology fueling this debate.
The Evolution of Traditional Programming
A lot of what computer programming is can be simplified into three steps. First step, read in some data. Second step, do something with that data. Third step, output some result.
For example, imagine you want to fly somewhere for the weekend. You may first go to your travel app and input some dates, times, number of people traveling, airports, etc. Second, the app uses that information to search its database of available flights. Third, it returns a list of available flights for you to see.
This approach to software design has been the norm since the earliest days of programming. Artificial intelligence, in particular machine learning, has changed that approach. The first step is still the same: Read in some data. The third step is the same: Output some result.
However, with artificial intelligence technologies like machine learning, the second step, doing something with the data, is very different. In the example of finding a flight, a programmer easily can read the software code to understand the sequence of steps the computer has been programmed to do to produce the output data. If the programmer wants to change or improve the program’s behavior she can do that by writing new code or by altering the existing code. For example, if you wanted to compare the prices for available flights near the dates you have selected, a programmer can easily change several lines of code in the program to do just that. The programming code identifies every step the computer takes to arrive at its output. Said another way, the program only does what it’s specifically told to do in the code, nothing more or nothing less.
By contrast, the output of today’s most common machine learning programs is not determined by instructions written in computer code. There is no code for a programmer to read or modify when a change is desired. The output is determined by the program’s neural network.
Neural Networks in Action
What is a neural network? At the core of a neural network is a neuron. Similar to a traditional computer program, a neuron takes some input data, does a mathematical calculation on that data and then outputs some data. A typical neuron in a neural network will receive as input hundreds to thousands of numbers, typically between 0 and 1. A neuron will then multiply each number by a weight and sum the result of all the numbers. Many neurons will then convert the result into a number between 0 and 1. That result is then sent to the next neuron in sequence until the final output neuron is reach.
Here is an example of the math a typical neuron will do: If “x1, x2, x3…” represents input data and “w1, w2, w3…” represent the weights stored in the neuron, the calculation done by the neuron in a neural network looks like this: x1*w1 + x2*w2 + x3*w3 and so forth.
You can think of the calculation inside the neuron in a different way: The neuron is reading in a bunch of numbers and the weights in the neuron determine the importance, “or weight” of that input in producing the output. If the input is not important the weight for the input will be near zero and the input is not passed along to the next neuron. Therefore, the weights in a neuron effectively decided what input is valuable and what input should be ignored.
In a neural network, neurons like the one I described above are connected in parallel and in series to create a matrix of neurons. The input data to a neural network will go into hundreds or thousands of neurons in parallel, all with different weights. The output of those neurons is then sent to another layer of neurons and so forth, usually multiple layers deep. This is called a deep neural network. Another way to look at this is the neurons are grouped into a matrix of rows and columns, all interconnected. The final layer of the neural network is the output layer. Therefore, the final output of a neural network is the result of millions of calculations done by the neurons of that network.
When a programmer creates a neural network in software, the weights for each neuron are initially just random numbers. In other words, the weights arbitrarily decide to either diminish, increase or leave the input data alone, and output from the network is random. However, through a process called training, the weights move from randomly assigned values to values that can produce very useful outputs.
Training is both a time consuming and complicated mathematical process. However, it is much like training you and I would do to get better at something. For example, let’s say I wanted to learn how to shoot and arrow with a bow. I might pick up the bow and arrow, point it at the target, pull back the string and release. In my case, I know the arrow would miss the target. Therefore, I would try again and again making corrections to my aim based on on how far and which direction I was off from the target.
During the training process for a neural network, the weights in each of the neurons are changed slightly to improve the output, or “aim.” The most common approach for making those changes is called backpropagation. Backpropagation is a mathematical approach for applying corrections to every weight in every neuron of the network. During training, input is fed into the network and output is generated. The output is compared to the desired target and the difference between the output and the target is the error. Using the error, backpropagation makes changes to the weights in each neuron to reduce the error. If all goes well during training and backpropagation, the output error diminishes until it reaches expert or better than expert level.
AI vs Humans
In the case of the OpenAI Dota bot that recently beat the world’s best Dota 2 player, the outputs, which were a sequence of steps, strategies and decisions, went from random moves to moves that were so good the bot was able to easily defeat the best pro players in the world. The critical information that enabled the bot to win its matches was stored in the weights of the neurons and the neural network architecture itself.
A good question at this point is to ask if a programmer looking at the Dota 2 bot’s neural network could understand the steps taken by the bot to beat the human player. The answer is no. A programmer can see areas of the neural network that influence an output but it is not possible to explain why the bot took specific steps to formulate its moves and strategies. All the programmer would see is a huge matrix of weights that would be quite overwhelming to interpret.
Another good question to ask is whether or not a program written traditionally by a programmer with step by step instructions could beat the best Dota 2 player. The answer is no. Step by step programs where the programmer specifically instructs the computer to do something would easily be defeated by a professional player. However, a neural network can learn from training things that a programmer would never have the knowledge to program, store that learning in its neurons and use that learning to do things like defeat a human pro.
What makes the Dota 2 bot special is that it learned to beat the best pro players by playing against itself whereas most machine learning programs learn from training on data given to it by a programmer. In machine learning, good training data is like gold. It’s scarce and valuable. (note: This is one reason why Google and other big tech companies want to collect so much data.) Data is used to train neural networks to do useful things like recognizing people and places in your pictures or recognizing your voice from others in your family. OpenAI built a bot that learned almost entirely by playing against itself with the exception of some coaching provided by the OpenAI team. OpenAI has shown clearly that learning can occur without having tons of training data. It’s a little like being able to make gold.
Does the development of the OpenAI dota bot mean bots can now decide to train against themselves and become super bots? No. But it does say that humans can now program two bots to train against each other to become superbots. The key enabler being us. It’s anyone’s guess what type of bot can be imagined and developed in this way, useful or harmful. Obviously to most, a gaming superbot seems pretty innocuous, except of course to the gamer who may unexpectedly run into one during a match. However, it’s not hard to imagine super bots that are not so harmless. Or, perhaps you can just imagine a time when someone trains a bot to play football against itself until the bot becomes better at calling plays and strategy than every coach in the NFL. What happens then? The answer is disruption. Are you ready for it?
AKF Partners recommends that boards and executives direct their teams to identify sources of innovation and patterns of disruption that AI techniques may represent within their respective markets Walmart is already working on facial recognition technology in their stores to determine whether or not shoppers are satisfied at checkout. Will this give them a potential advantage over Amazon? How can machine learning and AI help you prevent fraud in your payment systems or the use of your commerce system to launder money?
AKF is prepared to help answer that question and others you may be facing. We will help you craft your AI strategy, sort through the hype, help you find the opportunities, and identify the potential threats of AI technology to your business.
Reach out to AKF
August 9, 2017 | Posted By: Marty Abbott
We have a saying in AKF Partners that “an incident is a terrible thing to waste”. When things go poorly in a firm, stakeholders (shareholders, partners, employees) pay a price. Having already paid a price, the firm must maximize the learning opportunity the incident presents. Google wasted such a learning opportunity by failing to capitalize on an incredible teaching moment with the termination of James Damore (the author of the sometimes called “Anti-Diversity Manifesto”). While Google seems to have “done the right thing” by firing Damore, it is unclear that they “did it for the right reason”. The “right reason” here is that diversity is valuable to a company because it increases innovation and in so doing increases the probability of success. Further, diversity is hard to achieve, takes great effort and can easily be derailed with very little effort. Companies simply cannot allow employees to work at odds with incredibly valuable diversity initiatives.
Diversity Drives Innovation and Success
My doctoral dissertation journey introduced me to diversity and its beneficial effects on innovation, time to market, and success within technology product firms. Put simply, teams that are intentionally organized to highlight both inherent (traits with which we are born) and acquired (traits we gain from experience) diversity achieve higher levels of innovation. Research published in the Harvard Business Review confirms this, indicating that diverse teams out innovate and out-perform other teams. Diverse teams are more likely to understand the broad base of needs of the market and clients they support. Companies with very diverse management teams are 35% more likely to have financial returns above the mean for their industry. Firms with women on their board on average have a higher ROE and net income than those that do not.
Differences in perspective and skills are things we should all strive to have in our teams. As we point out in The Art of Scalability, these differences increase beneficial cognitive conflict. Increases in cognitive conflict opens a range of strategic possibilities that in turn engender higher levels of success for the firm.
We have for too long allowed the struggle for diversity to be waged on the battleground of “fairness”. The problem with “fair” is that what is “fair’ to one person may seem inherently unfair to another. “Fair” is subjective and “fair” is too often political. “Success” on the other hand is objective and easily measured. Let’s move this fight to where it belongs and embrace diversity because it drives innovation and success. After all, anyone who can’t get behind winning, doesn’t deserve to be on a winning team.
Achieving Diversity is Hard
While the value of diversity is high, the cost to achieve it is also unfortunately high – especially within software teams. As my colleague Robin McGlothin recently wrote, the percentage of computer science degrees awarded to women over the last 25 years is declining. Most other minorities are similarly underrepresented in the field relative to their corresponding representation in the US population.
As in any market with high demand and low supply, companies need to find innovative ways to attract, grow and retain talent. These activities may include special mentoring programs, training programs, or scholarships at local universities meant to attract the group in question. These approaches may seem “unfair” to some, but they are in truth capitalism at its best - the application of market forces to solve a supply and demand problem. When a skill or trait is under high demand and short supply, the cost for that skill goes up. The extra activities above are nothing more than an increased cost to attract and retain the skills we value.
Companies desiring to achieve success in innovation through diversity MUST approach it in a steely, single-minded fashion. Any dissent as it relates to outcomes detracts from the probability of success. How many people with diverse backgrounds will leave or have left Google because of Damore’s missive? How many candidates won’t accept offers? Losing even one great candidate is an unacceptable additional cost given the already high cost to achieve success.
The Bottom Line
Structuring organizations and building cultures that tap the power of inherent and acquired diversity pays huge dividends for firms in terms of innovation, time to market, ROE and net income. While the rewards are high, the cost to achieve these benefits are also high. Success requires a steely, single-minded pursuit of diversity excellence.
The successful company will allow no dissent on this topic, as dissent makes the firm less attractive to the ideal candidate. Given a constrained supply under high demand, the candidate can and should go to the most welcoming environment available.
Put simply, Google did the right thing in firing Damore. But they failed to fully capitalize on the unfortunate event. The right answer, when asked about the reason for firing, would look something like this: “We recognize that diversity in experiences, background, gender and race drives higher levels of innovation and greater levels of success. Our culture will not tolerate employees who are not aligned with creating stakeholder value.”
Interested in driving innovation and time to market in your product and engineering teams? AKF Partners helps companies create experientially diverse product teams aligned with business outcomes to help turbo-charge performance and team innovation.
August 1, 2017 | Posted By: Dave Swenson
We all suffer from various cognitive biases, those mental filters or lenses that alter or warp the reality around us. With the election of 2016, one particular bias has gained widespread attention - the Dunning-Kruger Effect. Defined in wikipedia as:
“...a cognitive bias, wherein persons of low ability suffer from illusory superiority when they mistakenly assess their cognitive ability as greater than it is.”
(If you’ve ever wondered about the behind-the-scenes process of creating Wikipedia content, look at this entertaining discussion.)
In 1999, while a professor at Cornell, David Dunning joined Justin Kruger to co-author a paper titled “Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments”, based on studies indicating that people who are incompetent in an area are typically too incompetent to know they are incompetent. Or, simply put, we are often in a position where we don’t know what we don’t know, and therefore cannot judge our level of expertise in a particular area.
This effect or bias, is also known as the ‘Lake Wobegon effect’, or ‘illusory superiority’, and is closely tied to the Peter Principle. Donald Rumsfeld put it this way:
There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns — the ones we don’t know we don’t know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones.
And in John Cleese’s words, stupid people do not have the capability to realize how stupid they are.
The story of how Dunning came to posit the D-K Effect is an amusing one. He read about an unusual bank robbery that occurred in Pittsburgh. What was unusual was that the robber, McArthur Wheeler, made absolutely no effort to disguise himself, and in fact, looked and smiled directly into the security cameras. Yet, he was surprised to quickly be arrested, telling authorities “...but I used the juice!”.
Wheeler told the police that they couldn’t arrest him based on the security videos, as wearing lemon juice, he was of course invisible. He had been told coating your face with lemon juice makes you invisible to cameras, perhaps similar to using lemon juice for invisible ink. Wheeler had even gone as far as to test the theory by taking a Polaroid picture of himself after coating his face with the lemon juice, and sure enough, his face didn’t appear in the print. The police never were able to explain this, but likely Wheeler was as incompetent at photography as he was at burglary. Clearly, Wheeler was too incompetent at burglary to know he was incompetent.
So, does Dunning-Kruger exist in the technology world? Absolutely…
Just as a typical driver believes their driving skills are Formula 1 worthy, until they’re on a track getting blown past by an inferior car driven by someone who has far better braking and cornering skills, we all tend to underestimate what is possible. We live in our own bubbles and are comparing our abilities only against those who also reside in our bubbles. Therefore, we don’t know what we don’t know - we don’t know there are far better drivers outside our bubbles.
You may think your organization is at the peak of efficiency, until you bring someone in from a Google, Facebook, Amazon, etc. who reveals what the true peak really can be - what fully Agile processes and cultures can do to reduce time to market, how effective SREs and DevOps can be, how to remain innovative, what continuous delivery can do to, etc.
AKF firmly believes in “Experiential Diversity” to cross-pollinate teams, injecting new DNA into a company or bubble that was grown in a different bubble. We see numerous companies with very static personnel, where the average employee tenure is over 15 years. There have been tremendous changes in the technology world in 15 years, and while reading a book or attending a conference on new processes brings some exposure to the latest and greatest, it isn’t enough. It is incredibly important to continually bring new blood into an organization, and to purposely tap into that diversity of processes, technologies, organizational structures that comes with the new blood.
Other techniques to mitigate the effect of D-K in technology, of eliminating our personal and organizational biases include:
- 360 degree reviews - Dunning himself has said “The road to self-insight runs through other people”. What better way to get feedback than from periodic 360 degree reviews?
- Code reviews - The likelihood that some percentage of your developers suffer from D-K means that you’re dependent upon code reviews to flush out their incompetence. Just make sure you’re not pairing up two D-K developers to perform the review!
- Planning Poker - requiring, in true Agile fashion, a team to estimate a task or project reduces the chance of that D-K estimate from torpedoing your development planning.
- Soliciting advice - the increasing utilization of open source software means there isn’t a vendor, with hopefully solid expertise, to turn to for advice. Instead of solely relying upon your own developer who only knows how to spell say Cassandra, leverage the appropriate OSS community. Just beware that you might not know whether that solicited advice is good advice.
- Proper interviewing - Ensure your interviewing process can weed out “confident idiots”. Consider planting bogus questions to gauge a candidate’s reactions, like Jimmy Kimmel’s “Lie Witness News”. At a minimum, require team interviewing and consensus for new candidates.
In short, Dunning-Kruger is as rampant within the Technology sector as it is anywhere else, if not even more so. Expect it to be present in your organization, and guard against it. Look at it within yourself as well. Who amongst us hasn’t experienced the shock of discovering we’ve failed a test that we actually thought we’d aced? We all have suffered at one time or another from the Dunning-Kruger effect.
July 19, 2017 | Posted By: Robin McGlothin
We hear every day that more and more jobs are disappearing, yet the technology job sector cannot keep up with the unprecedented demand. So why are women falling behind in this growing career track?
When we look at the percentage of STEM bachelor’s degrees awarded to female students for the last two decades, based on NSF statistics, we find there are no gender difference in the bio sciences, the social sciences, or mathematics, and not much of a difference in the physical sciences. Great news for women scientists. The only STEM fields in which men genuinely outnumber women are computer science and engineering. What? Why the stagnant numbers in computer science?
At the PhD. level, women have clearly achieved equity in the bio sciences and social sciences, are nearly there (40 percent) in mathematics and the physical sciences, and are “over-represented” in psychology (78 percent). More good news. Again, the only fields in which men greatly outnumber women are computer science and engineering. Why no growth?
As I started my research for this blog post, I was pleasantly surprised to find women scientist representation growing in almost all aspects of STEM. And at the same time, disheartened to find my major, computer sciences, is stagnate in growth for women over the past two decades.
What’s different in the computer science & engineering aspects of STEM that seem to hold women back? There are many conflicting reports on how our environment and upbringing are sublimely programming women away from engineering and mathematics. We were told from an early age, math and science are for boys.
My mother was a pioneer and a strong female leader. She holds a PHD in Biochemistry, served as President for Academic Affairs and Provost at Salem International University. She demanded her daughters rise to any challenge and deliver to the best of our abilities. Never once did I doubt I had amazing talents and just needed to get busy using them. So, is it nature or nurture that helped me stay with STEM? Maybe a little of both.
I saw an article recently in the WSJ on Salesforce.com, where CEO Mark Benioff, is focused on ensuring women are represented fairly at every level in his company. Taking proactive steps like SFDC.com, to open doors for women, rings truer to me then the “poor little girl” theories on how to increase female participation in computer science and engineering.
The cloud-computing giant is two years into a companywide “women’s surge” in which managers must consider women when filling open positions at every level. They are also examining salaries for every role in the company to ensure women and men are paid equally. And finally, ensuring that women make up at least 30% of attendees at management summits or onstage roles at keynote presentations.
With some nurturing at home during early years of development and progress in the corporate landscape leveling the playing field, I believe we are finally set to see an upward trajectory for the last two laggard categories in STEM.
Future women engineers can see a world where their hard work and discipline will pay off, a road-map to success if you will. We no longer need to break through the old stereotypes, running faster and jumping higher to be considered half as good as our male counterparts. Instead, there will be fair and equal opportunity for career advancement for women engineers and computer scientists.
I would submit some of the best technology leaders today are women. My personal experience afforded me the opportunity to work with several top female technology executives. One of the best leaders I worked for is a power house that broke all the stereotypes, and worked circles around her male counterparts. As I look back and try to understand what propelled these successful women, they all possess some classic traits that are needed in any leadership role.
Collaboration. Women are skilled collaborators, able to work with all different people. This is an important quality for any professionals, as cross-departmental collaboration is key. Technology impacts every function in modern business, and those most successful will be able to collaborate with all different teams and individuals.
Communication. For many of the same reasons, technologist must also be strong communicators. Communication is an area where many women traditionally excel and it’s an important quality to have. For example, communicating with the sales department may be different from communicating with the IT department. Good technology leaders will be able to speak to everyone.
Perspective. Being able to inspire a team and see the big picture are both equally important. A technology leader must be able to not only collect and analyze data but draw meaningful insights and understand what it means for the company. The ability to holistically view a situation is a competitive differentiation for organizations as well as a positive attribute that many women possess.
In the past, women had to fight a little harder to push through the barriers that have prevented women from entering STEM, but the tide is turning. In today’s new business paradigm, with a strong technology sector jobs forecast, it’s a perfect time for young women to enter computer science and engineering field.
And to help drive this point home, President Donald Trump signed two laws that authorize NASA and the National Science Foundation to encourage women and girls to get into STEM fields. The Inspire Act directs NASA to promote STEM fields to women and girls, and encourage women to pursue careers in aerospace. The law gives NASA three months to present two congressional committees with its plans for getting staff—think astronauts, scientists and engineers—in front of girls studying STEM in elementary and secondary schools. The full name of the law is the Inspiring the Next Space Pioneers, Innovators, Researchers, and Explorers Women Act. The second law is the Promoting Women in Entrepreneurship Act. It authorizes the National Science Foundation to support entrepreneurial programs aimed at women.
The stage has been set – go forth future astronauts, scientist, coder girls! Let’s rock the world.
July 6, 2017 | Posted By: AKF
AKF often recommends to our clients the adoption of business metric monitoring – the use of high-level user activity or transaction patterns that can often provide early warning of an incident. Business metric monitors will not tell you where or what the problem is, rather they tell you something appears to be abnormal and should be investigated. The early warning aspect can help reduce detection time and thus shorten overall MTTR.
At eBay, we had near real time graphs of user metrics such as bids, listings, logins, and new user registrations. The data was graphed week over week. Usage patterns throughout a day followed a readily identifiable pattern with peaks and valleys. These graphs were displayed in the network operations center, which was staffed 24x7. Deviations from the previous week’s pattern had proven useful, identifying issues such as ISP instability in the EU impacting customers trying to access eBay.
Everything seemed normal on a Wednesday evening – right up to the point that bids and listings both took a nose dive. The NOC quickly initiated the SEV1 process and technical resources checked their areas. The site had no identifiable faults, services were confirmed to be working fine, yet the user activity was still markedly lower. Roughly 20 minutes into the SEV1 process, the root cause was identified. The finale episode of American Idol was being broadcast. Our site was fine. Our customers had other things on their mind. The business metric monitors worked – they gave warning of an aberrant usage pattern.
The World Cup is the most popular football (soccer) event in the world, arguably the most popular sporting event worldwide. Broadcast matches draw huge audiences in the UK and the broadcast is typically aired without commercials until half time. There was a documentary on the UK electrical utility system preparing for a broadcast. As soon as half time commenced, a large proportion of the viewing audience visited the loo and hit the lever on their electric tea kettles. Thankfully, the documentary was about the electric utility and not sewage! The step function increase in load would cause significant problems for the utility, straining its ability to maintain voltage and frequency. The utility had prepared for this situation by staging “peakers” – diesel generators that can be brought online to help serve the increased load. Utility grid stability is akin to a Goldilocks Zone – too much is bad, too little is bad, just right is best. The operations center for the utility did not want to bring the generators on too early or too late. They needed real time information on their customers. The solution was to have a TV tuned to the World Cup broadcast in the operations center, enabling the engineers to stage on generators immediately prior to half time and stage them off as the load increase subsided. Being paid to watch the World Cup was certainly an unintended benefit!
How could your company react in a manner like the UK power utility? A sponsored event or viral campaign could overload your systems. Consider using elastic compute in the cloud for your peak demand – the equivalent to the diesel generators use for the World Cup. Scale up for the spikes in demand, then shut it down afterwards. Own the base, rent the peak. Use business metric monitors to detect workload shifts.
‹ First < 5 6 7 8 >