Aggregator Pattern Overview

The aggregator design pattern is a service that receives a request, makes requests to multiple downstream services, combines the results, and responds to the initiating request. The diagram pictorially displays this pattern:
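
As a complement to the diagram, the request flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fetch_profile` and `fetch_orders` are hypothetical child services (stubbed with a short sleep instead of real network calls), and `aggregate` plays the role of the aggregator.

```python
import asyncio

# Hypothetical child services; real ones would be network calls.
async def fetch_profile(user_id: int) -> dict:
    await asyncio.sleep(0.01)  # stand-in for network latency
    return {"name": f"user-{user_id}"}

async def fetch_orders(user_id: int) -> dict:
    await asyncio.sleep(0.01)  # stand-in for network latency
    return {"orders": [101, 102]}

async def aggregate(user_id: int) -> dict:
    """Receive one request, fan out to every child, and merge the results."""
    profile, orders = await asyncio.gather(
        fetch_profile(user_id),
        fetch_orders(user_id),
    )
    # If either child raises, gather propagates the error and the whole
    # aggregated response fails.
    return {**profile, **orders}

result = asyncio.run(aggregate(7))
print(result)  # {'name': 'user-7', 'orders': [101, 102]}
```

Note that the client makes one call while the aggregator makes two; the client-side savings are paid for with a dependency on every child.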

Benefits of the Aggregator Pattern

  • Reduces chattiness and communication overhead between the client and services, especially in the case of mobile clients that may be limited in bandwidth or number of connections.
  • Intellectually easy to understand and implement, which lets engineers deliver fast time-to-market solutions, although at the expense of customer-perceived availability.
  • Consolidates discrete functionality behind a single endpoint in a way that is architecturally easy to understand.

Drawbacks of the Aggregator Pattern

Put as simply as possible, your solution should avoid the aggregator pattern for a majority of your critical functionality. The pattern is an implementation of the AKF Microservice Anti-Pattern Fan-Out.

  • Reduces availability as a result of the multiplicative effect of failure (per the Fan-Out anti-pattern referenced above).
  • Likely increases response latency, also as a result of the anti-pattern above.
  • Increases the complexity (cost) of troubleshooting, which results in additional time to restore service and repair faults for many incidents.
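
The multiplicative effect in the first bullet can be made concrete with a short calculation. The sketch below assumes that failures are independent and that the aggregator needs every child to answer before it can respond; the 99.9% figures are illustrative numbers, not measurements.

```python
from math import prod

def composite_availability(aggregator: float, children: list[float]) -> float:
    """Availability of a synchronous aggregator that needs every child to respond.

    Assumes independent failures, so the availabilities simply multiply.
    """
    return aggregator * prod(children)

# Illustrative numbers only: an aggregator and three children, each at 99.9%.
combined = composite_availability(0.999, [0.999, 0.999, 0.999])
print(f"{combined:.4%}")  # 99.6006%
```

Under these assumptions, four components at 99.9% each yield about 99.6% for the aggregated call, roughly four times the expected downtime of any single component.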

When to Use the Aggregator Pattern

  • Low Scale Needs: Activities with a low transaction rate, low demand, and low expected growth in demand. These are candidates only when the cost of fault isolating the components is high relative to the availability gain that isolation would provide. Examples may include low-volume return material authorization (RMA) request services.
  • Low Business Value: Lower business value functions where the pain of outages caused by the fan-out anti-pattern is low. Examples may include company executive team bio pages or a company stock price/investor relations page.
  • Team Velocity: When a “combined” or monolithic service would require more than one team to implement it, and the pain from the increased rate of failure is low (per above). You may, for instance, decide to implement a set of new features under the aggregator pattern, knowing that you will need to rearchitect it if it is widely adopted later. An example may be a new series of recommendations based on your past purchases, the past purchases of others, and recently viewed items. Because all of them are new, you could call a common recommendation service. Once the services show value to the checkout process, you will likely want to split them up and make them separate calls.
  • Third Party Calls: This is a special exception to our advice not to use the aggregator pattern (or composite service). When Services B and C are remote, third-party calls, and the calls to Services B and C are asynchronous to offset the high failure rate of distant third-party calls, the Aggregator Pattern can be useful. Service A (the aggregator) is built with circuit breakers and becomes the third-party proxy for all communications. Ideally, calls to Service A from the remainder of the architecture are also asynchronous.

Considerations During Usage

  • Distance and Latency: When using an aggregator, colocate it (in the same data center or availability zone) with the services it aggregates. The one exception to this rule is when it is used to combine third-party calls. In this case, all interactions with the aggregator should be asynchronous.
  • Single Logical Point of Failure: Because the aggregator has a relatively high probability of failure, it needs to be more redundant than most solutions. Additional copies and segmentation of users along the Z axis of the AKF Scale Cube are appropriate ways to achieve higher levels of availability.
  • Install Bulkheads: Given the increased probability of failure, the aggregator and the services it aggregates must not speak to additional services. Bulkheads (fault isolation zones or swimlanes; see https://akfpartners.com/growth-blog/bulkhead-pattern) must be put in place between it and other services. Further segmentation along customer boundaries, as described above, is also appropriate, with each segment similarly bulkheaded from the others.
  • Install Circuit Breakers: Because failure or slowness in any service the aggregator calls will affect the aggregator and, as a result, the responses that depend on its other services, the connections to those services should be severable via a circuit breaker. Further, the aggregator should be able to continue operating when any of its “children” is absent.
  • Ideally Includes Asynchronous Communications: As indicated earlier, all communication with the aggregator service, whether from a client or from another service, is ideally asynchronous.
  • Real Time Monitoring: More than any other atomic service, the aggregator and its children must be monitored in both depth and breadth. Identifying a service interruption and its cause is critically important to restoring normal service operation.
  • Gateway vs Composite Implementation: When implemented as a reverse proxy, the Aggregator Pattern is called a Gateway Aggregator. Gateway Aggregators inherit all of the concerns of a normal (or “composite tier”) aggregator.
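
The circuit breaker consideration above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `CircuitBreaker` class, its thresholds, and the two child functions are hypothetical, and a production system would more likely rely on an established resilience library than on hand-rolled code.

```python
import time

class CircuitBreaker:
    """Minimal breaker: opens after `max_failures` consecutive errors and
    permits a trial call once `reset_after` seconds have elapsed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback  # circuit open: skip the child entirely
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # open (or re-open) the circuit
            return fallback
        self.failures = 0
        self.opened_at = None
        return result

def recommendations():  # hypothetical child that is currently failing
    raise ConnectionError("child unavailable")

def recently_viewed():  # hypothetical healthy child
    return ["item-1", "item-2"]

# One breaker per child, so a failing child is severed independently.
rec_breaker = CircuitBreaker(max_failures=2)
viewed_breaker = CircuitBreaker(max_failures=2)

def aggregate() -> dict:
    return {
        "recommendations": rec_breaker.call(recommendations, fallback=[]),
        "recently_viewed": viewed_breaker.call(recently_viewed, fallback=[]),
    }

responses = [aggregate() for _ in range(3)]
print(responses[-1])  # {'recommendations': [], 'recently_viewed': ['item-1', 'item-2']}
print(rec_breaker.opened_at is not None)  # True
```

The aggregator keeps answering with a degraded response while one child is down, and after two consecutive failures it stops calling that child until the cool-down expires.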

When to Avoid the Aggregator Pattern

  • High Scalability Needs: Aggregator services are difficult to scale because they have multiple dimensions. Is the scaling need at the point of aggregation, or within one of the aggregator’s children? Analyzing this is difficult at best, and most companies with aggregator services find themselves in a constant fight to find the component that is most critical to scale. If you need extreme scale, stay away from this pattern.
  • High Availability Needs: As we’ve indicated, aggregators always mathematically reduce the availability of a system. If a solution requires very high availability (e.g., checkout, search, or tax filing), you should run away from this pattern.
  • Solutions that Comprise Significant Business Value: Do not deploy this pattern for solutions that comprise a majority of value for customers and/or the business, as high failure rates resulting from the fan-out anti-pattern will cause incredibly high levels of pain and will likely result in customer churn.