I had the opportunity to speak with Ori Lahav, Outbrain’s co-founder and CTO, about his experience scaling Outbrain to handle 1.5 billion page views in just three years. Outbrain is a content recommendation service that provides for blogs and articles what Netflix does for videos or Amazon does for products.
Ori oversees the company’s R&D center, located in the heart of Israel’s technology hub. Prior to founding Outbrain, Ori led the R&D groups at Shopping.com (acquired by eBay) in search and classification. Before joining the internet revolution, Ori led the video streaming Server Group at Vsoft.
Below are my notes from the interview. Here is the link to the audio version but the quality is fairly poor.
What is your background and how did you come to start Outbrain?
Outbrain was founded by Yaron Galai and me (we are friends from the Navy). Before that I spent 3.5 years at Shopping.com, which was acquired by eBay, leading software groups. I liked the challenges of a start-up, and joining forces with Yaron was a great opportunity.
How has Outbrain grown over the past couple of years?
We had some false starts in several directions, but when we started with recommendations on content sites it caught fire. We started with self-serve blogs, then professional blogs, small publishers, national publishers, and now we are international. In 3 years we grew from 0 to over 1.5B page views, from 3 production servers to over 150, from 0 system administrators to 4, and from 3 developers to 15 today.
How has your architecture changed over this period?
Surprisingly… not much. At first it was 2 application servers and a MySQL database. Then we started adding more application servers and replicating MySQL. As the product grew, we added more and more components, including memcached, Solr, TokyoTyrant, and Cassandra. We recently added a layer that warms the caches and is notified by ActiveMQ.
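To make the cache-warming idea concrete, here is a minimal sketch in Python. It is not Outbrain's code: a plain dict stands in for memcached, `queue.Queue` stands in for an ActiveMQ destination, and the names `publish_change`, `warm_cache`, and `serve` are hypothetical. The point is only that a change notification lets the warmer refresh the cache before the next request arrives.

```python
import queue

# Hypothetical stand-ins: a dict plays the database, another dict plays
# memcached, and queue.Queue plays the message broker (ActiveMQ in the text).
database = {"doc:1": "recs for doc 1", "doc:2": "recs for doc 2"}
cache = {}
notifications = queue.Queue()


def publish_change(key):
    """Announce that the data behind `key` changed (hypothetical hook)."""
    notifications.put(key)


def warm_cache():
    """Drain pending notifications and refresh the matching cache entries."""
    warmed = []
    while True:
        try:
            key = notifications.get_nowait()
        except queue.Empty:
            break
        cache[key] = database[key]
        warmed.append(key)
    return warmed


def serve(key):
    """Serving path: try the cache first, fall back to the database on a miss."""
    if key in cache:
        return cache[key], "hit"
    value = database[key]
    cache[key] = value
    return value, "miss"
```

In a real deployment the warmer would run as its own consumer process, so the serving path rarely pays for a cache miss.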
On the backend, we have machines that are fetching content from the sites we are on. We gather article text and images and save them to MogileFS where we have indexers that index them in Solr.
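The fetch, store, and index flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual pipeline: a dict stands in for MogileFS, a toy inverted index stands in for Solr, and `store_article`, `index_article`, and `search` are hypothetical names.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins: blob_store plays MogileFS, inverted_index plays Solr.
blob_store = {}
inverted_index = defaultdict(set)


def store_article(url, text):
    """Fetcher side: save the fetched article text under its URL."""
    blob_store[url] = text
    return url


def index_article(key):
    """Indexer side: read the stored text and index each distinct term."""
    text = blob_store[key]
    for term in set(re.findall(r"[a-z0-9]+", text.lower())):
        inverted_index[term].add(key)


def search(term):
    """Return the keys of all stored articles that contain the term."""
    return sorted(inverted_index.get(term.lower(), set()))
```

Keeping storage and indexing as separate steps means the fetchers and the indexers can be scaled independently.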
We started investing in reporting infrastructure, beginning with MySQL, but with the amount of data we have we soon understood that Hadoop is the way to go. We now have a Hadoop cluster of more than 10 nodes.
What is Outbrain’s view of scalability?
First – scalability is culture – if you think big – you will be big. The regular rules of scalability apply here too:
- No singles
- Scale out instead of up
- Replicate data to ease the load
- Shard data to scale
- Utilize commodity hardware
Specifically for Outbrain I would add:
- open-source is crucial for scalability
- be cost conscious
- build many simple/small environments and not a single big one
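As an illustration of the "shard data to scale" and "replicate data to ease the load" rules, here is a small Python sketch. It is a toy under stated assumptions, not Outbrain's implementation: the `ShardedStore` class is hypothetical, each replica is an in-memory dict, and writes are copied to every replica synchronously for simplicity.

```python
import hashlib
import itertools


class ShardedStore:
    """Toy key-value store: keys are hashed to a shard, reads rotate over replicas."""

    def __init__(self, num_shards, replicas_per_shard):
        # Each shard is a list of replica dicts holding the same data.
        self.shards = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]
        # One round-robin counter per shard spreads reads across its replicas.
        self._next_replica = [
            itertools.cycle(range(replicas_per_shard)) for _ in range(num_shards)
        ]

    def shard_for(self, key):
        """Map a key to a shard index with a stable hash."""
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        """Write to every replica of the owning shard."""
        for replica in self.shards[self.shard_for(key)]:
            replica[key] = value

    def get(self, key):
        """Read from the next replica of the owning shard."""
        shard = self.shard_for(key)
        replica = self.shards[shard][next(self._next_replica[shard])]
        return replica.get(key)
```

Sharding bounds how much data and write traffic any one machine handles, while replication spreads the read load. Plain modulo hashing is the simplest routing choice, but it remaps most keys whenever the shard count changes.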
How have you managed data centers to be so cost effective?
We simplify it! We create many small datacenters and not a big one. We have no upfront costs of network gear (stacking) and real-estate (Cages). We use open-source for our network (load balancer and firewalls) and infrastructure (OS).
To ease the step function of cost, grow as you go. Our data center provider, Atlantic Metro, has helped by offering metered power instead of charging by the circuit. This way we’re motivated to power down servers that are not needed.
What has been the most difficult scalability challenge?
Scaling the team… we have more of a patrol boat DNA than an aircraft carrier DNA. Technology challenges are fun and never too hard, but the team and process are much more challenging.
We have also started using the Continuous Deployment process, and this has been a great help. By empowering the team members to act and change the system as they need, you can grow to be an aircraft carrier and still keep the maneuverability of a patrol boat.
What do you think of the NoSQL movement?
We are big fans. We use Hadoop, Hive, Cassandra, TokyoTyrant, MySQL, and others. This helps us maintain our low serving cost and attract the type of talent we need on the team.
Outbrain is proud to be 100% open-source. We use open-source from the office telephony system to all platforms and network infrastructure.
And… we are hiring top technology talent in Israel, so contact Outbrain if you are interested.