AKF Partners

Abbott, Keeven & Fisher Partners – Partners In Hyper Growth


Newsletter – Trends

Below is our most recent newsletter. If you would like to subscribe and have it delivered to your inbox, you can do so here.

Trends
We have the privilege of meeting and talking with hundreds of high-tech, hyper-growth companies, which gives us a unique opportunity to see trends taking place in the industry. Here are a few that we thought would be interesting to share with our friends.

NoSQL Skills – Many companies are experimenting with NoSQL solutions including Membase, Hadoop, Cassandra, and many others, but are finding that employees with skills in these are hard to come by. Even in areas rich in talent such as Silicon Valley, Boston, Austin, Seattle and NY, the demand for this rather nascent skill set is higher than the current supply. Many companies are relying on on-the-job training for these types of open source solutions. Most importantly, if you’ve properly fault isolated your architecture, you can tolerate the risk associated with small segments going down during beta releases while you iron out the kinks.

Oracle’s NoSQL – Oracle recently announced their entry into NoSQL with announcements around Hadoop integration and their own NoSQL key-value solution. The Hadoop integration isn’t much more than a conduit for moving data between an Oracle database and a Hadoop cluster. This technology has been around for quite a while in other relational database solutions such as GreenPlum and Asterdata. Microsoft also recently announced that SQL Server would support this Hadoop connection.

Oracle’s NoSQL is a distributed, replicated key-value store and is new and exciting even though it has attributes similar to other product offerings including their acquisition, Berkeley DB. Given the NoSQL skills shortage described above, an enterprise-supported NoSQL product may sound like an interesting alternative to open source solutions, but beware. The marketing materials for Oracle’s NoSQL solution include the BASE acronym (Basically Available, Soft State, Eventually Consistent) but unlike other NoSQL products such as Dynamo, SimpleDB, or Cassandra, the Oracle NoSQL Database does not support eventual consistency. Oracle’s answer to eventual consistency appears to be to refuse writes when the primary node for that key is down.

ORM – Many companies start off using an object relational mapping solution such as Hibernate or Active Record, but we are seeing many of them having difficulty scaling with them. The solution for several companies has been to use ORMs for simple queries but resort to ODBC or their own Data Access Layer for handling more complex queries. Be wary of letting a tool generate your queries for you: we’ve had a number of clients whose incredibly complex and costly generated queries brought down their platforms for extended periods of time.
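
One common way tool-generated data access becomes costly is the "N+1" pattern, where a lazily loaded relationship issues one query per parent row. The sketch below is a hypothetical illustration using Python’s built-in sqlite3 module, not a reproduction of any client’s system. The authors/posts schema, the CountingConnection wrapper, and the function names are all assumptions made for the example. It compares the number of statements issued by an ORM-style loop with a single hand-written query of the kind a custom Data Access Layer might use.

```python
import sqlite3


class CountingConnection:
    """Hypothetical wrapper that counts how many statements reach the database."""

    def __init__(self, conn):
        self.conn = conn
        self.queries = 0

    def execute(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params)


def make_blog_db():
    """Build a small in-memory database with an assumed authors/posts schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER)")
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(1, "ann"), (2, "bob"), (3, "cy")])
    conn.executemany("INSERT INTO posts (author_id) VALUES (?)",
                     [(1,), (1,), (2,)])
    return conn


def post_counts_lazy(db):
    """ORM-style access: one query for the parents, then one more per parent row."""
    counts = {}
    authors = db.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        row = db.execute("SELECT COUNT(*) FROM posts WHERE author_id = ?",
                         (author_id,)).fetchone()
        counts[name] = row[0]
    return counts


def post_counts_hand_written(db):
    """Hand-written access: one aggregate query, however many authors exist."""
    rows = db.execute(
        "SELECT a.name, COUNT(p.id) FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id "
        "GROUP BY a.id ORDER BY a.id"
    ).fetchall()
    return dict(rows)


lazy_db = CountingConnection(make_blog_db())
direct_db = CountingConnection(make_blog_db())
assert post_counts_lazy(lazy_db) == post_counts_hand_written(direct_db)
print(lazy_db.queries, direct_db.queries)  # statements issued by each approach
```

With three authors the loop issues four statements and the hand-written version issues one; the gap grows linearly with the number of parent rows, which is why it tends to surface only at scale.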

Enterprise Monitoring Frameworks – Until very recently, we had not seen a proprietary third party (non-open-source) monitoring solution at a customer for at least 2 years, and that includes our large Fortune 500 clients. Many of our clients have adopted innovative new monitoring solutions from Wilytech and Coradiant (now CA and BMC respectively) that look for and help diagnose patterns “on the wire” – whether from the browser to their servers or from app servers to databases. While these are interesting and potentially worth some of your attention, our very best clients design their systems to be monitored from the ground up, ensuring that their software helps identify performance problems as they happen.
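
As a rough illustration of what "designed to be monitored" can look like inside the application itself, here is a minimal sketch of a timing decorator that records how long each call takes and flags slow ones as they happen. The Monitor class, the threshold value, and the "catalog.lookup" name are hypothetical; a real system would ship these measurements to whatever metrics store it already uses.

```python
import functools
import time
from collections import defaultdict


class Monitor:
    """Hypothetical in-process monitor that times calls and flags slow ones."""

    def __init__(self, slow_threshold_s, clock=time.perf_counter):
        self.slow_threshold_s = slow_threshold_s
        self.clock = clock  # injectable so the timing logic can be tested
        self.durations = defaultdict(list)
        self.slow_calls = []

    def timed(self, name):
        """Decorator factory: record the duration of every call to the wrapped function."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                start = self.clock()
                try:
                    return fn(*args, **kwargs)
                finally:
                    # Runs on success and on exceptions, so failures are timed too.
                    elapsed = self.clock() - start
                    self.durations[name].append(elapsed)
                    if elapsed > self.slow_threshold_s:
                        self.slow_calls.append((name, elapsed))
            return wrapper
        return decorator


monitor = Monitor(slow_threshold_s=0.25)


@monitor.timed("catalog.lookup")
def lookup(article_id):
    # Stand-in for real work such as a database or cache call.
    return {"article": article_id, "related": []}


lookup(7)
print(len(monitor.durations["catalog.lookup"]), len(monitor.slow_calls))
```

Because the measurement lives in the code path rather than on the wire, the application can report which operation was slow, not just that a response was.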

Distributed File Systems – Take your pick of implementations, but many of our clients are eschewing traditional NAS/NFS devices for distributed storage pools the likes of Gluster, MogileFS and Ceph. Nearly every case has resulted in significant savings relative to proprietary systems with few reported impacts to availability or response times. Of course, as with any other architectural change you need to ensure that you are properly managing your risk through pods or swim lanes.



Google’s Megastore

Papers from the 2011 Conference on Innovative Data Systems Research (CIDR) have been posted and one that is particularly interesting is the Google paper detailing their design and development of Megastore. Megastore is a storage system developed to meet the requirements of today’s interactive online services. According to the paper “Megastore blends the scalability of a NoSQL datastore with the convenience of a traditional RDBMS in a novel way, and provides both strong consistency guarantees and high availability.” The system’s underlying datastore is Google’s Bigtable but it additionally provides for serializable ACID semantics and fine-grained partitions of data.

Here is the link to the paper for you to read all the details yourself, but I thought I’d point out a couple of things that I found interesting about the design.

Data Split
The Megastore design is what AKF would call a Z-axis split of the data. Google does this because partitioning allows for the synchronous replication of each write across a wide area network (between datacenters) with reasonable latency. The key is that the smaller the amount of data, the faster the replication. The paper states “…data for most Internet services can be suitably partitioned (e.g., by user) to make this approach viable.”
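
To make the idea of a Z-axis split concrete, here is a minimal sketch that partitions data by user with a stable hash. It is not how Megastore assigns entity groups; the shard_for function, the ShardedStore class, and the shard count are assumptions chosen only to show that every read and write for a given user lands on one small partition.

```python
import hashlib


def shard_for(user_id, num_shards):
    """Map a user ID to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Hypothetical store that keeps each user's data on exactly one shard."""

    def __init__(self, num_shards):
        # Each dict stands in for an independent database or replica group.
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, user_id):
        return self.shards[shard_for(user_id, len(self.shards))]

    def put(self, user_id, key, value):
        self._shard(user_id).setdefault(user_id, {})[key] = value

    def get(self, user_id, key):
        return self._shard(user_id).get(user_id, {}).get(key)


store = ShardedStore(num_shards=4)
store.put("alice", "theme", "dark")
store.put("bob", "theme", "light")
print(store.get("alice", "theme"), store.get("bob", "theme"))
```

A plain modulo scheme like this forces data to move whenever the shard count changes, so production systems usually add a lookup table or consistent hashing on top.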

Joins in Code
Most of our clients are never likely to require extreme scaling of a relational database, but if you’re one of the lucky ones, the way to do so is to minimize the use of relational features. This includes things like joins. While joining in the DB is terribly efficient from a coding perspective, joining in the code removes load from the DB. You can scale by adding web servers much more easily and cheaply than by adding relational databases. Google’s paper states that normalized relational schemas that rely on joins at query time were not the right model for Megastore for three reasons: high-volume interactive workloads benefit more from predictable performance; reads dominate writes in most web applications, so it pays to move work from read time to write time; and key-value stores make querying hierarchical data very simple.
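
Joining in code can be sketched as two simple single-table queries whose results are merged in the application tier. The example below uses Python’s built-in sqlite3 module with an assumed users/orders schema; the table and function names are hypothetical. It checks that the in-code join returns the same rows as the equivalent SQL JOIN.

```python
import sqlite3


def make_orders_db():
    """Build a small in-memory database with an assumed users/orders schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                     [(10, 1, 9.5), (11, 2, 20.0), (12, 1, 3.25)])
    return conn


def orders_joined_in_db(conn):
    """The database does the join work."""
    return conn.execute(
        "SELECT o.id, u.name, o.total FROM orders o "
        "JOIN users u ON u.id = o.user_id ORDER BY o.id"
    ).fetchall()


def orders_joined_in_code(conn):
    """Two single-table queries; the application does the join work."""
    orders = conn.execute(
        "SELECT id, user_id, total FROM orders ORDER BY id").fetchall()
    user_ids = sorted({user_id for _, user_id, _ in orders})
    if not user_ids:
        return []
    placeholders = ",".join("?" for _ in user_ids)
    names = dict(conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids))
    # Merge step: look up each order's user name in the in-memory dict.
    return [(order_id, names[user_id], total)
            for order_id, user_id, total in orders]


conn = make_orders_db()
assert orders_joined_in_code(conn) == orders_joined_in_db(conn)
```

The trade-off is an extra round trip and more application code in exchange for cheaper, index-friendly queries that are also easier to cache or to route to different shards.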

Paxos and Two-Phase Commit
Google’s Megastore utilizes two algorithms that I personally thought would not scale at very large transaction volumes. The first is the Paxos algorithm, which is a way to reach consensus among a group of replicas on a single value. It tolerates up to F faults with 2F + 1 replicas by essentially voting among the replicas, which is notoriously slow. The second algorithm is Two-Phase Commit, which allows for atomic updates across entity groups. The paper does admit that these transactions have “much higher latency and increase the risk of contention.” That, in my opinion, is very understated, but they do discourage applications from using the feature, in favor of queues.
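
The 2F + 1 figure follows from majority voting: a decision needs more than half of the replicas, and any two majorities must share at least one replica. The sketch below covers only that quorum arithmetic, not the Paxos protocol itself, and the function names are my own.

```python
from itertools import combinations


def replicas_needed(faults):
    """Replicas required to tolerate the given number of failed replicas."""
    return 2 * faults + 1


def quorum_size(replicas):
    """Smallest group that is a strict majority of the replicas."""
    return replicas // 2 + 1


def tolerated_faults(replicas):
    """Failures that still leave a majority of replicas available."""
    return (replicas - 1) // 2


def majorities_always_overlap(replicas):
    """Check by brute force that any two majority quorums share a replica."""
    quorums = [set(q) for q in combinations(range(replicas), quorum_size(replicas))]
    return all(a & b for a in quorums for b in quorums)


for f in (1, 2, 3):
    n = replicas_needed(f)
    print(f, n, quorum_size(n), majorities_always_overlap(n))
```

Note that an even replica count buys nothing: four replicas tolerate one fault, the same as three, because a majority of four is still three.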

I highly recommend you put this paper and some of the others from CIDR on your Instapaper list for reading on your next flight or while bored during your next meeting.



Outbrain’s CTO on Scalability

I had the opportunity to speak with Ori Lahav, Outbrain’s co-founder and CTO, about his experience scaling Outbrain to handle 1.5 billion page views in just three years.  Outbrain is a content recommendation service that provides for blogs and articles what Netflix does for videos or Amazon does for products.

Ori oversees the company’s R&D center, located in the heart of Israel’s technology center. Prior to founding Outbrain, Ori led the R&D groups at Shopping.com (acquired by eBay) in search and classification. Before joining the internet revolution, Ori led the video streaming Server Group at Vsoft.

Below are my notes from the interview.  Here is the link to the audio version but the quality is fairly poor.

What is your background and how did you come to start Outbrain?
Outbrain was founded by Yaron Galai and myself (Navy friendship).  Before that I spent 3.5 years at shopping.com, which was acquired by eBay, leading software groups.  I liked the challenges of a start-up and joining forces with Yaron was a great opportunity.

How has Outbrain grown over the past couple of years?
We had some false starts in several directions, but when we started with recommendations on content sites it started catching fire.  We started with self-serve blogs, then professional blogs, small publishers, national publishers, and now we are international.  In 3 years we grew from 0 to over 1.5B page views, from 3 production servers to over 150, from 0 system administrators to 4, and from 3 developers to 15 today.

How has your architecture changed over this period?
Surprisingly… not much.  At first it was 2 application servers and a MySQL database.  Then we started adding more application servers and replicating the MySQL database.  As the product grew, we added more and more components including memcached, Solr, TokyoTyrant, and Cassandra.  We recently added a layer that warms the caches and is notified by ActiveMQ.

On the backend, we have machines that are fetching content from the sites we are on.  We gather article text and images and save them to MogileFS where we have indexers that index them in Solr.
We started investing in reporting infrastructure and began with MySQL, but with the amount of data we have we soon understood that Hadoop is the way to go – we now have a Hadoop cluster of more than 10 nodes.

What is Outbrain’s view of scalability?
First – scalability is culture – if you think big – you will be big.  The regular rules of scalability apply here too:

  • No singles
  • Scale out instead of up
  • Replicate data to ease the load
  • Shard data to scale
  • Utilize commodity hardware

Specifically for Outbrain I would add:

  • open-source is crucial for scalability
  • be cost conscious
  • build many simple/small environments and not a single big one

How have you managed data centers to be so cost effective?
We simplify it!  We create many small datacenters and not a big one.  We have no upfront costs of network gear (stacking) and real-estate (Cages).  We use open-source for our network (load balancer and firewalls) and infrastructure (OS).

In order to ease the step function of cost, grow as you go. Our data center provider, Atlantic Metro, has helped with metered power instead of paying by the circuit.  This way we’re motivated to power down servers not needed.

What has been the most difficult scalability challenge?
Scaling the team… we have more of a patrol boat DNA than an aircraft carrier DNA.  Technology challenges are fun and never too hard, but the team and process are much more challenging.

We have also started using the Continuous Deployment process and this has been a great help. By empowering the team members to act and change the system as they need, you can grow to be an aircraft carrier and still keep the maneuverability of a patrol boat.

What do you think of the NoSQL movement?
We are big fans.  We use Hadoop, Hive, Cassandra, TokyoTyrant, MySQL, and others.  This helps us maintain our low serving cost and attract the type of talent we need on the team.
Outbrain is proud to be 100% open-source.  We use open-source from the office telephony system to all platforms and network infrastructure.

And… we are hiring top technology talent in Israel, so contact Outbrain if you are interested.

