AKF Partners

Abbott, Keeven & Fisher Partners: Partners In Hyper Growth


Federated Cloud

An interesting paper in the IBM Journal of Research and Development introduces the concept of a federated cloud: a model in which computing infrastructure providers join together to create a federated cloud. The advantages pointed out in the article include cost savings from not over-provisioning for spikes in capacity demand. To me the biggest advantage of this federated model is the lack of reliance on a single vendor and the likely higher availability that comes from distributing computing resources across different infrastructure. One of our primary aversions to a complete cloud hosting solution is the reliance on a single vendor for the entire availability of your site. A true federated cloud would eliminate this issue.

However, as the article aptly points out, there are many obstacles in the way of achieving such a federated cloud. Not least are the technical challenges of architecting applications in a manner modular enough that components can be started and stopped in different clouds as demand requires. Other issues include administrative control and monitoring across multiple clouds, and security concerns over allowing other cloud providers direct access to hypervisors.
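To make the "start and stop components in different clouds as demand requires" idea concrete, here is a minimal sketch of overflow placement across federated providers. The provider names, the capacities, and the greedy fill order are all hypothetical illustrations of ours; none of this comes from the IBM paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Provider:
    name: str      # hypothetical provider label
    capacity: int  # instances this provider can run for us


def place_instances(demand: int, providers: list[Provider]) -> dict[str, int]:
    """Fill providers in list order; later providers absorb only the overflow."""
    placement: dict[str, int] = {}
    remaining = demand
    for provider in providers:
        if remaining <= 0:
            break
        used = min(remaining, provider.capacity)
        placement[provider.name] = used
        remaining -= used
    if remaining > 0:
        # The whole federation cannot cover this demand.
        raise RuntimeError(f"federation is short by {remaining} instances")
    return placement


# Assumed example: a "home" cloud plus one partner cloud.
federation = [Provider("home", 10), Provider("partner-a", 5)]
normal_load = place_instances(8, federation)   # fits entirely in "home"
spike_load = place_instances(13, federation)   # 3 instances spill to "partner-a"
```

A real federation would also have to weigh price, data locality, and the security concerns noted above. The sketch only shows why modular, independently startable components are a precondition.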

As we’ve prognosticated, pure VM-based clouds like AWS have had to offer dedicated servers for high-intensity I/O systems such as large relational databases. We’ve also predicted that, with double-digit growth in cloud services forecast for the next several years, providers will resist the commoditization of their offerings through service differentiation. This attempt at differentiation will come in the form of add-on features and simplification across the entire product development life cycle (PDLC). Unfortunately, this makes a federated cloud offering very unlikely in the next couple of years.



Enterprise Cloud Computing – Book Review

A topic of particular interest to us is cloud computing, so I picked up a copy of Gautam Shroff’s Enterprise Cloud Computing: Technology, Architecture, Applications, published in 2010 by Cambridge University Press. Overall I enjoyed the book and thought it covered some great topics, but there were a few that I wanted the author to cover in more depth.


The publisher states that the book is “intended primarily for practicing software architects who need to assess the impact of such a transformation.” I would recommend this book for architects, engineers, and managers who are not currently well versed in cloud computing. For readers who are already familiar with these subjects, it will not be in depth enough, nor will it offer enough practical advice on when to consider the different applications.

A minor issue for me is that the book spends a good deal of time up front covering the evolution of the internet into a cloud computing platform. A bigger issue is that topics are covered very well at an academic or theoretical level, but the book doesn’t follow through enough on the practical side. For example, Shroff’s coverage of MapReduce in Chapter 11 is thorough in describing how it works internally but falls short on when, how, or why to actually implement it in an enterprise architecture. In this 13-page chapter, he unfortunately gives only one page to the practical application of batch processing using MapReduce. He revisits the topic in other chapters, such as Chapter 16, “Enterprise analytics and search,” and does an excellent job explaining how it works, but the when, how, and why of implementing it still do not get enough attention.

He picks up the practical advice in the final chapter, Chapter 18, “Roadmap for enterprise cloud computing.” Here he suggests several ways companies should consider using cloud and Dev 2.0 (Force.com and TCS InstantApps). I would like to have seen this practical side carried throughout the book.

I really enjoyed Shroff’s coverage of the economics of cloud computing in Chapter 6. He addresses the issue by walking through a comparison of in-house hosting (in a colocation center) with the cloud. Readers can adopt his approach, using their own numbers, to produce a similar comparison.
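As a rough illustration of what such a comparison might look like, here is a minimal sketch in Python. The cost categories, function names, and every number below are our own placeholder assumptions, not Shroff’s model or figures. The one structural point it tries to capture is that owned hardware is typically sized for peak load, while cloud capacity is paid for by average use.

```python
def in_house_monthly_cost(servers, server_price, amortization_months,
                          hosting_per_server, admin_cost):
    """Monthly cost of owned servers housed in a colocation facility."""
    hardware = servers * server_price / amortization_months
    return hardware + servers * hosting_per_server + admin_cost


def cloud_monthly_cost(avg_instances, hourly_rate, hours_per_month=730):
    """Monthly cost of renting only the instances actually in use."""
    return avg_instances * hourly_rate * hours_per_month


# Placeholder inputs: substitute your own numbers.
owned = in_house_monthly_cost(
    servers=20,               # sized for peak demand
    server_price=3000.0,      # purchase price per server
    amortization_months=36,   # straight-line over three years
    hosting_per_server=100.0, # space, power, bandwidth per server
    admin_cost=2000.0,        # share of operations staff
)
rented = cloud_monthly_cost(avg_instances=8, hourly_rate=0.50)
cheaper = "cloud" if rented < owned else "in-house"
```

The answer flips easily as utilization changes, which is why running the comparison with your own numbers matters more than any single published result.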

The book does a great job covering the fundamentals of enterprise computing, including a technical introduction to enterprise architecture. It will be of interest to programmers and software architects who are not yet familiar with these topics. The publisher suggests that the book could serve as a reference for a graduate-level course in software architecture or software engineering, and I agree.



The Future of IaaS and PaaS

Even though I’m a fan of technology futurists, I’m not much of a prognosticator myself. However, some recent announcements from Amazon and recent work with some clients got me thinking about the future of Infrastructure as a Service (IaaS), such as Amazon’s AWS, and Platform as a Service (PaaS), such as Google’s App Engine or Microsoft’s Azure.

Amazon’s most recent announcement was about Beanstalk. In case you missed it, this new service is a combination of existing services. According to the announcement, “AWS Elastic Beanstalk is an even easier way for you to quickly deploy and manage applications in the AWS cloud. You simply upload your application, and Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring.” This sounds like a move towards PaaS to me, but the announcement made a point of saying that users retain total control if desired. It states “…you retain full control over the AWS resources powering your application and can access the underlying resources at any time.”

Werner Vogels, Amazon’s CTO, stated on his blog that the need for Beanstalk came from the complexity involved in managing the entire software stack, which to me is the reason the concept of PaaS was developed. He cites examples already in use: Heroku and Engine Yard for Ruby on Rails, CloudFoundry for Springsource, Acquia for Drupal, and phpfog for PHP. He states “These platforms take away much of the ‘muck’ of software development to the extent that most RoR developers these days will choose to run on a platform instead of managing the whole stack themselves.” This to me sounds like a blurring of the lines between IaaS and PaaS.

Another item, one that we wrote about at the end of last year, is the concept of DevOps. This idea, which has gained popularity recently, acknowledges the interdependence of development and operations in producing timely software products and services. Software developers in many organizations need simpler, consolidated platform services in order to procure, deploy, and support virtual instances themselves. This is another push for PaaS platforms, but with the flexibility for control when necessary.

Market predictions for cloud services in 2014 range from $55B according to IDC up to $148B according to Gartner. Regardless of the exact number, the trend is double-digit growth for many years to come. While the market will push for commoditization of these services, providers will resist this through service differentiation. This attempt at differentiation will come in the form of add-on features and simplification across the entire PDLC.

The future of IaaS and PaaS is a blurring of the lines between the two. IaaS providers will offer simpler alternatives while still offering full control, and PaaS providers will likely start allowing greater control to attract larger markets. Let us know if you have any thoughts on the future of IaaS or PaaS or both.



Outbrain’s CTO on Scalability

I had the opportunity to speak with Ori Lahav, Outbrain’s co-founder and CTO, about his experience scaling Outbrain to handle 1.5 billion page views in just three years.  Outbrain is a content recommendation service that provides for blogs and articles what Netflix does for videos or Amazon does for products.

Ori oversees the company’s R&D center, located in the heart of Israel’s technology center. Prior to founding Outbrain, Ori led the R&D groups in search and classification at Shopping.com (acquired by eBay). Before joining the internet revolution, Ori led the video streaming Server Group at Vsoft.

Below are my notes from the interview.  Here is the link to the audio version but the quality is fairly poor.

What is your background and how did you come to start Outbrain?
Outbrain was founded by Yaron Galai and myself (Navy friendship).  Before that I spent 3.5 years at shopping.com, which was acquired by eBay, leading software groups.  I liked the challenges of a start-up and joining forces with Yaron was a great opportunity.

How has Outbrain grown over the past couple of years?
We had some false starts in several directions, but when we started with recommendations on content sites it started catching fire.  We started with self-serve blogs, then professional blogs, small publishers, national publishers, and now we are international.  In 3 years we grew from 0 to over 1.5B page views, from 3 production servers to over 150, from 0 system administrators to 4, and from 3 developers to 15 today.

How has your architecture changed over this period?
Surprisingly… not much. At first it was 2 application servers and a MySQL database.  Then we started adding more application servers and replicating the MySQL database.  As the product grew, we added more and more components, including memcached, Solr, TokyoTyrant, and Cassandra.  We recently added a layer that warms the caches and is notified by ActiveMQ.

On the backend, we have machines that are fetching content from the sites we are on.  We gather article text and images and save them to MogileFS where we have indexers that index them in Solr.
We started investing in reporting infrastructure and began with MySQL, but with the amount of data we have we soon understood that Hadoop is the way to go. We now have a Hadoop cluster of more than 10 nodes.

What is Outbrain’s view of scalability?
First – scalability is culture – if you think big – you will be big.  The regular rules of scalability apply here too:

  • No singles
  • Scale out instead of up
  • Replicate data to ease the load
  • Shard data to scale
  • Utilize commodity hardware

Specifically for Outbrain I would add:

  • open-source is crucial for scalability
  • be cost conscious
  • build many simple/small environments and not a single big one
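As an editorial aside (this is not part of Ori’s answer, and not a description of Outbrain’s systems), the “shard data to scale” rule can be sketched in a few lines of Python. The function name and the simple modulo scheme are our own assumptions for illustration.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a repeatable mapping.
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Hypothetical usage: pick one of 4 database shards for a document key.
shard_index = shard_for("doc-12345", 4)
```

One known trade-off of modulo sharding is that changing the shard count remaps most keys. Schemes such as consistent hashing exist to reduce that movement.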

How have you managed data centers to be so cost effective?
We simplify it!  We create many small datacenters and not a big one.  We have no upfront costs of network gear (stacking) and real-estate (Cages).  We use open-source for our network (load balancer and firewalls) and infrastructure (OS).

In order to ease the step function of cost, grow as you go. Our data center provider, Atlantic Metro, has helped with metered power instead of paying by the circuit.  This way we’re motivated to power down servers not needed.

What has been the most difficult scalability challenge?
Scaling the team… our DNA is more patrol boat than aircraft carrier.  Technology challenges are fun and never too hard, but the team and process are much more challenging.

We have also started using the Continuous Deployment process and this has been a great help. By empowering the team members to act and change the system as they need, you can grow to be an aircraft carrier and still keep the maneuverability of a patrol boat.

What do you think of the NoSQL movement?
We are big fans.  We use Hadoop, Hive, Cassandra, TokyoTyrant, MySQL, and others.  This helps us maintain our low serving cost and attract the type of talent we need on the team.
Outbrain is proud to be 100% open-source.  We use open-source from the office telephony system to all platforms and network infrastructure.

And… we are hiring top technology talent in Israel, so contact Outbrain if you are interested.



Scaling and Monitoring the Clouds

The usefulness of Amazon’s EC2 cloud took a step forward recently with the introduction of Amazon’s own real-time monitoring, auto scaling, and load balancing offerings. Most of these were already available as services from third parties built on top of Amazon’s and other providers’ clouds, such as Mosso, or through custom implementations using HAProxy and similar tools. However, the integration should allow for easier administration and better support.
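For readers unfamiliar with auto scaling, the core decision can be sketched as a simple threshold rule. This is a generic illustration in Python, not Amazon’s API or algorithm. The function name, the CPU thresholds, and the instance limits are all hypothetical.

```python
def desired_instances(current: int, avg_cpu: float,
                      scale_up_at: float = 0.75, scale_down_at: float = 0.25,
                      minimum: int = 2, maximum: int = 20) -> int:
    """Return the instance count a threshold-based scaler would aim for."""
    if avg_cpu > scale_up_at:
        target = current + 1   # under pressure: add one instance
    elif avg_cpu < scale_down_at:
        target = current - 1   # mostly idle: remove one instance
    else:
        target = current       # within the comfortable band: hold steady
    # Never drop below the floor or grow past the ceiling.
    return max(minimum, min(maximum, target))
```

The monitoring service supplies the `avg_cpu` input and the load balancer spreads traffic over whatever count results, which is why having the three offered together simplifies administration.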

Many companies continue to have reservations about the feasibility of running critical systems or placing sensitive data on third-party clouds. While we have not yet lost a major cloud computing provider, undoubtedly because they are all still so new, other third-party storage providers have recently shut down, as noted in PCWorld’s article Will Your Data Disappear When Your Online Storage Site Shuts Down?  Granted, these storage providers’ business models were very different, often giving away storage for free in hopes of upselling users on other products such as photo printing. Still, ISPs and hosting providers go out of business all the time, leaving customers in the lurch. Failure is not reserved for small businesses, as we’ve seen recently with banks and car companies. As Alan Williamson, co-founder of the cloud computing firm AW2.0, stated, “Users cannot absolve themselves from being 100 percent responsible for their own data.” Cloud computing offerings are becoming more mature, but they still require companies to understand the pros and cons in order to make wise decisions and to plan for service outages or business failures.



UC Berkeley's take on cloud computing

Researchers at UC Berkeley have outlined their take on cloud computing in a paper, “Above the Clouds: A Berkeley View of Cloud Computing.” They cover a lot of material in this paper and it’s well worth reading.  Section 7 was particularly interesting to us because it covers the top 10 obstacles that companies must overcome in order to utilize the cloud.  According to them these are:

  • Availability of service
  • Data lock-in
  • Data confidentiality and auditability
  • Data transfer bottlenecks
  • Performance unpredictability
  • Scalable storage
  • Bugs in large distributed systems
  • Scaling quickly
  • Reputation fate sharing
  • Software licensing

 

These look very similar to our top five concerns that we outlined in our article on Venturebeat.com.  Our list was:

  • Security
  • Non-portability
  • Control (availability)
  • Limitations (non-persistent storage) 
  • Performance

Their article concludes with “Although Cloud Computing providers may run afoul of the obstacles …we believe that over the long run providers will successfully navigate these challenges…” They continue, saying “Hence, developers would be wise to design their next generation of systems to be deployed into Cloud Computing.”

We agree and reiterate our conclusion from “The Cloud Isn’t For Everyone”:

“Of course, most importantly, we should all keep an eye on how cloud computing evolves over the coming months and years. This technology has the potential to change the fundamental cost and organization structures of most SaaS companies. And as cloud providers mature, we’re sure they’ll address our top five concerns, becoming more viable companies in their own right.”

