Newsletter – The Future of Relational Databases
For those readers who haven’t subscribed to our newsletter, here is a copy of what was sent out earlier this week. If you would like to receive future emails, about once every other month, sign up here.
We were intrigued by a question asked by Tony Bain of Read Write Web in his recent article “Is the Relational Database Dead?” Since relational database management systems (RDBMS) are a significant part of most SaaS or Web 2.0 architectures, we spend a lot of time helping people scale their databases. We also often find ourselves part of discussions about the future of relational databases.
Do Relational Databases Scale?
We believe that RDBMS scale very well, especially when they are spread across more than one server. This is the major principle behind the three axes of the AKF Scale Cube.
The AKF Database Scale Cube consists of X, Y, and Z axes, each addressing a different approach to scaling the transactions applied to a database. The lower-left point of the cube (coordinates X=0, Y=0 and Z=0) represents the worst case, a monolithic database in which all data is located in a single location and all accesses go to this single database.
The X Axis of the cube represents spreading read requests across multiple instances of the same data set. All writes are executed on a single database, which asynchronously propagates them to the read replicas. Caching can be implemented to further reduce database load. The Y Axis of the cube represents a split by function or service. The Z Axis represents ways to split transactions by performing a lookup, a modulus, or another indiscriminate function (a hash, for instance).
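To make the three axes concrete, here is a minimal Python sketch of how a request might be routed. Everything in it is an assumption for illustration: the `TOPOLOGY` layout, the server names, and the `ScaleCubeRouter` class are hypothetical and are not part of any AKF tooling. The service name selects the Y-axis split, a modulus or hash of the key selects the Z-axis shard, and within a shard the writes go to the primary while reads rotate across replicas (the X axis).

```python
import hashlib
import itertools

# Hypothetical topology: service (Y axis) -> list of shards (Z axis),
# each shard having one primary and several read replicas (X axis).
TOPOLOGY = {
    "accounts": [
        {"primary": "accounts-s0-primary",
         "replicas": ["accounts-s0-r1", "accounts-s0-r2"]},
        {"primary": "accounts-s1-primary",
         "replicas": ["accounts-s1-r1", "accounts-s1-r2"]},
    ],
    "catalog": [
        {"primary": "catalog-s0-primary",
         "replicas": ["catalog-s0-r1"]},
    ],
}


class ScaleCubeRouter:
    """Illustrative router; not a production connection pool."""

    def __init__(self, topology):
        self._topology = topology
        # One round-robin iterator per (service, shard) for read traffic.
        self._replica_cycles = {
            (service, index): itertools.cycle(shard["replicas"])
            for service, shards in topology.items()
            for index, shard in enumerate(shards)
        }

    def shard_index(self, service, key):
        """Z axis: pick a shard by modulus (integer keys) or by hash."""
        shard_count = len(self._topology[service])
        if isinstance(key, int):
            return key % shard_count
        digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % shard_count

    def route(self, service, key, is_write):
        """Return the name of the server that should handle the request."""
        index = self.shard_index(service, key)   # Z axis
        shard = self._topology[service][index]   # Y axis chose the service
        if is_write:
            return shard["primary"]              # all writes hit the primary
        return next(self._replica_cycles[(service, index)])  # X axis


router = ScaleCubeRouter(TOPOLOGY)
write_target = router.route("accounts", 7, is_write=True)
read_targets = [router.route("accounts", 7, is_write=False) for _ in range(3)]
```

A lookup table could replace the modulus in `shard_index` when shards need to be rebalanced without moving every key; the modulus is used here only because it is the simplest of the three options named above.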
Key / Value Stores
We don’t believe in the imminent demise of the RDBMS, but we do expect more applications and services to use a relational database as the persistent data store and key/value stores as part of the transactional system. We have seen, and expect to see more, use of key/value stores such as memcached and redis as a shared-memory object cache that is the primary data source for the service or application. Writes work their way back to the relational database asynchronously, and reads into the object cache work their way forward either on a schedule or on demand.
This trend is possible both because of the software being developed to support it and because of cheaper hardware, including virtualized hardware in clouds. As services become more specialized and service level agreements demand faster response times, in-memory data stores are very likely to appear in more architectures as a layer between the application servers and the relational databases.
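The pattern described above is often called a write-behind cache. The following Python sketch shows the idea under simplifying assumptions: plain dictionaries stand in for both the key/value store and the relational database, and the `WriteBehindCache` class and its method names are hypothetical. A real deployment would talk to memcached or redis through a client library and would run the flush in a background worker.

```python
from collections import deque


class WriteBehindCache:
    """Illustrative write-behind cache in front of a persistent store."""

    def __init__(self, database):
        self._database = database   # stand-in for the relational database
        self._cache = {}            # stand-in for memcached or redis
        self._pending = deque()     # writes not yet persisted

    def get(self, key):
        """Serve reads from the cache, loading from the database on demand."""
        if key not in self._cache and key in self._database:
            self._cache[key] = self._database[key]
        return self._cache.get(key)

    def put(self, key, value):
        """Write to the cache immediately and queue the database write."""
        self._cache[key] = value
        self._pending.append((key, value))

    def warm(self, keys):
        """Scheduled load: copy rows forward without clobbering newer values."""
        for key in keys:
            if key in self._database:
                self._cache.setdefault(key, self._database[key])

    def flush(self):
        """Persist queued writes in order; return how many were written."""
        written = 0
        while self._pending:
            key, value = self._pending.popleft()
            self._database[key] = value
            written += 1
        return written


database = {"user:1": "alice"}
cache = WriteBehindCache(database)
cache.put("user:2", "bob")                 # visible in the cache at once
persisted_before_flush = "user:2" in database
flushed = cache.flush()                    # later, asynchronously in practice
```

The trade-off is durability: a write that has reached the cache but not yet been flushed is lost if the cache node fails, so the flush interval bounds how much data is at risk.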
Another trend in RDBMS is the creation of specialized databases such as Hive, an open source, petabyte-scale data warehousing framework based on Hadoop that was developed by Facebook. Other databases include Amazon’s distributed web service SimpleDB, Google’s column-oriented BigTable, and Apache’s JSON-based CouchDB. Each of these has advantages in particular scenarios. Research papers such as “One Size Fits All? – Part 2: Benchmarking Results” by Michael Stonebraker and Ugur Cetintemel have shown that major RDBMS vendors can be outperformed by one to two orders of magnitude by specialized database engines.
In “The End of an Architectural Era” they continue their argument that current legacy RDBMS code, in attempting to be “one size fits all,” excels at nothing. In fact, they offer data showing that the H-Store database, written completely from scratch and built in collaboration with MIT, Brown and Yale Universities, can outperform these legacy RDBMS by two orders of magnitude on standard transactional benchmarks.
The demise of the RDBMS has been prophesied almost since its inception in E.F. Codd’s 1970 paper “A Relational Model of Data for Large Shared Data Banks”. Given the advent of object-oriented programming languages, one of the most promising replacements was the object-oriented database, which never became mainstream but has resulted in the inclusion of object-relational features in more popular RDBMS.
Even with the improved performance of specialized databases, the most popular relational databases, including Oracle, MySQL, and PostgreSQL, will continue to be an integral part of SaaS and Web 2.0 architectures. Learning how to scale them will continue to be important for many years to come.