The dawn of the internet age knew little of database scalability; one size fit everyone. Yet the boom in data driven by massive web applications such as Facebook and Google has transformed the way we look at databases today. With good marketing and the right people on board, even the most absurd ideas can become popular, and when it comes to scalability there is a lot of hype but little substantive discussion. This article gives a basic overview of the theory behind database scalability and, hopefully, a broader perspective on what is involved.
A service is said to be scalable if adding resources to the system increases performance in proportion to the resources added.
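One way to make this definition concrete is to compare the observed speedup with the increase in resources. The sketch below is only an illustration: the function name `scaling_efficiency` and the throughput figures are hypothetical, not measurements from any real system.

```python
def scaling_efficiency(base_throughput, base_nodes, new_throughput, new_nodes):
    """Return observed speedup divided by the resource increase.

    A value near 1.0 means performance grew in proportion to the
    resources added; a value well below 1.0 means it did not.
    """
    speedup = new_throughput / base_throughput
    resource_factor = new_nodes / base_nodes
    return speedup / resource_factor


# Hypothetical numbers: requests per second on 2 servers, then on 4.
near_linear = scaling_efficiency(1000, 2, 1900, 4)
poor = scaling_efficiency(1000, 2, 1100, 4)
print(f"near-linear case: {near_linear:.2f}")
print(f"poor case: {poor:.2f}")
```

By this measure, a system that doubles its servers but gains only ten percent in throughput is not scalable in the sense defined above, however much hardware it runs on.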
Browne notes that “[a]ny system can scale given enough time and money.” The question he then raises is which route to scalability is the easiest. There are three basic areas that can be improved to achieve scalability:
- Hardware
- Resource distribution
- Database architecture
The first is the most obvious route and the simplest to implement, though beyond basic configuration such as caching it requires capital to improve and eventually hits a ceiling. The second requires a combination of hardware and database architecture, and the third demands significant human capital in the form of skill.
Resource distribution is perhaps the most effective way to scale databases, but it is also the most difficult. Not only does it require the knowledge and expertise to set up and maintain the hardware, it also requires advanced database architecture techniques to make use of that hardware. It is here that Brewer’s CAP Theorem comes into play.
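To give a rough sense of what distributing a database across machines involves, here is a minimal sketch of hash-based partitioning, one common way to decide which node stores a given record. The article does not prescribe this technique; the node names and the helpers `shard_for` and `node_for` are hypothetical and stand in for a real cluster.

```python
import hashlib

# Hypothetical node names; a real deployment would hold connection details.
NODES = ["db-node-0", "db-node-1", "db-node-2"]


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


def node_for(key: str) -> str:
    """Pick the node responsible for a given key."""
    return NODES[shard_for(key, len(NODES))]


for user_id in ["alice", "bob", "carol"]:
    print(user_id, "->", node_for(user_id))
```

Even this toy version hints at the difficulty: once records live on separate nodes, those nodes can lose contact with one another, and the system has to decide how to behave when they do. Note also that with simple modulo placement, changing the number of nodes reassigns most keys.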
Architecture is the most debated and misunderstood topic in the database community today. The NoSQL movement has enchanted everyone through youth, charm, and a whole lot of misinformed publicity. While there is much to be said for schemaless and non-relational database models, the SQL movement, 30 years old and still running, has its place. It is fair to say that SQL and NoSQL are really just two sides of the same coin. MongoDB puts it precisely: “[d]atabases are specializing – the ‘one size fits all’ approach no longer applies.”
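The contrast between the two models can be illustrated with a small sketch. It stores the same record first in a relational table, using Python's built-in `sqlite3` module, and then as a schemaless JSON document in a plain dictionary. The dictionary is only a stand-in for a document store, and the table, field names, and data are hypothetical.

```python
import json
import sqlite3

# Relational model: the schema is declared up front and every row follows it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
    (1, "Ada", "ada@example.com"),
)
row = conn.execute("SELECT name, email FROM users WHERE id = ?", (1,)).fetchone()
print("relational:", row)

# Document model: each record carries its own structure, so a new field
# (here "tags") needs no schema change.
documents = {}
documents[1] = json.dumps(
    {"name": "Ada", "email": "ada@example.com", "tags": ["admin"]}
)
doc = json.loads(documents[1])
print("document:", doc)
conn.close()
```

Neither form is better in the abstract. The relational version enforces structure and supports ad hoc queries, while the document version trades those guarantees for flexibility.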