Google Cloud Spanner and the Evolution of the Cloud Data Storage Market
Last week, Google announced the beta launch of Cloud Spanner, a new globally distributed relational database service for massively scalable applications. At first glance, it would be tempting to ignore the release. After all, cloud relational database services are not exactly new. However, Cloud Spanner innovates in several areas that have long been the Achilles heel of relational databases.
Cloud Spanner is based on a research paper published by Google in 2012. The project grew out of the need for an alternative to MySQL that could meet Google’s massive scalability requirements. Cloud Spanner’s architecture combines the best of the relational and NoSQL worlds: it provides a relational model with transactional consistency that can still scale horizontally. The database service natively supports ACID transactions without sacrificing its scalability model.
Cloud Spanner has been used in production at Google for a while, powering services such as Google Photos. The release is certainly a strong addition to Google Cloud’s impressive data storage portfolio, which can rival those of market leaders such as AWS and Azure. However, the launch of Cloud Spanner also adds yet another database service that conceptually overlaps with existing offerings such as Google Cloud Datastore, which makes the choice confusing for companies evaluating different cloud storage platforms.
Certainly, the cloud data storage platform market seems to be growing out of control. Just a few years ago, most cloud platforms included only some basic relational and NoSQL capabilities. Today, the PaaS leaders offer overwhelming portfolios of cloud storage services that span many different categories. To illustrate that point, the following list covers the main data storage categories present in the current generation of PaaS technologies.
1 — File Systems: This is the most basic form of storage in cloud platforms, represented by services such as AWS S3 or Azure Blob Storage.
2 — Relational Database Services: Relational databases such as Google Cloud SQL or SQL Azure have been part of PaaS stacks since the very beginning.
3 — NoSQL Database Services: Most PaaS providers include native NoSQL capabilities with services such as AWS DynamoDB, Google Cloud Bigtable or Azure DocumentDB.
4 — Hadoop Services: Support for traditional big data stacks with products such as Hadoop, HBase or Hive has been another area of focus for PaaS stacks. Technologies such as AWS EMR, Azure HDInsight or Google Cloud Dataproc are relevant examples of this category.
5 — Data Warehouse Services: Massively parallel processing (MPP) data warehouse solutions have become a common component of PaaS stacks. Products such as AWS Redshift, Azure SQL Data Warehouse and Google BigQuery are the key products in this space.
6 — Data Lake Services: While most PaaS platforms provide the fundamental building blocks to assemble data lake architectures, they are now going a step further and offering entire data lake solutions as cloud services. Azure Data Lake Store is the best example of this emerging type of cloud service.
7 — Search Services: Search is another type of storage service present in PaaS technologies. Services such as AWS Elasticsearch or Azure Search are relevant examples of this category.
8 — Third-Party Databases: Most PaaS stacks have done a remarkable job of supporting some of the most popular NoSQL and relational databases on the market. Even though this type of storage doesn’t offer the same levels of scalability and management as native cloud services, it is still a relevant category of the cloud storage ecosystem.