Hey, it's HighScalability time:
Google I/O to world: Just try to keep up with us. You can't. But go ahead and try. Na na na nah...
17 billion : Google Cloud Messaging messages per day with 60ms latency; 1B page views : 500px; 121 billion : edge graph using ; 4 billion hours : hours watched on per quarter; 4.5 trillion : BigTable transactions per month
As Spanner is a not so distant cousin of BigTable, the earlier 2011 talk given by Alex at the HotStorage conference, the reason for embracing OldSQL was the desire to make it easier and faster for programmers to build applications. The main ideas will seem quite familiar: …component should be no surprise. Spanner is charged with spanning millions of machines inside any number of geographically distributed datacenters. What is surprising is how OldSQL has been embraced. In an
See the website .
See the slides .
He doesn't cover setting up a production cluster.
Using a schema is optional.
Cassandra is like a combination of Dynamo from Amazon and BigTable from Google.
It uses timestamps for conflict resolution. The clients determine the time. There are other approaches to conflict resolution as well.
Data in Cassandra looks like a multi-level dict.
By default, Cassandra eats 1/2 of your RAM. You might want to change that ;)
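To make the "multi-level dict" and timestamp notes above a bit more concrete, here is a minimal sketch in plain Python. It is not Cassandra's API or its actual implementation: the names `write` and `merge` are made up for illustration, and it ignores deletes, timestamp ties, and everything else a real cluster has to handle. It only shows the idea of nesting row key, then column, then a value with a client-supplied timestamp, and letting the newest timestamp win when two replicas disagree.

```python
from typing import Any, Dict, Tuple

# Hypothetical layout: row key -> column name -> (value, client-supplied timestamp)
Row = Dict[str, Tuple[Any, int]]
Table = Dict[str, Row]


def write(table: Table, row_key: str, column: str, value: Any, timestamp: int) -> None:
    """Store the value unless the column already holds an equal or newer timestamp."""
    row = table.setdefault(row_key, {})
    current = row.get(column)
    if current is None or timestamp > current[1]:
        row[column] = (value, timestamp)


def merge(a: Table, b: Table) -> Table:
    """Reconcile two replicas column by column; the newer timestamp wins."""
    merged: Table = {}
    for replica in (a, b):
        for row_key, row in replica.items():
            for column, (value, ts) in row.items():
                write(merged, row_key, column, value, ts)
    return merged


# Two replicas that received different writes for the same row.
replica_a: Table = {}
replica_b: Table = {}
write(replica_a, "user:1", "email", "old@example.com", timestamp=100)
write(replica_b, "user:1", "email", "new@example.com", timestamp=200)
write(replica_b, "user:1", "name", "Ada", timestamp=150)

merged = merge(replica_a, replica_b)
```

Because the clients determine the time, a client with a badly skewed clock can win or lose conflicts it shouldn't, which is one reason other approaches to conflict resolution exist.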
…you get is a nice database engine for certain type of workloads. In fact,'s BigTable, 's , and amongst others are all using a variant or a direct copy of this very architecture.
Simple on the surface, but as usual, implementation details matter a great deal. Thankfully, Jeff Dean and Sanjay Ghemawat , the original contributors to the and BigTable infrastructure at Google released , … earlier last year
…can only assume DynamoDB either lets the last write win or has a scheme similar to BigTable, using timestamps for each attribute.
Writes don't allow you to specify something like a quorum that tells DynamoDB how consistent you'd like the write to be. It seems to be up to the system to decide when and how quickly data is replicated to other datacenters. Alex Popescu's summary on DynamoDB and Werner Vogels' introduction suggest that writes are replicated …
…around this, even with a range-based key location. HBase (and Google's BigTable, for that matter) stores ranges of data in separate tablets. As tablets grow beyond their maximum size, they're split up and the remaining parts re-distributed. The advantage of this is that the original range is kept, even as you scale up.
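The split-on-growth idea can be sketched in a few lines of Python. This is an illustration only, not how HBase or BigTable are implemented: the `Tablet` and `RangeTable` classes and the `MAX_TABLET_ROWS` threshold are hypothetical, the size limit counts rows rather than bytes, and moving the new tablet to another server is left out. What it does show is that splitting keeps keys in range order, so a range scan still only touches neighbouring tablets.

```python
import bisect
from typing import Dict, List, Optional, Tuple

MAX_TABLET_ROWS = 4  # hypothetical split threshold, kept tiny for the demo


class Tablet:
    def __init__(self, start_key: str, rows: Optional[Dict[str, str]] = None):
        self.start_key = start_key  # inclusive lower bound of this tablet's range
        self.rows: Dict[str, str] = rows if rows is not None else {}


class RangeTable:
    def __init__(self) -> None:
        # Start with a single tablet covering the whole keyspace.
        self.tablets: List[Tablet] = [Tablet("")]

    def _locate(self, key: str) -> int:
        """Index of the tablet whose range contains the key."""
        starts = [t.start_key for t in self.tablets]
        return bisect.bisect_right(starts, key) - 1

    def put(self, key: str, value: str) -> None:
        i = self._locate(key)
        tablet = self.tablets[i]
        tablet.rows[key] = value
        if len(tablet.rows) > MAX_TABLET_ROWS:
            self._split(i)

    def _split(self, i: int) -> None:
        """Split an oversized tablet at its median key; the upper half becomes a new tablet."""
        tablet = self.tablets[i]
        keys = sorted(tablet.rows)
        mid = keys[len(keys) // 2]
        upper = Tablet(mid, {k: tablet.rows.pop(k) for k in keys if k >= mid})
        self.tablets.insert(i + 1, upper)

    def scan(self, start: str, end: str) -> List[Tuple[str, str]]:
        """Return (key, value) pairs with start <= key < end, in key order."""
        result: List[Tuple[str, str]] = []
        for tablet in self.tablets[self._locate(start):]:
            if tablet.start_key >= end:
                break
            for key in sorted(tablet.rows):
                if start <= key < end:
                    result.append((key, tablet.rows[key]))
        return result


table = RangeTable()
for letter in "abcdefghij":
    table.put(letter, letter.upper())
```

After the ten inserts the table has split several times, yet `table.scan("b", "f")` still walks adjacent tablets in order instead of querying every node.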
Consistent Hashing Enables Partitioning
When you have a consistent hash, everything looks like a partition. The idea is simple. Consistent hashing forms a keyspace, …
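As a rough illustration of the keyspace idea, here is a small consistent hash ring in Python. Treat it as a sketch under assumptions rather than what any particular database does: the `HashRing` class, the use of MD5, and the choice of 100 virtual points per node are all picked for the example. Nodes and keys are hashed onto the same ring, and a key belongs to the first node point at or after its hash, wrapping around at the end.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(value: str) -> int:
    """Map a string onto the ring; MD5 is used here only for its even spread."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self.vnodes = vnodes  # virtual points per node, smooths the distribution
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring has no nodes")
        i = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]  # wrap around past the last point


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

The property worth noticing is that every key in `moved` now lives on `node-d`. Adding a node only takes over the slices of the ring just before its own points, so the rest of the keys stay where they were, unlike a plain `hash(key) % n` scheme where most keys would be reshuffled.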
such as MongoDB ., BigTable, , , and
Recently, MongoDB has received a lot of attention due to the following factors:
availability on many platforms
rich language support: C,, C#, , , , , , Ruby
binaryfor efficient storage
equivalent of …
…and 's broad use of their BigTable and systems. We evaluated all the usual open source suspects. After considerable debate, we decided to go with .
We don't deserve anything.can do whatever they want. If you don't like it, don't send them nasty emails or browse their sites with ad-blockers: just don't support them. Don't read their content, don't link to them, and …
…engineer Stu Hood, who explained Cassandra's appeal: "Over the Bigtable clones, Cassandra has huge high-availability advantages, and no single point of failure. When compared to the Dynamo adherents, Cassandra has the advantage of a more advanced datamodel, allowing for a single row to contain billions of column/value pairs: enough to fill a machine. You also get efficient range queries for the top level key, and even within your values."
…you've got a bunch of other data models in the middle: graph databases, tabular databases like BigTable, and document databases like Mongo.
There are a few different ways that you can think about document databases. One of the nice things about document databases is that they're closely mapped to how most developers are writing code, whereas SQL databases were designed for accounting and banking 30 or 40 years ago, prior to the advent of web applications and the rise of object-oriented …