Category Archives: databases

Google’s lesson learned: distributed transactions matter

In retrospect I think that [not supporting distributed transactions] was a mistake. We probably should have added that in because what ended up happening is a lot of people did want distributed transactions, and so they hand-rolled their own protocols, sometimes incorrectly, and it would have been better to build it into the infrastructure. So in Spanner we do have distributed transactions. We don’t have a lot of experience with it yet.

–Jeff Dean, in response to “What was the biggest mistake?” at the end of his Stanford EE380 lecture (note: video has since been replaced with that of another lecture)

Infrequently asked questions on deterministic distributed transaction management

Updated since my original post: added plug; explained why centralization is still hard; noted the issues unaddressed by this work; clarified that Daniel is focusing on ACID.

My homeboy Daniel Abadi baits flames in true Stonebraker style in his latest blog post, proclaiming: “the NoSQL decision to give up on ACID is the lazy solution to these scalability and replication issues.” Then he shows how he plans to pimp your database by imposing a deterministic serial ordering on all transactions.

Without further ado, it’s time for some Infrequently Asked Questions On Deterministic Distributed Transaction Management!

I’m busy. Gimme the executive summary.

Daniel introduces his new VLDB paper, where he says he’s got the distributed DBMS your distributed DBMS could smell like, if only you used a centralized, deterministic transaction manager instead of lady-scented body wash. Specifically, he proposes avoiding the dreaded two-phase commit by centrally controlling concurrency, and minimizing lock time with optimistic operation.
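For the impatient who think in code, here is a minimal sketch of the general idea of deterministic ordering. It is not Daniel's actual design. The `Sequencer`, `Replica`, `transfer`, `deposit`, and `run_demo` names are hypothetical, invented purely for illustration, and the sketch ignores locking, optimistic operation, and failures entirely. The assumption is that if one central component fixes a single global order and every transaction is a deterministic function of the state, then each replica reaches the same commit-or-abort outcome on its own, with no voting round between replicas.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Txn = Callable[[State], None]  # assumed to be a deterministic function of the state


class Sequencer:
    """Hypothetical central component that fixes one global transaction order."""

    def __init__(self) -> None:
        self.log: List[Tuple[int, Txn]] = []

    def submit(self, txn: Txn) -> int:
        # The position in the log is the transaction's global sequence number.
        seq = len(self.log)
        self.log.append((seq, txn))
        return seq


@dataclass
class Replica:
    """Hypothetical replica that applies the log strictly in sequence order."""

    state: State = field(default_factory=dict)
    applied: int = 0  # next sequence number this replica has yet to apply

    def catch_up(self, log: List[Tuple[int, Txn]]) -> None:
        for seq, txn in log[self.applied:]:
            txn(self.state)
            self.applied = seq + 1


def deposit(account: str, amount: int) -> Txn:
    def txn(state: State) -> None:
        state[account] = state.get(account, 0) + amount
    return txn


def transfer(src: str, dst: str, amount: int) -> Txn:
    def txn(state: State) -> None:
        # The abort decision depends only on the state, so every replica
        # that applied the same prefix makes the same decision.
        if state.get(src, 0) >= amount:
            state[src] = state.get(src, 0) - amount
            state[dst] = state.get(dst, 0) + amount
    return txn


def run_demo() -> Tuple[State, State]:
    sequencer = Sequencer()
    sequencer.submit(deposit("alice", 100))
    sequencer.submit(transfer("alice", "bob", 70))
    sequencer.submit(transfer("alice", "bob", 70))  # insufficient funds: no-op everywhere

    eager, lagging = Replica(), Replica()
    eager.catch_up(sequencer.log)        # applies everything at once
    lagging.catch_up(sequencer.log[:2])  # sees only a prefix at first
    lagging.catch_up(sequencer.log)      # then catches up on the rest
    return eager.state, lagging.state
```

Calling `run_demo()` returns the two replica states, which end up identical even though one replica lagged behind. The catch, of course, is that the sequencer in this toy is a single in-memory list, which is exactly the part that is hard to build in real life.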

I react.
