Claret: Using Data Types for Highly Concurrent Distributed Transactions
(Preprint: PaPoC)

Abstract

Among the many NoSQL databases in use today, those that offer simple data structures for records, such as Redis and MongoDB, are becoming popular. Building applications on these data structures rather than on plain string values gives programmers a way to communicate intent to the database system without sacrificing flexibility or committing to a fixed schema. Currently, this expressiveness is used to ensure that related values are co-located so they can be accessed together quickly (e.g. maps and other aggregates) and to provide complex atomic operations (e.g. set insertion). However, data types could make databases more efficient and simpler to use in many more ways that are not yet being exploited.

In this work, we demonstrate several ways of leveraging data structure semantics in databases, focusing primarily on commutativity. Reasoning about which operations can be reordered allows transactions that would conflict under traditional concurrency control to execute concurrently. Using Retwis, a Twitter clone built for Redis, as a case study, we show that exploiting commutativity can reduce transaction abort rates for the high-contention, update-heavy workloads that arise in real social networks. We conclude that data types are a good abstraction for database records, providing a safe and expressive programming model with ample opportunities for optimization, which will make databases safer and more scalable.
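
To make the commutativity argument concrete, the following is a minimal sketch, not the system's actual implementation. It contrasts a traditional read/write conflict check with a commutativity check for a set data type. The names `Op`, `rw_conflicts`, and `commutes` are hypothetical, the key `followers:42` is an invented example, and the sketch assumes that `add` and `remove` return no value (if they reported whether the set changed, two operations on the same element would no longer commute).

```python
from dataclasses import dataclass

READS = {"contains", "size"}
WRITES = {"add", "remove"}


@dataclass(frozen=True)
class Op:
    """One operation on a set-valued record (hypothetical representation)."""
    key: str            # record the operation touches
    kind: str           # "add", "remove", "contains", or "size"
    arg: object = None  # element argument, unused for "size"


def rw_conflicts(a: Op, b: Op) -> bool:
    """Traditional check: same record and at least one operation writes."""
    return a.key == b.key and (a.kind in WRITES or b.kind in WRITES)


def commutes(a: Op, b: Op) -> bool:
    """Conservative, state-independent commutativity check for sets.

    Assumes add/remove return nothing, so only the final state and the
    results of contains/size are observable.
    """
    if a.key != b.key:
        return True   # different records never interfere
    if a.kind in READS and b.kind in READS:
        return True   # reads always commute with reads
    if a.kind == b.kind:
        return True   # add/add or remove/remove: same final set either way
    if "size" in (a.kind, b.kind):
        return False  # a write may change the size that is observed
    # add/remove, add/contains, remove/contains: order matters only
    # when both operations name the same element.
    return a.arg != b.arg


# Two users following the same popular account at the same time.
follow_a = Op("followers:42", "add", "alice")
follow_b = Op("followers:42", "add", "bob")

print(rw_conflicts(follow_a, follow_b))  # flagged as a conflict by read/write analysis
print(commutes(follow_a, follow_b))      # yet the two inserts can be reordered freely
```

Under this sketch, a scheduler that consults `commutes` instead of `rw_conflicts` would let both follow transactions proceed, which is the kind of high-contention update that the Retwis case study targets.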