deletions | additions
diff --git a/rmd/paper.Rmd b/rmd/paper.Rmd
index d5bab6c..010de6f 100644
--- a/rmd/paper.Rmd
+++ b/rmd/paper.Rmd
...
'read_heavy'='90% read\n10% update'
)))
d$dist <- factor(revalue(d$alpha, c('0.6'='Zipf: 0.6', '-1'='Uniform')))
d.u <- subset(d, nshards == 4 & nkeys == 10000
& (alpha == '0.6' | alpha == '-1')
& grepl('update_heavy|read_heavy',
              mix)
)
ggplot(d.u, aes(x=nclients, y=throughput/1000,
group=cc, fill=cc, color=cc, linetype=cc))+
stat_summary(fun.y=max, geom="line", size=0.4)+
xlab('Concurrent clients')+ylab('Throughput (k/sec)')+
expand_limits(y=0)+
facet_grid(dist~opmix)+
theme_mine+
theme(legend.position='right', legend.direction='vertical', legend.title.align=0)+
cc_scales(title='Concurrency\ncontrol:')
```
\begin{figure}[t]
...
factors=c('nshards', 'nclients'),
numeric=c('total_time', 'txn_count')
))
d.u <- subset(d, nshards == 4 & initusers == 4096
& grepl('geom_repost|read_heavy', mix))
ggplot(d.u, aes(x=nclients, y=throughput/1000,
group=cc, fill=cc, color=cc, linetype=cc))+
stat_summary(fun.y=mean, geom="line", size=0.4)+
xlab('Concurrent clients')+ylab('Throughput (k trans. / sec)')+
expand_limits(y=0)+
facet_wrap(~workload)+
theme_mine+
theme(legend.position='right', legend.direction='vertical', legend.title.align=0)+
cc_scales(title='Concurrency\ncontrol:')
```
\begin{figure}[t]
...
# Appendix
Many ideas and details did not fit into the core of the paper, but because they may help interpret our findings and give more of a sense of future directions, we include them here for anyone interested.
## Other opportunities for commutativity
...
**Combining.**
Another spin on associative operations is to merge or *combine* operations as they come in. This is known as combining [@flatCombining; @yew:combining-trees; @funnels], and it can drastically reduce contention. Combining can be done hierarchically: first with a few neighbors, then with clusters of neighbors, and finally the combined operation is applied to the shared data structure.
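To make the idea concrete, here is a minimal single-level sketch in Python (invented names such as `CombiningCounter`; not the system's actual implementation): concurrent increments are first merged into one associative update, so the shared structure's lock is taken once per batch rather than once per operation.

```python
import threading

class CombiningCounter:
    def __init__(self):
        self.value = 0
        self.pending = []            # published but not-yet-applied increments
        self.lock = threading.Lock()

    def increment(self, amount=1):
        # Publish the operation instead of applying it immediately.
        # (list.append is atomic in CPython, so no lock is needed here.)
        self.pending.append(amount)

    def combine_and_apply(self):
        # A single "combiner" merges all pending ops associatively (sum),
        # then applies the combined op to the shared value in one step.
        batch, self.pending = self.pending, []
        combined = sum(batch)
        with self.lock:
            self.value += combined
        return combined
```

Hierarchical combining repeats this pattern at several levels: neighbors combine first, clusters of neighbors combine those results, and only the final combined operation contends for the shared data structure.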
## Transaction protocol {#apx:protocol}
Our protocol for executing transactions is fairly straightforward: it uses two-phase commit, with two-phase locking and retries to guarantee isolation. We employ a number of standard optimizations, such as delaying, until the *prepare* step, lock acquisition for operations that don't return a value, so that locks are held for as short a time as possible. However, one step is non-standard, in order to support complex data types where rolling back state changes would be non-trivial.
To support transactions with arbitrary data structure operations, each operation is split into two steps: *stage* and *apply*. During transaction execution, each operation's *stage* method attempts to acquire the necessary lock and may return a value *as if the operation has completed* (e.g. an "increment" speculatively returns the incremented value). When the transaction is prepared to commit, *apply* is called on each staged operation to actually mutate the underlying data structure. This allows operations to easily be un-staged if the transaction fails to acquire all the necessary locks, without requiring rollbacks.
## Retwis workload {#apx:retwis}
```{r}
df <- subset(db("
    select * from tapir where stat_following_counts is not null
    and name like '%v0.14%'
  "),
  nclients == 32
  & initusers == 4096
)
...
```
```{r followers, include=F}
d.follow <- histogram.facets(subset(df,
    initusers == 4096 & mix == 'geom_repost'
  ), 'stat_follower_counts', 'grp')
ggplot(d.follow, aes(x=x, weight=y))+
stat_ecdf(color=c.blue)+
xlab('# followers / user (log scale)')+ylab('CDF (log scale)')+
...
```{r reposts, include=F}
d.repost <- histogram.facets(
    subset(df, initusers == 4096 & mix == 'geom_repost')
  , 'stat_repost_counts', 'grp')
ggplot(d.repost, aes(x=x, weight=y))+
stat_ecdf(color=c.blue)+
scale_x_log10(breaks=c(1,10,100,1000))+
scale_y_log10(breaks=c(0.1,0.2,0.4,0.6,0.8,1.0))+
xlab('# reposts (log scale)')+ylab('CDF (log scale)')+
theme_mine
```