\section{Related Work}

\textbf{C-JDBC} \cite{cjdbc} is an open-source database cluster middleware that provides a Java application with transparent access to a cluster of databases through JDBC. The database can be distributed and replicated among several nodes, and C-JDBC balances the queries among these nodes. C-JDBC also handles node failures and provides support for check-pointing and hot recovery. Like Hihooi, it is compatible with any database engine that provides a JDBC driver and does not require any changes to the database engine. It consists of a generic JDBC driver, used by client applications, and a controller that handles load balancing and fault tolerance. The C-JDBC controller manages the distribution and acts as a proxy between the C-JDBC driver and the database backends. As in Hihooi, the replicas are hidden from the application: the C-JDBC controller exposes a single virtual database and enables database backends to be added or removed dynamically and transparently. It also offers early responses to updates, where the controller returns the result as soon as one, a majority, or all backends have executed the operation. Unlike Hihooi, where the extension databases send their results directly to the client application, in C-JDBC the results pass through the controller, where they are serialized and sent back over the communication channel.

In \cite{no_lazy}, it is argued that, if the computing environment allows it, eager replication should be used, and a way to implement it in practice is demonstrated.

\textbf{Postgres-R} One of Postgres-R's shortcomings is its rather intrusive implementation, which requires modifications to the underlying database, something that is not always feasible and limits database heterogeneity.

\textbf{Improving the Scalability of Fault-Tolerant Database Clusters} The model proposed in XXXthisXXX is focused on a category of applications whose requests can be divided into disjoint categories.

\textbf{Oracle GoldenGate} Oracle GoldenGate provides log-based replication.

\section{Experimental Evaluation}
\label{sec:evaluation}

To evaluate our model we chose the TPC-E benchmark \cite{tpce}, an OLTP-oriented workload designed by the Transaction Processing Performance Council. The workload includes several OLTP queries of variable complexity, with different processing and memory demands, closely resembling real workload characteristics. (...more on TPC-E...) The TPC-E scaling parameters were chosen as follows: 1000 customers, 1 working day of populated transactions, and a scale factor of 500.

\subsection{System Setup}

For the experiments, a group of machines was used to host the different entities of Hihooi, with a dedicated machine for each component (Manager, Listener, Primary DB, and extension DBs). All machines shared the same configuration (m4.large) and were deployed in AWS on a local LAN. Before starting any experiment, all databases were reset to the same initial state, so that every experiment started from identical conditions. During the experiments, every transaction involving a \textit{write} was executed within a \texttt{START TRANSACTION}-\texttt{COMMIT} block.
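To make the setup concrete, the following is a minimal sketch of how a benchmark driver could issue such a write inside an explicit transaction block over plain JDBC; the connection URL, credentials, and the table touched are illustrative placeholders rather than Hihooi's actual interface. Turning auto-commit off is what wraps the statements in a \texttt{START TRANSACTION}-\texttt{COMMIT} block.

\begin{verbatim}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class WriteTransactionExample {
    public static void main(String[] args) throws SQLException {
        // Hypothetical JDBC URL: the driver talks to the Listener host,
        // which forwards the write to the Primary DB.
        String url = "jdbc:postgresql://listener-host:5432/tpce";

        try (Connection conn =
                 DriverManager.getConnection(url, "tpce", "secret")) {
            // Disabling auto-commit opens an explicit transaction block
            // (START TRANSACTION ... COMMIT) instead of per-statement commits.
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                     "UPDATE trade SET t_st_id = ? WHERE t_id = ?")) {
                ps.setString(1, "CMPT");
                ps.setLong(2, 200000000001L);
                ps.executeUpdate();
                conn.commit();   // COMMIT: the write becomes visible
            } catch (SQLException e) {
                conn.rollback(); // abort the block if anything fails
                throw e;
            }
        }
    }
}
\end{verbatim}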
\subsection{Workload Mix}\label{wmix}

The TPC-E benchmark has a fixed mix of reads and writes, which does not fit our demonstration goals. For this reason, we created workload combinations with different read-write ratios; the write ratios chosen are 0\%, 5\%, 10\%, and 30\% of the total workload.

\subsection{Part 1. Performance and Scalability}

The first part of the evaluation analyzes performance and scalability. Hihooi was compared to a reference system consisting of a single PostgreSQL instance. We measured the performance of Hihooi in different configurations, from 1 to 8 extension DBs, and each setup was tested with the workload mixes described in Section~\ref{wmix}. To measure the \textit{speedup}, we kept the workload fixed and increased the number of extension DBs. To measure the \textit{scaleup}, we increased the number of replicas while proportionally increasing the workload (both metrics are formalized below).

% Planned figures:
% - Transactions per second
% - Latency (msec)
% - Primary and extension DBs: CPU utilization (% vs. time); underutilized?
% - Primary and extension DBs: network input/output (bytes vs. time); balanced across replicas or not?
% - How are the above affected by the workload mix (read/write ratio)?
% - Listener and Manager (vs. Tester = actual requests): network traffic characteristics
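For reference, a standard way to formalize the two metrics (an assumed, textbook elapsed-time formulation; the reported figures may use throughput instead) is
\[
\mathrm{speedup}(n) = \frac{T_{1}(W)}{T_{n}(W)},
\qquad
\mathrm{scaleup}(n) = \frac{T_{1}(W)}{T_{n}(n \cdot W)},
\]
where $T_{n}(W)$ denotes the elapsed time needed to complete workload $W$ with $n$ extension DBs. Ideal behavior corresponds to $\mathrm{speedup}(n) \approx n$ for a fixed workload and $\mathrm{scaleup}(n) \approx 1$ when the workload grows proportionally with the number of replicas.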