Sankar edited Load on the Database.tex
over 9 years ago
\section{Load on the Database}
The choice of data structures and the design of the individual components of a database system depend heavily on the load placed on the database. In addition to the raw \textbf{Input/Output Operations Per Second (IOPS)} estimate, we also need to know the mix of I/O request types (\textbf{read or write}).
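As a small worked illustration of these two quantities (the symbols $r$, $w$, $f_w$ and the numbers below are assumptions chosen for illustration, not measurements), suppose an application issues $r$ reads and $w$ writes per second. Then
\[
\mathrm{IOPS} = r + w, \qquad f_w = \frac{w}{r + w},
\]
where $f_w$ is the fraction of requests that are writes. For instance, $r = 50$ and $w = 4950$ give $\mathrm{IOPS} = 5000$ and $f_w = 4950 / 5000 = 0.99$, a write-dominated workload. Note that this counts requests rather than time spent, so it is only a rough proxy when reads and writes differ greatly in cost.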
A generically designed distributed database may prove inefficient for many use cases that would perform better with a design tailored to the application's requirements. For example, if the database will spend more than 99\% of its time on writes (say, a logging application), then a Log-Structured Merge Tree \cite{O_Neil_1996} may be effective; conversely, if 99\% of the time will be spent on reads, then memory maps may prove more efficient. Understanding the application's needs is therefore very important when designing a distributed database. Even when choosing an existing database, it is useful to know the nature of the workload that the application(s) on top will generate.
Facebook initially started the Cassandra~\cite{Lakshman_2009} distributed database project to perform well under parallel writes, and later switched to HBase as it began running more data-mining queries on the large datasets it had accumulated.