Bhathiya edited Section_Apache_Derby_Architecture_Before__.tex about 8 years ago

Commit id: 9fab7198934c88ae106f1bb235450f0b2cbe8328

Before going into the details of the Derby optimizer, let's look at the Apache Derby architecture. In its module view, Derby is a system made up of a monitor and a collection of modules. The monitor is the code that maps module requests to implementations based on the request and the environment. For example, when an internal request for a JDBC driver is made under JDK 1.3, the monitor selects Derby's JDBC 2.0 implementation, while under JDK 1.4 it selects the JDBC 3.0 implementation. This allows Derby to present a single JDBC driver to the application regardless of the JDK, while the correct driver is loaded internally.

A module in Derby is a discrete set of functionality, such as a lock manager, a JDBC driver, or an indexing method. A module's interface is typically defined by a set of Java interfaces; for example, the java.sql interfaces define the interface for a JDBC driver. Callers use a module purely through its interface, which separates the API from the implementation. A module's implementation is the set of classes that implement the required behavior and interfaces, so an implementation can change or be replaced with a different one without affecting the callers' code. Modules are either system-wide (shared) or per-service, where a service corresponds to a database.
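
The monitor idea can be illustrated with a small sketch. It is not Derby's actual monitor code: the names MonitorSketch, DriverModule, Monitor and findDriver are hypothetical, and the JDK level is encoded as a plain integer (13 for JDK 1.3, 14 for JDK 1.4) purely for illustration. The sketch only shows the principle that callers ask for a module by its interface and the monitor picks an implementation that fits the environment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class MonitorSketch {

    /** Hypothetical module interface; callers depend only on this type. */
    interface DriverModule {
        String describe();
    }

    /** Two interchangeable implementations of the same module interface. */
    static final class Jdbc20Driver implements DriverModule {
        public String describe() { return "JDBC 2.0 implementation"; }
    }

    static final class Jdbc30Driver implements DriverModule {
        public String describe() { return "JDBC 3.0 implementation"; }
    }

    /** Hypothetical monitor: maps a request plus the environment to an implementation. */
    static final class Monitor {

        /** A registered implementation and the lowest JDK level it requires. */
        private static final class Candidate {
            final int minJdkLevel;
            final Supplier<DriverModule> factory;

            Candidate(int minJdkLevel, Supplier<DriverModule> factory) {
                this.minJdkLevel = minJdkLevel;
                this.factory = factory;
            }
        }

        private final List<Candidate> candidates = new ArrayList<>();

        void register(int minJdkLevel, Supplier<DriverModule> factory) {
            candidates.add(new Candidate(minJdkLevel, factory));
        }

        /** Returns the most recent implementation that the given JDK level supports. */
        DriverModule findDriver(int jdkLevel) {
            Candidate best = null;
            for (Candidate c : candidates) {
                boolean usable = c.minJdkLevel <= jdkLevel;
                if (usable && (best == null || c.minJdkLevel > best.minJdkLevel)) {
                    best = c;
                }
            }
            if (best == null) {
                throw new IllegalStateException("no driver implementation for JDK level " + jdkLevel);
            }
            return best.factory.get();
        }
    }

    /** Assumed registration: 13 stands for JDK 1.3, 14 for JDK 1.4. */
    static Monitor defaultMonitor() {
        Monitor monitor = new Monitor();
        monitor.register(13, Jdbc20Driver::new);
        monitor.register(14, Jdbc30Driver::new);
        return monitor;
    }

    public static void main(String[] args) {
        Monitor monitor = defaultMonitor();
        // The caller never names a concrete class; it only sees DriverModule.
        System.out.println(monitor.findDriver(13).describe());
        System.out.println(monitor.findDriver(14).describe());
    }
}
```

Because the caller holds only a DriverModule reference, registering a different implementation changes what is loaded without touching the calling code, which is the separation of interface and implementation described above.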

The generated statement plan is cached and can be shared by multiple connections. DDL statements (e.g. CREATE TABLE) use a common statement plan to avoid generating a Java class file. This implementation was driven by the original goal of a small footprint: using the JVM's interpreter was thought to require less code than having an internal one. It does mean that the first few times a statement plan is executed, it is interpreted. After a number of executions, the Just-In-Time (JIT) compiler will decide to compile it into native code, so performance tests will see a boost after a number of iterations. In addition, calls into user-supplied Java methods (functions and procedures) are direct rather than made through reflection.

SQL execution consists of calling execute methods on an instance of the generated class, which return a result set object. This result set is a Derby ResultSet class, not a JDBC one; the JDBC layer presents the Derby ResultSet to the application as a JDBC one. For a simple table scan, the query consists of a single result set object representing the table scan. For a more complex query, the top-level result set ``hides'' a tree of result sets that corresponds to the query. DML statements (INSERT/UPDATE/DELETE) are handled the same way, with a ResultSet that performs all of its work in its open method and returns an update count. These result set objects interface with the Store layer to fetch rows from tables and indexes or to perform sorts.

The Store layer of the Derby architecture is split into two main areas, access and raw. The access layer presents a conglomerate (table or index) and row based interface to the SQL layer. It handles table scans, index scans, index lookups, indexing, sorting, locking policies, transactions, and isolation levels. The access layer sits on top of the raw store, which provides the raw storage of rows in pages in files, transaction logging, and transaction management. JCE encryption is plugged in here at the page level.
The raw store works with a pluggable file system API that allows the data files to be stored in the Java file system, jar files, jar files in the classpath, or any other mechanism.
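
The tree of result sets described above can be pictured with a minimal sketch. None of the names below are Derby classes: RowSource, TableScan, Restrict and DeleteRows are hypothetical stand-ins, and an in-memory list is assumed in place of the Store layer. The sketch shows a top-level node hiding a child scan, and a DML-style node that does all of its work in its open method and then reports an update count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ResultSetTreeSketch {

    /** Hypothetical engine-level result set; not java.sql.ResultSet. */
    interface RowSource {
        void open();
        Object[] nextRow();   // returns null when no rows remain
        void close();
    }

    /** Leaf node: scans an in-memory list that stands in for the store layer. */
    static final class TableScan implements RowSource {
        private final List<Object[]> table;
        private int position;

        TableScan(List<Object[]> table) { this.table = table; }

        public void open() { position = 0; }

        public Object[] nextRow() {
            return position < table.size() ? table.get(position++) : null;
        }

        public void close() { }
    }

    /** Inner node: pulls rows from its child and keeps those matching a predicate. */
    static final class Restrict implements RowSource {
        private final RowSource child;
        private final Predicate<Object[]> predicate;

        Restrict(RowSource child, Predicate<Object[]> predicate) {
            this.child = child;
            this.predicate = predicate;
        }

        public void open() { child.open(); }

        public Object[] nextRow() {
            Object[] row;
            while ((row = child.nextRow()) != null) {
                if (predicate.test(row)) {
                    return row;
                }
            }
            return null;
        }

        public void close() { child.close(); }
    }

    /** DML-style node: does all of its work in open() and exposes an update count. */
    static final class DeleteRows implements RowSource {
        private final List<Object[]> table;
        private final Predicate<Object[]> predicate;
        private int updateCount;

        DeleteRows(List<Object[]> table, Predicate<Object[]> predicate) {
            this.table = table;
            this.predicate = predicate;
        }

        public void open() {
            int before = table.size();
            table.removeIf(predicate);
            updateCount = before - table.size();
        }

        public Object[] nextRow() { return null; }   // a delete produces no rows

        public void close() { }

        int updateCount() { return updateCount; }
    }

    /** Small mutable sample table: (id, name). */
    static List<Object[]> sampleTable() {
        List<Object[]> table = new ArrayList<>();
        table.add(new Object[] {1, "ant"});
        table.add(new Object[] {2, "bee"});
        table.add(new Object[] {3, "cat"});
        return table;
    }

    /** Opens a result set, collects every row, and closes it again. */
    static List<Object[]> drain(RowSource source) {
        List<Object[]> rows = new ArrayList<>();
        source.open();
        try {
            Object[] row;
            while ((row = source.nextRow()) != null) {
                rows.add(row);
            }
        } finally {
            source.close();
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Object[]> table = sampleTable();
        // The caller sees only the top-level node; the scan underneath is hidden.
        RowSource query = new Restrict(new TableScan(table), row -> ((Integer) row[0]) > 1);
        for (Object[] row : drain(query)) {
            System.out.println(Arrays.toString(row));
        }
        DeleteRows delete = new DeleteRows(table, row -> ((Integer) row[0]) == 2);
        delete.open();
        System.out.println("update count = " + delete.updateCount());
    }
}
```

A layer playing the role of the JDBC layer would wrap the top-level RowSource and expose it through java.sql.ResultSet, so the application never sees the tree underneath.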