\section{Related Work}
\label{se:related}

\paragraph{Early Approaches.} OSR was pioneered in the SELF language~\cite{holzle1992self} to enable source-level debugging of optimized code, which requires deoptimizing the code back to the original version. To reconstruct the source-level state, the compiler generates {\em scope descriptors} recording, for each scope, the locations or values of its arguments and locals. Execution can be interrupted only at certain interrupt points (i.e., method prologues and backward branches in loops) where its state is guaranteed to be consistent, which leaves the compiler free to optimize the code between interrupt points. Also worth mentioning is the {\em deferred compilation} mechanism~\cite{chambers1991self} implemented in SELF for branches that are unlikely to be taken at run-time: the system generates a stub that invokes the compiler to produce a code object able to reuse the stack frame of the original code.

\paragraph{Java Virtual Machines.} The success of the Java language has drawn more attention to the design and implementation of OSR techniques, as bytecode interpreters began to work alongside JIT compilers. In the high-performance HotSpot Server JVM~\cite{paleczny2001hotspot}, performance-critical methods are identified using method-entry and backward-branch counters; when the OSR threshold is reached, the runtime transfers execution from the interpreter frame to an OSR frame and the compiled code. Deoptimization is performed when class loading invalidates inlining or other optimization decisions: execution is rolled forward to a safe point, at which the native frame is converted into an interpreter frame.

Whaley~\cite{whaley2001osr} proposes a technique to identify intra-method code regions that are rarely executed, so that a method can be compiled and optimized without them. Each rare block is replaced by a stub that transfers control to the interpreter, while a glue routine reconstructs the interpreter state from a table recording the location, in registers or memory, of each variable in the original bytecode.
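For concreteness, the C++ sketch below shows one possible shape for such a location table and for the glue routine that consumes it; the names and layout are purely illustrative and are not taken from~\cite{whaley2001osr}.

\begin{verbatim}
// Hedged sketch: a per-variable location table that a glue routine
// could consult to rebuild the interpreter state when a rare-block
// stub fires. Names and layout are illustrative only.
#include <cstdint>
#include <vector>

enum class LocKind { Register, StackSlot, Constant };

struct VarLocation {
    int      bytecodeIndex;   // variable slot in the original bytecode
    LocKind  kind;            // where the optimized code keeps it
    int      regOrOffset;     // register number or frame offset
    int64_t  constValue;      // used only when kind == Constant
};

using LocationTable = std::vector<VarLocation>;  // one table per stub

// Copy each value into the interpreter's local-variable array,
// reading it from the saved registers or from the native frame.
void reconstructInterpreterState(const LocationTable& table,
                                 const uint64_t* regs,
                                 const uint8_t* frame,
                                 int64_t* interpLocals) {
    for (const VarLocation& v : table) {
        switch (v.kind) {
        case LocKind::Register:
            interpLocals[v.bytecodeIndex] = (int64_t)regs[v.regOrOffset];
            break;
        case LocKind::StackSlot:
            interpLocals[v.bytecodeIndex] =
                *(const int64_t*)(frame + v.regOrOffset);
            break;
        case LocKind::Constant:
            interpLocals[v.bytecodeIndex] = v.constValue;
            break;
        }
    }
}
\end{verbatim}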

% TODO only if we have space: HotpathVM

\paragraph{LLVM.} Prospect~\cite{susskraut2010prospect} is a framework for parallelizing a sequential application by running a fast variant and a slow variant of its code. The IR is instrumented statically through two LLVM passes to enable switching between the two variants at run-time; helper methods save and eventually restore registers, while stack-local variables are placed on a separate {\tt alloca} stack rather than on the stack frame, so that the two variants end up with similar and thus interchangeable stack layouts. Speculative variables~\cite{susskraut2009speculation} are introduced when the slow variant needs to track state (e.g., information for out-of-bound checks) that is missing in the fast variant. Switching operations are performed by Prospect at user-specified checkpoints in the original code.

McOSR~\cite{lameed2013modular} is a library for LLVM's legacy JIT compiler that inserts OSR points at loop headers. When an OSR transition fires, the live state is saved into a set of global variables (one per live variable) and a helper method is invoked to modify the IR of the function, using a code transformer provided by the front-end and a copy of the original code saved as a control version. The library generates a new entrypoint for the function that checks a global condition to discriminate whether the function is being invoked through an OSR transition or through a regular call: in the first case, the values of the live variables are read from the associated global variables and execution jumps to the block where it has to resume. The SSAUpdater component of LLVM is then used to restore the SSA form after the update. When the helper method returns, the updated function is invoked, so the OSR continues in a new stack frame. Once the updated function returns, a second helper method recompiles it to remove the entrypoint inserted for the OSR transition, as this entrypoint can disrupt LLVM optimizations and lead to poorer performance on subsequent invocations of the function.
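The dispatch performed by the new entrypoint can be pictured with the C++ sketch below; McOSR actually operates on LLVM IR, and all names here are invented for illustration.

\begin{verbatim}
// Hedged sketch of the entrypoint logic that a McOSR-style library
// encodes in LLVM IR, written as C++ for readability. Invented names.
#include <cstdint>

// One global per live variable, plus a flag set by the OSR machinery.
static bool    g_osr_pending = false;
static int64_t g_live_i      = 0;
static int64_t g_live_sum    = 0;

int64_t hot_loop(int64_t n) {
    int64_t i = 0, sum = 0;
    // Extra entrypoint check: regular call or OSR re-entry?
    if (g_osr_pending) {
        g_osr_pending = false;
        i   = g_live_i;     // reload live state saved by the old version
        sum = g_live_sum;
    }
    // Resume (or start) execution at the loop header.
    for (; i < n; ++i) {
        sum += i;
        // In the original version, the loop header would also test the
        // OSR condition, spill i and sum to the globals, and call the
        // helper that transforms the IR and re-invokes the function.
    }
    return sum;
}
\end{verbatim}

When the helper re-invokes the updated function, the check succeeds and execution continues from the loop header in a fresh stack frame, matching the description above; the second helper later removes this extra check by recompiling the function.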
\paragraph{Other Related Work.} Dynamic Software Updating (DSU) is a methodology for updating programs while they run, which makes it useful for systems that cannot afford to halt service. DSU techniques (e.g.,~\cite{neamtiu2006dsu,makris2009dsu}) are required to update all functions active on the call stack at the same time, so their code has to be instrumented and the involved data types wrapped to support future extensions.

In tracing JIT compilers, deoptimization techniques are used to safely leave an optimized trace when a guard fails and to continue the execution in the interpreter or with a different piece of code. SPUR~\cite{bebenita2010spur} is a trace-based JIT compiler for Microsoft's Common Intermediate Language (CIL) with four levels of JIT-ting: profiling, tracing, optimizing, and transfer-tail; the transfer-tail JIT bridges execution from an instruction in a tracing- or optimizing-mode block to the next safe point for deoptimization to profiling code. In RPython, guards are implemented as conditional jumps to a trampoline that analyzes resume information for the guard and executes compensation code to leave the trace. Resume data is compactly encoded by sharing parts of the data structure between subsequent guards~\cite{schneider2012rpython}; a similar approach is used in LuaJIT, where sparse snapshots are taken to enable state restoration when leaving a trace~\cite{luajit}.
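To see why sharing pays off, the C++ sketch below delta-encodes each guard's resume data against that of the previous guard, so that restoring the interpreter frame amounts to walking a chain of small deltas; the structures are ours and only illustrate the idea, not RPython's or LuaJIT's actual formats.

\begin{verbatim}
// Hedged sketch of delta-encoded resume data shared between
// consecutive guards. Structures and names are invented.
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

struct ResumeData {
    const ResumeData* prev = nullptr;              // previous guard's data
    std::vector<std::pair<int, int64_t>> changed;  // (frame slot, value)
};

// Rebuild the full interpreter frame for a failing guard by replaying
// the chain from the oldest snapshot to the newest delta.
std::unordered_map<int, int64_t> restoreFrame(const ResumeData* rd) {
    if (rd == nullptr) return {};
    std::unordered_map<int, int64_t> frame = restoreFrame(rd->prev);
    for (const auto& entry : rd->changed)
        frame[entry.first] = entry.second;         // newer deltas win
    return frame;
}
\end{verbatim}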