Camil Demetrescu  over 8 years ago

Commit id: 492bd6ae9af1075affd9b7b79ae4cd1d4052ac1f

deletions | additions      

       

% !TEX root = article.tex  \section{Introduction}  \label{se:intro}  The LLVM compiler infrastructure~\cite{lattner2004llvm} provides a Just-In-Time compiler called MCJIT that is currently being used for generating optimized code at run-time in virtual machines for dynamic languages. MCJIT is employed in both industrial and research projects, including Webkit's Javascript engine, the open-source Python implementation Pyston, the Rubinius project for Ruby, Julia for high-performance technical computing, McVM for MATLAB, CXXR for the R language, Terra for Lua, and the Pure functional programming language. The MCJIT compiler shares the same optimization pipeline with static compilers such as {\tt clang}, and it provides dynamic features such as native code loading and linking, as well as a customizable memory manager.  A piece that is currently missing in the environment MCJIT  is a feature to enable on-the-fly transitions between different versions of a running program's function. This feature is commonly known as On-Stack-Replacement (OSR) and is typically used in high-performance virtual machines, such as HotSpot and the Jikes RVM for Java, to interrupt a long-running function and recompile it at a higher optimization level. OSR can be a powerful tool for dynamic languages, for which most effective optimization decisions can typically be made only at run-time, when critical information such as type and shape of objects becomes available. In this scenario, OSR becomes useful also to perform deoptimization, i.e. when the running code has been speculatively optimized and the assumption used for the optimization does not hold anymore, the optimized function is interrupted and the execution continues in a safe version of the code. Currently VM builders using MCJIT are required to have a deep knowledge of the internals of LLVM in order to mimic a transitioning mechanism. In particular, they can rely on two experimental intrinsics, {\em Stackmap} and {\em Patchpoint}, to inspect the details of the compiled code generated by the back-end and to patch it manually with a sequence of assembly instructions. In particular, a Stackmap records the location of live values at a particular instruction address and during the compilation it is emitted into the object code within a designated section; a Patchpoint instead allows to reserve space at an instruction address for run-time patching and can be used to implement an inline caching mechanism~\cite{webkit14}.%~\cite{deutsch1984inlinecaching, webkit14}.  %In a 2013 paper  Lameed and Hendren propose McOSR~\cite{lameed2013modular}, a technique for OSR thatessentially  stores the live values in a global buffer, recompiles the current function function,  and then loads in it the saved state when the execution is resumed. Their approach shows a few limitations that we discusslater on  in this paper, \mysection\ref{se:osr-llvm},  and because of due to  some relevant design choices choices,  it canwork  only work  with the legacy JIT that has been dropped is no longer included in LLVM  sinceLLVM's  release 3.6. \paragraph{Contributions.}  In this paper we investigate general-purpose, target-independent implementations of on-stack replacement. Specific goals of our approach include:         

Our implementation targets two general scenarios: 1) {\em resolved OSR}: \fvariant\ is known before executing \fbase\ as in the deoptimization example discussed above; 2) {\em open OSR}: \fvariant\ is generated when the OSR is fired, supporting deferred and profile-guided compilation strategies. In both cases, \fbase\ is instrumented before its execution to incorporate the OSR machinery. We call such OSR-instrumented version \fosrfrom.  In the resolved OSR scenario (see \myfigure\ref{fi:overview-osr-final}), instrumentation consists of adding a check of the OSR condition and, if it is satisfied, a tail call that fires the OSR. The called function is an instrumented version of \fvariant, which we call \fosrto. The assumption is that \fosrto\ produces the same side-effects and return value that one would obtain by \fbase\ if no OSR was performed. Differently from \fvariant, \fosrto\ takes as input all live variables of \fbase\ at \osrpoint, executes an optional compensation code to fix the computation state ({\tt comp\_code}), and then jumps to a point \textsf{L'} from which execution can continue. The OSR practice often makes the conservative assumption that execution can always continue with the very same program state as the base function. However, this assumption may reduce the number of  %safe  points where OSR transitions can be fired. Supporting compensation code in our framework adds flexibility, allowing OSR transitions to happen at arbitrary places in the base function.  \ifdefined\noauthorea  \begin{figure}[t]  \begin{center} 

\end{figure}  \fi  In the resolved OSR scenario (see \myfigure\ref{fi:overview-osr-final}), instrumentation consists of adding a check of the OSR condition and, if it is satisfied, a tail call that fires the OSR. The called function is an instrumented version of \fvariant, which we call \fosrto. The assumption is that \fosrto\ produces the same side-effects and return value that one would obtain by \fbase\ if no OSR was performed. Differently from \fvariant, \fosrto\ takes as input all live variables of \fbase\ at \osrpoint, executes an optional compensation code to fix the computation state ({\tt comp\_code}), and then jumps to a point \textsf{L'} from which execution can continue. The OSR practice often makes the conservative assumption that execution can always continue with the very same program state as the base function. However, this assumption may reduce the number of  %safe  points where OSR transitions can be fired. Supporting compensation code in our framework adds flexibility, allowing OSR transitions to happen at arbitrary places in the base function.  \ifdefined\noauthorea  \begin{figure}[h!]  \begin{center} 

\end{figure}  \fi  \noindent  The open OSR scenario is similar, with one main difference (see \myfigure\ref{fi:overview-osr-open}): instead of calling \fosrto\ directly, \fosrfrom\ calls a stub function \fstub, which first creates \fosrto\ and then calls it. Function \fosrto\ is generated by a function {\tt gen} that takes the base function \fbase\ and the OSR point \osrpoint\ as input. The reason for having a stub in the open OSR scenario, rather than directly instrumenting \fbase\ with the code generation machinery, is to minimize the extra code injected into \fbase. Indeed, instrumentation may interfere with optimizations, e.g., by increasing register pressure and altering code layout and instruction cache behavior. \paragraph{Discussion.}