Camil Demetrescu edited case-study.tex  over 8 years ago

Commit id: c7f7bb208936d49a957619c21215caa2e196768c

deletions | additions      

       

% !TEX root = article.tex  \section{Case Study} \section{Optimizing \feval\ in McVM}  \label{se:case-study}  In this section we show how \osrkit\ can be used in a production VM to implement aggressive optimizations for dynamic languages. We focus on MATLAB's \feval\ construct, a widely used built-in higher-order function that applies the function passed as first parameter to the remaining arguments (e.g., {\tt feval(g,x,y)} computes {\tt g(x,y)}). This feature is used in many classes of numerical computations that benefit from having functions as parameters. 

\ifdefined\fullver   The number of generated versions per function is on average small (i.e., less than two), as in most cases functions are always called with the same argument types.   \fi  \subsection{Optimizing \feval\ in McVM}  \label{ss:eval-opt-mcvm}  The main idea for optimizing a function $f$ containing an \feval\ instruction is to dynamically generate a variant $f'$ where the \feval$(g,...)$ is replaced by a direct call of the form $g(...)$. The key to efficiency is the ability to perform type inference on the IIR level,   Our approach to \feval\ optimization leverages OSR to generate specialized code where   capture run-time information  after porting it from the LLVM legacy JIT to MCJIT, we have extended it with the following components to enable the optimization of \feval\ instructions:  \begin{enumerate}  \item An analysis pass to identify optimization opportunities for \feval\ instructions in the IIR of a function.  \item An extension for the IIR compiler to track the correspondence between IIR and IR objects at \feval\ sites.  \item An inserter component to insert open OSR points in the IR for IIR locations annotated during the analysis pass.  \item An optimizer triggered at OSR points, which uses:  \begin{enumerate}  \item A profile-driven IIR generator to replace \feval\ calls with direct calls  \item A helper component to lower the optimized IIR function to IR and construct a state mapping   \item A code caching mechanism to handle the compilation of the continuation functions  \end{enumerate}  \end{enumerate}  We integrated our analysis pass in McVM's analysis manager. In particular, we group \feval\ instructions whose first argument is reached by the same definition, and for each group we mark for instrumentation only those instructions not dominated by others, so that the function can be optimized as early as possible at run-time.   \ifdefined \fullver  The analysis pass is also able to determine whether the value of the argument can change across two executions of the same \feval\ instruction, thus discriminating when a run-time guard must be inserted during the run-time optimization phase.  \else  The analysis pass also determines whether the value of the argument can change across two executions of the same \feval, and a run-time guard must thus be inserted during the optimization phase.  \fi  When the IIR compiler processes an annotated \feval\ instruction, it keeps track of the current {\em variable map} between IIR and IR objects, the {\tt llvm::BasicBlock*} created for the \feval, and the {\tt llvm::Value*} object used as its first argument. The last two elements are then used by the inserter component as basic block and {\tt val} argument in the open-OSR stub that invokes the optimizer component.  \newcommand{\gBase}{$g$}  \newcommand{\gOpt}{$g_{opt}$}  \newcommand{\gIIR}{$g^{IIR}$}  \newcommand{\gIR}{$g^{IR}$}  \newcommand{\gOptIIR}{$g^{IIR}_{opt}$}  \newcommand{\gOptIR}{$g^{IR}_{opt}$}  \subsection{Generating Optimized Code}  \label{sse:optimizing_feval}  The core of our optimization pipeline is the optimizer module, which is responsible for generating optimized code for the running function \gBase\ using contextual information passed by an open-OSR stub. As a first step, the optimizer inspects {\tt val} to resolve the target $f$ of the \feval\ and checks whether a previously compiled version of \gBase\ optimized for $f$ is available from the cache.  \ifdefined \fullver  If not, a new function \gOpt\ is generated by cloning the IIR representation \gIIR\ of \gBase\ into \gOptIIR\ and replacing all the \feval\ calls in the same group of the instrumented one with direct calls to $f$.  \else  If not, a new function \gOpt\ is generated by cloning the IIR representation \gIIR\ of \gBase\ into \gOptIIR\ and replacing all \feval\ calls to $f$ with direct calls.  \fi  As a next step, the optimizer asks the IIR compiler to process \gOptIIR\ and generate optimized LLVM IR \gOptIR, also storing the variable map between IIR and IR objects when compiling the direct call corresponding to the \feval\ instruction that triggered the OSR.  \ifdefined \fullver  \fi  This map is essential for the next step, which is constructing a state mapping between \gIR\ to \gOptIR, as it is compared against the corresponding map stored during the lowering of \gBase\ to determine whether for each value in \gOptIR\ live at the continuation block:  \begin{itemize}  \item an {\tt llvm::Value*} from \gIR\ passed as argument at the OSR point can be used directly  \item or, compensation code is required to reconstruct its value before jumping to the block.  \end{itemize}  \noindent In fact, since the type inference engine yields more accurate results for \gOptIIR\ compared to \gIIR, the IIR compiler can in turn generate efficient specialized IR code for representing and manipulating IIR variables, and compensation code is typically required to unbox or downcast some of the live values passed at the OSR point.  \ifdefined \fullver  Compensation code might also be required to materialize an IR object for an IIR variable that were previously accessed through get/set methods from the environment. %TODO  \fi  Once a state mapping object has been constructed, the optimizer calls our OSR library to generate the continuation function for the OSR transition and eventually compiles it. A pointer to the compiled function is stored in the code cache and returned to the stub, which invokes it through an indirect call passing the live state saved at the OSR point.