
dcdelia use enumitem in \noauthorea over 8 years ago

Commit id: 44eeaa97e311a83014999968218eff577a1ebc7c


\usepackage{textcomp} \usepackage{longtable} \usepackage{multirow,booktabs} \usepackage{enumitem} \newcommand{\noauthorea}{} \input{header}

\item An extension for the IIR compiler to track the correspondence between IIR and IR objects at \feval\ sites
\item An inserter component to insert OSR points in the IR for IIR locations annotated during the analysis pass
\item An optimizer module triggered at OSR points, which in turn consists of:
\ifdefined \noauthorea
\begin{enumerate}[noitemsep]
\else
\begin{enumerate}
\fi
\item A profile-driven IIR generator to replace \feval\ calls with direct calls
\item A helper component to lower the optimized IIR function to IR and construct a state mapping
\item A code caching mechanism to handle the compilation of the continuation functions
\end{enumerate}
\end{enumerate}
We integrated our analysis pass in McVM's analysis manager. In particular, we group \feval\ instructions whose first argument is reached by the same definition, and for each group we mark for instrumentation only the instructions not dominated by others, so that the function can be optimized as early as possible at run-time. The analysis pass can also determine whether the value of the argument can change across two executions of the same \feval\ instruction, and thus whether a run-time guard must be inserted during the run-time optimization phase.
Compared to the OSR-based approach by Lameed and Hendren, our solution is cheaper because the types for the other arguments do not need to be cached or guarded: as we will see later on, the type inference engine will compute the most accurate yet sound type information in the analysis of the optimized IIR where direct calls are used.
When the IIR compiler processes an annotated \feval\ instruction, it stores in the metadata of the function version being compiled the current variable map (i.e., a map between IIR and IR objects), the {\tt llvm::BasicBlock*} created for the \feval\ and the {\tt llvm::Value*} object corresponding to the first argument for the \feval. 
The last two objects are used by the inserter component as source label and {\tt val} argument for inserting an open OSR point. The open-OSR stub will in turn invoke the callback optimizer component we are about to present.
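
The selection rule used by the analysis pass (group \feval\ sites by the definition reaching their first argument, then keep only the sites not dominated by another site of the same group) can be illustrated with a minimal, self-contained C++ sketch. It does not reproduce McVM's code: the {\tt FevalSite} record, the {\tt DomRelation} set of pairs and the {\tt selectSitesToInstrument} function are hypothetical names introduced only for illustration, and the dominance relation is assumed to be precomputed.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified view of an feval site; McVM's IIR classes differ.
struct FevalSite {
    int id;          // identifier of the feval instruction
    int reachingDef; // identifier of the definition reaching its first argument
};

// Assumed to be precomputed: contains (a, b) when site a dominates site b.
using DomRelation = std::set<std::pair<int, int>>;

// Returns the ids of the sites to mark for instrumentation.
std::vector<int> selectSitesToInstrument(const std::vector<FevalSite>& sites,
                                         const DomRelation& dominates) {
    // Group sites by the definition that reaches their first argument.
    std::map<int, std::vector<int>> groups;
    for (const FevalSite& s : sites) {
        groups[s.reachingDef].push_back(s.id);
    }

    std::vector<int> selected;
    for (const auto& entry : groups) {
        const std::vector<int>& members = entry.second;
        for (int candidate : members) {
            // A site is skipped when another site of its group dominates it.
            bool dominated = false;
            for (int other : members) {
                if (other != candidate &&
                    dominates.count(std::make_pair(other, candidate)) > 0) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated) {
                selected.push_back(candidate);
            }
        }
    }
    return selected;
}
```

In this sketch, a group whose sites all lie on one dominance chain yields a single instrumented site, namely the earliest one, which matches the goal of optimizing the function as early as possible.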

\noindent Our approach combines the flexibility of OSR-based specialization with the efficiency of JIT-based specialization, answering an open question raised by Lameed and Hendren~\cite{lameed2013feval}. Indeed, [...] {\tt [Daniele --> text moved from case-study.tex]} Compared to the OSR-based approach by Lameed and Hendren, our solution is cheaper because the types for the other arguments do not need to be cached or guarded: as we will see later on, the type inference engine will compute the most accurate yet sound type information in the analysis of the optimized IIR where direct calls are used. \ifdefined\fullver The first one is based on OSR: using the McOSR library~\cite{lameed2013modular}, \feval\ calls inside loops are instrumented with an OSR point and profiling code to cache the last-known types for the arguments of each \feval\ instruction. When an OSR is fired at run-time, a code generator modifies the original function by inserting a guard to choose between a fast path containing a direct call and a slow path with the original \feval\ call. The second technique is less general and uses value-based JIT compilation: when the first argument of an \feval\ call is an argument of the enclosing function, the compiler replaces each call to this function in all of its callers with a call to a special dispatcher. At run-time, the dispatcher evaluates the value of the argument to use for the \feval\ and either executes previously compiled cached code or generates and JIT-compiles a version of the function optimized for the current value.

Lameed and Hendren conclude their paper by stating, ``It would be interesting to look at future work that combine the strengths of both approaches''. In the remaining part of this section, we extend McVM by implementing a novel optimization mechanism for \feval\ based on our OSR technique: we will show that our mechanism is as efficient as their JIT-based approach in terms of quality of generated code, and is even more general than their OSR-based approach, as it can also optimize \feval\ calls not enclosed in a loop. \fi
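
The value-based dispatching scheme described above (look up a version specialized for the current value of the \feval\ argument, and compile one only on a cache miss) can be sketched as follows. This is an illustrative C++ fragment under stated assumptions, not the code of McVM or of Lameed and Hendren: {\tt FevalDispatcher}, {\tt CompiledCode} and the {\tt compile} callback are hypothetical names, and the callback merely stands in for a JIT compiler.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a compiled, specialized function version.
using CompiledCode = std::function<double(double)>;

class FevalDispatcher {
public:
    // compile is an assumed callback playing the role of the JIT compiler:
    // given the name of the feval target, it returns specialized code.
    explicit FevalDispatcher(
        std::function<CompiledCode(const std::string&)> compile)
        : compile_(std::move(compile)) {}

    // Runs the version specialized for target, compiling it on a cache miss.
    double call(const std::string& target, double arg) {
        auto it = cache_.find(target);
        if (it == cache_.end()) {
            ++compilations_;
            it = cache_.emplace(target, compile_(target)).first;
        }
        return it->second(arg);
    }

    // Number of times the compile callback has been invoked so far.
    int compilations() const { return compilations_; }

private:
    std::function<CompiledCode(const std::string&)> compile_;
    std::unordered_map<std::string, CompiledCode> cache_;
    int compilations_ = 0;
};
```

Repeated calls with the same target reuse the cached version, so in this sketch the compilation cost is paid once per distinct value of the \feval\ argument.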