\subsection{Comparison to previous approaches}
\label{ss:prev-eval-sol}

Lameed and Hendren~\cite{lameed2013feval} proposed two dynamic techniques for optimizing \feval\ instructions in McVM: {\em JIT-based} and {\em OSR-based} specialization. Both attempt to optimize a function $f$ that contains instructions of the form \feval$(g,...)$, leveraging information about $g$ and the type of its arguments observed at run-time. The optimization produces a specialized version $f'$ where \feval$(g,x,y,z...)$ instructions are replaced with direct calls of the form $g(x,y,z,...)$.

The two approaches differ in the point at which code specialization is performed. In JIT-based specialization, $f'$ is generated when $f$ is called. In contrast, the OSR-based method interrupts $f$ as it executes, generates a specialized version $f'$, and resumes execution from it. OSR-based specialization is more general than JIT-based specialization, as the latter only works if the \feval\ argument $g$ is one of the parameters of $f$.

Another technical difference between the two approaches, which has substantial performance implications, is the representation level at which optimization occurs. When a function $f$ is first compiled from MATLAB to IR by McVM, the functions $g$ it calls via \feval\ are unknown and the type inference engine is unable to infer the types of their returned values. Hence, these values must be kept boxed in heap-allocated objects and handled with slow generic instructions in the IR representation of $f$ (suitable for handling different types). The JIT-based method works on the IIR representation of $f$ and can resort to the full power of type inference to infer the types of the values returned by $g$, turning the slow generic instructions of $f$ into fast type-specialized instructions in $f'$. On the other hand, OSR-based specialization operates on the IR representation of $f$, which prevents the optimizer from exploiting type inference: the slow generic instructions of $f$ are thus inherited by the specialized continuation $f'$. As a consequence, for $f'$ to be sound, the direct call to $g$ must be guarded by a condition that checks whether $g$ and the types of its parameters remain the same as observed at the time when $f$ was interrupted. If the guard fails, the code falls back to executing the original \feval\ instruction. Guard computation can also be rather expensive, as it may require checking many parameters.

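To make the guard mechanism concrete, the following Python sketch mimics a loop whose \feval-like call site has been specialized into a guarded direct call with a generic fallback. It is an illustration of the idea only, not McVM code, and all identifiers (\texttt{cached\_g}, \texttt{cached\_arg\_type}, \texttt{make\_specialized\_f}) are hypothetical.

\begin{verbatim}
# Illustrative sketch (not McVM code): a guarded, specialized version of a
# loop that originally performed a generic feval-style call on every
# iteration. All names are hypothetical.

def generic_f(g, xs):
    acc = 0.0
    for x in xs:
        acc += g(x)                  # plays the role of feval(g, x)
    return acc

def make_specialized_f(cached_g, cached_arg_type):
    # Specialized continuation: the direct call to cached_g is guarded by
    # a check that the callee and the argument type still match what was
    # observed when the specialization was produced.
    def specialized_f(g, xs):
        acc = 0.0
        for x in xs:
            if g is cached_g and type(x) is cached_arg_type:
                acc += cached_g(x)   # fast path: direct call
            else:
                acc += g(x)          # slow path: generic feval-like call
        return acc
    return specialized_f
\end{verbatim}

The per-iteration checks in \texttt{specialized\_f} correspond to the guard discussed above: they are the price OSR-based specialization pays for operating at IR level, where argument types cannot be established statically.
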
Our approach combines the flexibility of OSR-based specialization with the efficiency of JIT-based specialization, answering an open question raised by Lameed and Hendren~\cite{lameed2013feval}. Similarly to OSR-based specialization, it does not place restrictions on the functions that can be optimized. On the other hand, it works at IIR (rather than IR) level as in JIT-based specialization, which makes it possible to perform type inference on the specialized code. Working at IIR level eliminates the two main sources of inefficiency of OSR-based specialization: 1) we can replace generic instructions with specialized instructions, and 2) the types of $g$'s arguments do not need to be cached or guarded, as they are statically inferred.

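Continuing the hypothetical sketch above, working at IIR level roughly corresponds to compiling a continuation in which the argument types have already been fixed by type inference, so the per-iteration type checks disappear; how the callee itself is validated is a detail this illustration deliberately leaves out.

\begin{verbatim}
# Hypothetical counterpart of the previous sketch at "IIR level": type
# inference has already established the argument types for the direct
# call, so the specialized body needs no per-iteration type guards.
def specialized_f_iir(cached_g, xs):
    acc = 0.0
    for x in xs:
        acc += cached_g(x)       # direct, type-specialized call
    return acc
\end{verbatim}
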

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% extended description of the two techniques (full version only)

\ifdefined\fullver
\paragraph{JIT-based specialization.} This technique relies on value-based JIT compilation: when the first argument of an \feval\ call is a parameter of the enclosing function, the compiler replaces each call to this function in all of its callers with a call to a special dispatcher. At run-time, the dispatcher evaluates the value of the argument to be used for the \feval\ and either executes previously compiled cached code or generates and JIT-compiles a version of the function optimized for the current value. This technique is less general, as it only applies when the \feval\ target is a parameter of the enclosing function.

\paragraph{OSR-based specialization.} This approach is based on the McOSR library~\cite{lameed2013modular}: \feval\ calls inside loops are instrumented with an OSR point and with profiling code that caches the last-known types for the arguments of each \feval\ instruction. An OSR is triggered in a function $f$ when a loop containing an \feval\ becomes hot. At that time, a code generator produces an optimized version $f'$ of $f$ and control is diverted to $f'$. $f'$ is created by an optimizer that attempts to replace an \feval$(g,x,y,z...)$ instruction with a direct call $g(x,y,z,...)$ with unboxed parameters. The optimizer works at IR level and uses the actual values of $g$ and its arguments (known at OSR time) to specialize the code. The direct call to $g$ is guarded by a condition that checks whether $g$ and the types of its parameters remain the same as observed when the OSR was fired; if the guard fails, the code falls back to a slow path executing the original \feval\ instruction. This method overcomes the main limitation of JIT-based specialization, supporting optimization of \feval$(g,...)$ calls in functions that do not receive $g$ as a parameter.
\fi
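\ifdefined\fullver
As a rough illustration of value-based dispatching, the Python sketch below keys a cache of specialized versions on the function value passed for the \feval\ target; the names (\texttt{dispatcher}, \texttt{compile\_specialized}) are ours and the ``compilation'' step is only simulated.

\begin{verbatim}
# Hypothetical sketch of a value-based dispatcher (all names are ours).
# Callers of the original function are assumed to have been redirected here.

_cache = {}

def compile_specialized(g):
    # Stand-in for JIT-compiling a version of f in which feval(g, ...)
    # has been replaced by a direct call to g.
    def f_for_g(xs):
        return [g(x) for x in xs]
    return f_for_g

def dispatcher(g, xs):
    f_opt = _cache.get(g)
    if f_opt is None:                   # first time this value of g is seen
        f_opt = compile_specialized(g)  # "compile" and cache it
        _cache[g] = f_opt
    return f_opt(xs)                    # reuse the cached specialized code
\end{verbatim}
\fi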