% prev-eval-sol.tex
...
\subsection{Comparison to previous approaches}
\label{ss:prev-eval-sol}
Lameed and Hendren~\cite{lameed2013feval} proposed two dynamic techniques for optimizing \feval\ instructions in McVM: {\em JIT-based} and {\em OSR-based} specialization. Both attempt to optimize a function $f$ that contains instructions of the form
\feval$(g,x,y,z...)$, leveraging information about $g$ and the types of its arguments observed at run-time. The optimization produces a specialized version $f'$ in which \feval$(g,x,y,z...)$ instructions are replaced with direct calls of the form $g(x,y,z,...)$. The two approaches differ in the point at which code specialization is performed.
In JIT-based specialization, $f'$ is generated when $f$ is called. In contrast,
the OSR-based
specialization method interrupts $f$ as it executes, generates a specialized version $f'$, and resumes from it.
OSR-based specialization is more general than JIT-based specialization, as the latter only works if the \feval\ argument $g$ is one of the parameters of $f$.
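The effect of \feval\ specialization can be sketched in Python (a hypothetical analog of our own devising, not McVM code; names such as \texttt{specialize} are illustrative):

```python
# Hypothetical Python analog of feval specialization (not McVM code).
# feval(g, ...) performs an indirect call; the specializer rewrites it
# into a direct call to the g observed at run-time, guarded by a check.

def feval(g, *args):
    # Generic dispatch: every call goes through the slow indirect path.
    return g(*args)

def f(g, xs):
    # Original function: each iteration pays the feval overhead.
    return [feval(g, x) for x in xs]

def specialize(observed_g):
    # Build a specialized version f' for the g observed at run-time:
    # the feval is replaced by a direct call to that particular g.
    def f_specialized(g, xs):
        if g is observed_g:                      # guard: same g as observed
            return [observed_g(x) for x in xs]   # direct call
        return [feval(g, x) for x in xs]         # fallback: original feval
    return f_specialized

square = lambda x: x * x
f_prime = specialize(square)
assert f_prime(square, [1, 2, 3]) == [1, 4, 9]      # fast path taken
assert f_prime(abs, [-1, -2]) == f(abs, [-1, -2])   # guard fails, fallback
```

The sketch captures only the call-rewriting aspect; where the two techniques differ is when \texttt{specialize} runs and which representation it operates on.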
Another technical difference between the two approaches, which has substantial performance implications, is the representation level at which optimization occurs. When a function $f$ is first compiled from MATLAB to IR by McVM, the functions $g$ it calls via \feval\ are unknown and the type inference engine is unable to infer the types of their returned values. Hence, these values must be kept boxed in heap-allocated objects and handled with slow generic instructions in the IR representation of $f$ (suitable for handling different types). The JIT method works on the IIR representation of $f$ and can resort to the full power of type inference to infer the types of the returned values of $g$, turning the slow generic instructions of $f$ into fast type-specialized instructions in $f'$. On the other hand, OSR-based specialization operates on the IR representation of $f$, which prevents the optimizer from exploiting type inference. As a consequence, for $f'$ to be sound, the direct call to $g$ must be guarded by a condition that checks whether $g$ and the types of its parameters remain the same as observed at the time when $f$ was interrupted. If the guard fails, the code falls back to executing the original \feval\ instruction.

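The soundness guard protecting the direct call can be sketched as follows (an illustrative Python analog, not McVM code; the cached $g$ and types stand for the values observed when $f$ was interrupted):

```python
# Illustrative analog of the guard emitted for f' (hypothetical Python;
# McVM emits this check in its IR, where it also guards unboxed accesses).

def make_guarded_call(cached_g, cached_types):
    """Return a callable that uses the direct, specialized call only when
    g and the argument types match those observed at interruption time."""
    def call(g, *args):
        if g is cached_g and tuple(type(a) for a in args) == cached_types:
            return cached_g(*args)   # fast path: direct call to cached g
        return g(*args)              # guard failed: generic feval fallback
    return call

call = make_guarded_call(max, (int, int))
assert call(max, 3, 5) == 5         # guard holds: direct call
assert call(min, 3.0, 5.0) == 3.0   # different g and types: fallback path
```

Note that the guard must check both the callee's identity and the argument types: a matching $g$ invoked with differently-typed arguments would invalidate the type assumptions baked into the specialized code.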
\noindent Our approach combines the flexibility of OSR-based specialization with the efficiency of JIT-based specialization, answering an open question raised by Lameed and Hendren~\cite{lameed2013feval}. Like OSR-based specialization, it places no restrictions on the functions that can be optimized. On the other hand, it works at the IIR (rather than IR) level, as JIT-based specialization does, which makes it possible to perform type inference on the specialized code. Working at the IIR level eliminates the two main sources of inefficiency of OSR-based specialization: 1) we can replace generic instructions with specialized instructions, and 2) the types of $g$'s arguments do not need to be cached or guarded, as they are statically inferred.
...
%Furthermore, our solution is cheaper because the types for the other arguments do not need to be cached or guarded: as we will see later on, the type inference engine will compute the most accurate yet sound type information in the analysis of the optimized IIR where direct calls are used.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% remaining stuff
\ifdefined\fullver
\paragraph{JIT-based specialization.}
\paragraph{OSR-based specialization.} This approach triggers an OSR in a function $f$ when a loop containing an \feval\ becomes hot. At that time, an optimized version $f'$ of $f$ is generated and control is diverted to $f'$. $f'$ is created by an optimizer that attempts to replace an \feval$(g,x,y,z...)$ instruction with a direct call $g(x,y,z,...)$ with unboxed parameters. The optimizer works at IR level and uses the actual values of $g$ and its arguments (known at OSR time) to specialize the code. The direct call to $g$ is guarded by a condition that checks whether $g$ and the types of its parameters remain the same as observed when the OSR was fired. If the guard fails, the code falls back to executing the original \feval\ instruction.
This method overcomes the limitations of JIT-based specialization, supporting optimization of \feval$(g,...)$ calls in functions that do not receive $g$ as a parameter.
\fi
\ifdefined\fullver
The first one is based on OSR: using the McOSR library~\cite{lameed2013modular}, \feval\ calls inside loops are instrumented with an OSR point and profiling code that caches the last-known types for the arguments of each \feval\ instruction. When an OSR is fired at run-time, a code generator modifies the original function by inserting a guard to choose between a fast path containing a direct call and a slow path with the original \feval\ call. The second technique is less general and uses value-based JIT compilation: when the first argument of an \feval\ call is an argument of the enclosing function, the compiler replaces each call to this function in all of its callers with a call to a special dispatcher. At run-time, the dispatcher evaluates the value of the argument to be used for the \feval\ and either executes previously compiled cached code or generates and JIT-compiles a version of the function optimized for the current value.