Authorea

Camil Demetrescu edited osr-llvm.tex over 8 years ago

Commit id: 6162c4a964d7261abd3cb8b2e0c24c9aadd09344


\end{figure} \fi
\paragraph{OSR Instrumentation.} \section{OSR Instrumentation in IR.} \label{se:ir-osr-instr}
We use deferred compilation by instrumenting {\tt isord} with an open OSR point at the beginning of the loop body, as shown in \myfigure\ref{fig:isordfrom}. Portions added to the original code by the OSR instrumentation are highlighted in grey\footnote{Virtual register names and labels in the LLVM-produced IR code have been renamed to make the code more readable.}.
%The figure illustrates how the original {\tt isord} code is instrumented by \tinyvm, highlighting in grey the added portions.
A new basic block is placed at the beginning of the loop body; it increments a hotness counter {\tt p.osr} and jumps to an OSR-firing block when the counter reaches the threshold (1000 iterations in this example). The OSR-firing block contains a tail call to the target generation stub, which receives as parameters the four live variables at the OSR point ({\tt v}, {\tt n}, {\tt i}, and {\tt c}). Notice that maintaining SSA form requires adjusting the $\phi$-nodes.
The stub (see \myfigure[...]) calls a code generator that: 1) builds an optimized version of {\tt isord} by inlining the comparator (which is known when the OSR is fired), and 2) uses it to create the continuation function {\tt isordto} shown in \myfigure\ref{fig:isordascto}. The stub terminates with a tail call to {\tt isordto}. To generate the continuation function from the optimized version created by the inliner, we need to replace the function entry point, remove dead code, replace live variables with the function parameters, and fix the $\phi$-nodes accordingly. Additions resulting from the IR instrumentation are in grey, while removals are struck-through.
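The control flow described above can be mimicked at the source level. The following C fragment is only a sketch under stated assumptions, not the IR produced by the instrumentation: the body of {\tt isord} is an assumed reconstruction, the names {\tt isord\_stub}, {\tt isordto\_generic}, {\tt cmp\_long\_asc}, and {\tt osr\_fired} are hypothetical, and the run-time code generator is replaced by a hand-written continuation in which an ascending comparator has been inlined.

```c
#include <assert.h>

#define OSR_THRESHOLD 1000  /* threshold used in the running example */

typedef int (*cmp_t)(const void *, const void *);

int osr_fired = 0;  /* hypothetical flag, only used to observe the transition */

/* Hypothetical comparator: ascending order on long values. */
int cmp_long_asc(const void *a, const void *b) {
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Assumed shape of the original function: 1 if v[0..n-1] is ordered w.r.t. c. */
long isord(long *v, long n, cmp_t c) {
    for (long i = 1; i < n; i++)
        if (c(&v[i - 1], &v[i]) > 0) return 0;
    return 1;
}

/* Continuation with the comparator inlined. It has no loop pre-header:
   execution resumes at index i, which arrives as a parameter. */
long isordto(long *v, long n, long i) {
    assert(i >= 1);
    for (; i < n; i++)
        if (v[i - 1] > v[i]) return 0;
    return 1;
}

/* Fallback continuation for comparators this sketch cannot specialize. */
long isordto_generic(long *v, long n, long i, cmp_t c) {
    for (; i < n; i++)
        if (c(&v[i - 1], &v[i]) > 0) return 0;
    return 1;
}

/* Stand-in for the target generation stub: it receives the four live
   variables and ends with a call in tail position to the continuation. */
long isord_stub(long *v, long n, long i, cmp_t c) {
    osr_fired = 1;
    if (c == cmp_long_asc)
        return isordto(v, n, i);
    return isordto_generic(v, n, i, c);
}

/* Instrumented version: the OSR check sits at the top of the loop body. */
long isordfrom(long *v, long n, cmp_t c) {
    long p_osr = 0;  /* hotness counter */
    for (long i = 1; i < n; i++) {
        if (++p_osr >= OSR_THRESHOLD)
            return isord_stub(v, n, i, c);  /* OSR-firing block */
        if (c(&v[i - 1], &v[i]) > 0) return 0;
    }
    return 1;
}
```

Because the OSR check precedes the comparison for index {\tt i}, the continuation must start by comparing {\tt v[i-1]} with {\tt v[i]}, which is why {\tt isordto} enters its loop without advancing {\tt i} first.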

\fi \paragraph{x86-64 Lowering.} \label{se:ir-x86-lowering}
%The final step to be performed before execution is native code generation.
\myfigure\ref{fig:isordx86-64} shows the x86-64 code generated by LLVM for {\tt isordfrom} and {\tt isordto}. To ease comparison with the native code that would be generated for the original non-OSR versions, additions resulting from the IR instrumentation are in grey, while removals are struck-through. Notice that the intrusiveness of OSR in {\tt isordfrom} is minimal, consisting of just two assembly instructions with register and immediate operands. As a result of induction variable canonicalization in the LLVM back-end, the loop index {\tt i} and the hotness counter {\tt p.osr} are fused in register {\tt \%r12}. We also note that tail call optimization is applied in the OSR-firing block, resulting in no stack growth during an OSR. The continuation function {\tt isordto} is identical to the optimized version of {\tt isord}, except that the loop index is passed as a parameter in {\tt \%rdx} and no loop pre-header is needed, since OSR jumps directly into the loop body.
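The fusion of {\tt i} and {\tt p.osr} can be understood at the source level: both variables advance by one per iteration, so they differ by a constant and a single variable suffices. The C sketch below illustrates this equivalence only; it assumes that the loop index starts at 1 and the counter at 0, and the function names {\tt fire\_index\_two\_vars} and {\tt fire\_index\_fused} are hypothetical. It is not the code emitted by the LLVM back-end.

```c
#define OSR_THRESHOLD 1000

/* Two induction variables, as in the instrumented IR.
   Returns the loop index at which the OSR would fire, or -1 if it never does. */
long fire_index_two_vars(long n) {
    long p_osr = 0;
    for (long i = 1; i < n; i++) {
        if (++p_osr >= OSR_THRESHOLD) return i;
    }
    return -1;
}

/* One induction variable: after the increment p_osr always equals i,
   so the hotness test can be rewritten in terms of i alone. */
long fire_index_fused(long n) {
    for (long i = 1; i < n; i++) {
        if (i >= OSR_THRESHOLD) return i;
    }
    return -1;
}
```

With the counter folded into the loop index, the OSR check reduces to a comparison against a constant plus a conditional branch, which is consistent with the two-instruction overhead noted above.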