\begin{small}
\begin{verbatim}
.lr.ph: ; preds = %2, %0
%i.01 = phi i64 [ %10, %2 ], [ 1, %0 ]
%4 = getelementptr inbounds i64* %v, i64 %i.01
%.sum = add nsw i64 %i.01, -1
...
\end{verbatim}
\end{small}
\noindent \tinyvm\ will {\tt UPDATE} the function in the following way: an {\tt ALWAYS}-true OSR condition is verified before executing instruction {\tt \%4}, firing an {\tt OPEN} OSR transition
into the {\tt DYN\_INLINE} code generator, which will inline any indirect function call through the function pointer {\tt \%c}. We choose {\tt \%4} as the OSR location because
it is the first non-$\phi$ instruction in the loop body, and through profiling metadata we hint to the LLVM
back-end that the OSR firing is {\tt 100}\%-likely.
The IR will now look like:
\begin{small}
\begin{verbatim}
.lr.ph: ; preds = %2, %0
%i.01 = phi i64 [ %10, %2 ], [ 1, %0 ]
%alwaysOSR = fcmp true double 0.000000e+00,
0.000000e+00
br i1 %alwaysOSR, label %OSR_fire,
label %OSR_split, !prof !1
OSR_split: ; preds = %.lr.ph
%4 = getelementptr inbounds i64* %v, i64 %i.01
%.sum = add nsw i64 %i.01, -1
[...]
OSR_fire: ; preds = %.lr.ph
%OSRCast = bitcast i32 (i8*, i8*)* %c to i8*
%OSRRet = call i32 @isord_stub(i8* %OSRCast,
i64* %v, i64 %n,
i32 (i8*, i8*)* %c,
i64 %i.01)
ret i32 %OSRRet
\end{verbatim}
\end{small}
\noindent\osrkit\ has split the {\tt \%.lr.ph} basic block at the OSR point, also adding an {\tt OSR\_fire} block to transfer the execution state to {\tt isord\_stub} and eventually return the {\tt OSRRet} value.
We can now let {\tt isord} run on a dynamically initialized array through the {\tt driver} method, which takes the array length as its argument. The method will populate the array with elements that are ordered with respect to the comparator in use (see {\small\tt inline.c}). For instance, we ask {\tt driver} to set up an array of $100000$ elements and run {\tt isord} on it:
\begin{small}
\begin{verbatim}
TinyVM> driver(100000)
Time spent in creating continuation function:
0.000252396 seconds
Address of invoked function: 140652750196768
Function being inlined: cmp
Elapsed CPU time: 0 m 0 s 3 ms 417 us 157 ns
(that is: 0.003417157 seconds)
Evaluated to: 1
\end{verbatim}
\end{small}
\noindent The method returns $1$, which means that the vector is ordered. Compared to \myfigure\ref{fig:isordascto}, the IR code generated for the OSR continuation function {\tt isordto} ({\tt DUMP isordto}) is slightly different, as the MCJIT compiler detects that additional optimizations (e.g., loop strength reduction) are possible and performs them. We expect the code generated for {\tt isord\_stub} to be identical, up to renaming, to the IR reported in \myfigure\ref{fig:isordstub}.
To show native code generated by the MCJIT back-end, we can run \tinyvm\ under {\tt gdb} and leverage the debugging interface of MCJIT. For instance, once {\tt driver} has been invoked, we can switch to the debugger with {\tt CTRL-Z} and display the x86-64 code for any compiled method with:
\begin{small}
\begin{verbatim}
(gdb) disas isordto
Dump of assembler code for function isordto:
[Base address: 0x00007ffff7ff2000]
<+0>: mov -0x8(%rdi,%rcx,8),%edx
<+4>: sub (%rdi,%rcx,8),%edx
<+7>: xor %eax,%eax
<+9>: test %edx,%edx
<+11>: jg 0x7ffff7ff201a
<+13>: inc %rcx
<+16>: mov $0x1,%eax
<+21>: cmp %rsi,%rcx
<+24>: jl 0x7ffff7ff2000
<+26>: retq
End of assembler dump.
\end{verbatim}
\end{small}
Assuming that the steps described above are executed
Native code [...] {\tt gdb} [...]
%In a usage scenario in which input arrays are large, we might want to perform dynamic inlining as early as possible. We can thus insert