dcdelia  over 8 years ago

Commit id: 6d4338d3aca4637cb73fae1ab43ccd7007de5979


\item {\tt many/X}: multiple runs of original code (for code caching).
\end{itemize}

\noindent We manually collected figures from the console output and computed speedups for the different settings. We show how to run the code using {\tt odeRK4} as an example. The platform used to obtain the reported numbers is the same as in session 2. To determine a baseline for speedup computation, we let {\tt mcvm} perform a single run of the original code with no {\tt feval} optimization. Note that we can selectively enable or disable {\tt feval} optimization using the {\tt -jit\_feval\_opt} flag:

\begin{small}
\begin{verbatim}
$ cd ~/Desktop/mcvm
$ ./mcvm -jit_feval_opt false < benchmarks/scripts/base/odeRK4
***********************************************
 McVM - The McLab Virtual Machine v1.0
 Visit http://www.sable.mcgill.ca for more info.
***********************************************
>: >: Compiling function: "testSH"
Compiling function: "odeRK4"
Compiling function: "testSHfun"
Compiling function: "rhsSteelHeat"
Compiling function: "testSHfun"
Compiling function: "rhsSteelHeat"
[TOC] Elapsed time: 32.866556 seconds
t y_RK4
0.0000 1.000000
20.0000 227.364633
\end{verbatim}
\end{small}

\noindent To measure the performance of McVM's code caching mechanism, we let the benchmark run multiple times in the same instance of the VM:

\begin{small}
\begin{verbatim}
$ ./mcvm -jit_feval_opt false < benchmarks/scripts/many/odeRK4
\end{verbatim}
\end{small}

\noindent The experiment duration on our platform was $\approx 2$ minutes, with a time per trial of $\approx 32.536$s (discarding the warm-up run). The resulting speedup for the base code caching mechanism was thus $32.866/32.536=1.010$, slightly different from the one reported in \mytable\ref{tab:feval} for the Intel Xeon platform, where we repeated each experiment $10$ times.

We can now set an upper bound for speedups by measuring the running time when the code has been optimized by hand, inserting direct calls in place of {\tt feval} instructions:

\begin{small}
\begin{verbatim}
$ ./mcvm < benchmarks/scripts/direct/odeRK4
\end{verbatim}
\end{small}
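
\noindent As a sketch of how the figures above combine, the code caching speedup is the ratio between the single-run baseline time and the per-trial time of the repeated runs. The symbols $T_{\mathrm{base}}$ and $T_{\mathrm{many}}$ are notation introduced here for illustration only and do not appear elsewhere in the text:
\[
\mathrm{speedup}_{\mathrm{cache}} = \frac{T_{\mathrm{base}}}{T_{\mathrm{many}}} = \frac{32.866}{32.536} \approx 1.010 .
\]
Assuming the upper bound is computed in the same way, the time per trial measured for the {\tt direct} version would take the place of $T_{\mathrm{many}}$ in the denominator.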