Camil Demetrescu  over 8 years ago

Commit id: 1f8c623b7f3016d946e08335897a79f36c62af7b

\end{small}
\end{table}

Unfortunately, we are unable to compute direct performance metrics for the solution by Lameed and Hendren, since its source code has not been released. Figures in their paper~\cite{lameed2013feval} show that for these benchmarks the speed-up of the OSR-based approach is on average $30.1\%$ of the speed-up from hand-coded optimization (ranging from $9.2\%$ to $73.9\%$); for the JIT-based approach, the average percentage grows to $84.7\%$ (ranging from $75.7\%$ to $96.5\%$).

Our optimization technique yields speed-ups that are very close to the upper bound given by hand-coded optimization: in the worst case (the {\tt odeRK4} benchmark), we observe a $94.1\%$ percentage when the optimized code is generated on the fly, which becomes $97.5\%$ when a cached version is available. Compared to their OSR-based approach, our compensation entry block is a key driver of the improved performance, as the benefits of type-specializing the whole function body outweigh those of performing a direct call with boxed arguments and return values in place of the original \feval.
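As a purely illustrative reading of these percentages (the baseline below is hypothetical, not one of the measured results): if by-hand optimization made a benchmark run $5\times$ faster than the unoptimized code, our worst-case figures would correspond to speed-ups of roughly $0.941 \times 5 \approx 4.7\times$ when the code is generated on the fly and $0.975 \times 5 \approx 4.9\times$ when a cached version is available.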