\section{Introduction}
\label{se:intro}

The LLVM compiler infrastructure provides a Just-In-Time compiler called MCJIT that is currently being used for generating optimized code at run-time in virtual machines for dynamic languages. MCJIT is employed in both industrial and research projects, including WebKit's JavaScript engine, the open-source Python implementation Pyston, the Rubinius project for Ruby, Julia for high-performance technical computing, McVM for MATLAB, CXXR for the R language, Terra for Lua, and the Pure functional programming language. The MCJIT compiler shares the same optimization pipeline as static compilers such as clang, and it provides dynamic features such as native code loading and linking, as well as a customizable memory manager.

A piece that is currently missing in this environment is a feature for on-the-fly transitions between different versions of a running program's function. This feature is commonly known as On-Stack Replacement (OSR) and is typically used in high-performance virtual machines, such as HotSpot and the Jikes RVM for Java, to interrupt a long-running function and recompile it at a higher optimization level.

OSR can be a powerful tool for dynamic languages, for which the most effective optimization decisions can typically be made only at run-time, when critical information such as the type and shape of objects becomes available. In this scenario, OSR is also useful for deoptimization: when the running code has been speculatively optimized and an assumption used for the optimization no longer holds, the optimized function is interrupted and execution continues in a safe version of the code.
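To make the deoptimization scenario more concrete, the C++ sketch below (a minimal source-level analogy, not part of our library; the names {\tt LoopState}, {\tt sum\_optimized}, and {\tt sum\_safe} are hypothetical) mimics the control transfer that OSR enables on compiled code: a speculatively optimized function guards its assumption and, when the guard fails, packages its live state and resumes in a safe version of the same computation.

\begin{verbatim}
#include <cstddef>
#include <cstdio>
#include <vector>

struct LoopState {       // live variables captured at the OSR point
  std::size_t i;         // current loop index
  long long   acc;       // partial result
};

// Safe (unspecialized) version: makes no assumptions on the input and
// can resume the computation from any intermediate state.
long long sum_safe(const std::vector<long long>& v, LoopState s) {
  for (; s.i < v.size(); ++s.i) s.acc += v[s.i];
  return s.acc;
}

// Speculatively optimized version: assumes all elements are non-negative.
// When the assumption breaks, it "deoptimizes" by handing its live state
// to sum_safe, which continues from the same loop iteration.
long long sum_optimized(const std::vector<long long>& v) {
  LoopState s{0, 0};
  for (; s.i < v.size(); ++s.i) {
    if (v[s.i] < 0)            // guard on the speculative assumption
      return sum_safe(v, s);   // OSR-like transition to the safe version
    s.acc += v[s.i];
  }
  return s.acc;
}

int main() {
  std::vector<long long> v{1, 2, -3, 4};
  std::printf("%lld\n", sum_optimized(v));  // prints 4
}
\end{verbatim}

In an actual OSR mechanism this transition happens transparently on native code and stack frames rather than through an explicit source-level call, but the ingredients are the same: a guarded assumption, a capture of the live program state, and a continuation point in another version of the function.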

In this paper we present an approach for supporting OSR in LLVM, designed with the following goals:
\begin{itemize}
\item Supporting OSR transitions through instrumentation of pure IR code only, avoiding manipulations at the machine-code level.
\item Incurring a minimal level of intrusiveness, both in the instrumentation of the code generated by the front-end and in the optimization opportunities affected by the presence of OSR points.
\item Relying on LLVM's compilation pipeline to generate the most efficient native code for an instrumented function.
\item Providing support for redirecting future invocations of a function to its latest compiled version without recompiling the callers or performing linking again.
\end{itemize}

Our implementation is shipped as a library for IR manipulation, and we present a preliminary experimental study of our technique in TinyVM, a proof-of-concept virtual machine for run-time IR manipulation and compilation based on MCJIT. We then present a case study on the integration of our technique in McVM~\cite{chevalier2010mcvm}: we show the potential of our approach by enabling an aggressive specialization mechanism for the {\tt feval} construct, a source of bottlenecks in many MATLAB programs, that could not have been implemented using extant OSR techniques.

The rest of this paper is organized as follows. In Section