Less Is More: Merging AST Nodes To Optimize Interpreters
The Truffle framework allows language implementations to reach state-of-the-art run time performance while only providing an abstract syntax tree (AST) interpreter; the AST is compiled to machine code using GraalVM’s JIT compiler. However, it can take some time before this fully optimized code is generated and executed: startup performance is consequently tightly bound to interpreter performance, therefore reducing the execution time requires us to improve interpreter performance instead of solely focusing on the JIT compiler.
Through the use of a novel technique called supernodes, we take steps towards improving the run-time performance of Truffle-based interpreters, aiming to reduce programs’ overall execution time by improving interpreter performance. We take inspiration from a well-known bytecode interpreter optimization technique: superinstructions, which reduce the instruction dispatch overhead by concatenating existing instructions. In the case of supernodes, AST nodes are merged, creating a single entity which has the same behavior as the original tree.
The main performance gain of supernodes comes from removing redundant node safety guards: for instance, regular nodes have to check the type of their input data through a type guard, even though it may have already been determined by another node which had no way of sharing this information. Supernodes avoid this problem through unifying the node contexts. Moreover, similarly to superinstructions, performance is also gained through reducing the node dispatch overhead as a result of using fewer nodes.
So far, we have been relying on the research language TruffleSOM, a minimal Smalltalk dialect implemented using Truffle, and we are focusing on the Are We Fast Yet benchmark suite, which contains both micro- and macro-benchmarks. Initial results are promising, showing up to 42% run time reduction with the addition of 20 supernodes. Moreover, there are currently no performance regressions for any of our benchmarks compared to a baseline with no supernode candidates available.
However, node trees that yield valuable supernodes are currently detected manually, and the process of generating supernodes from them is currently not automated. In the future, we aim to develop heuristics to identify potential supernode candidates: both at parsing time through static analysis, as well as later during run time through detection of frequently called nodes that could yield performance gains should they be merged together. The supernodes would then be generated based on the code of the selected AST nodes. In addition, we aim to port our technique to more complex Truffle implementations used in the industry, such as GraalPython or TruffleRuby.
Tue 22 MarDisplayed time zone: Lisbon change
10:30 - 12:00
|Interpreter Register Autolocalisation: Improving the performance of efficient interpreters|
Guillermo Polito CNRS; CRIStAL; University of Lille; Centrale Lille; Inria, Pablo Tesone Inria Lille–Nord Europe, France Mines Douai, IA, Univ. Lille, France, Stéphane Ducasse Inria; University of Lille; CNRS; Centrale Lille; CRIStAL, Nahuel Palumbo Université Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL, Soufyane Labsari Université Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStALFile Attached
|Less Is More: Merging AST Nodes To Optimize Interpreters|
MoreVMsMedia Attached File Attached
|Towards Just-in-time and Language-agnostic Mutation Testing|
Stefan Reschke Hasso Plattner Institute, University of Potsdam, Toni Mattis Hasso Plattner Institute, University of Potsdam, Fabio Niephaus Oracle Labs, Potsdam, Robert Hirschfeld HPI, University of PotsdamFile Attached