TL;DR DynASM is a toned down JIT compiler which directly generates target assemb...

haberman · on Jan 2, 2013

> Retargeting and optimizations will be hard.

What evidence do you have for this? I have evidence to the contrary: LuaJIT is a one-man effort and yet is one of the fastest and most portable JITs around (x86, x86-64, ARM, PowerPC, MIPS).

> I'm looking forward to the next installment, where (hopefully!) one generates target-independent code via LLVM, then uses one of the LLVM backends to generate the final assembly code.

LLVM is very cool, but it is an absolute mistake to think of it as obsoleting all other JITs. LLVM uses an IR that is well-suited to some things but not others. If LLVM fits your problem, great. But there are many problems it fits less well -- just look at Unladen Swallow, and notice that none of the mainstream JavaScript JITs use LLVM (not V8, not IonMonkey, not Nitro).

LLVM's design tightly couples an IR with a machine code generator. If you use DynASM, you can write your own machine code generator that accepts whatever IR is best suited to your problem.

pacala · on Jan 2, 2013

I'm assuming that emitting "movzx edi, byte [PTR]" is using x86 as the target, so retargeting for ARM will likely require a complete rewrite of the brainf#ck JIT. In that sense retargeting is hard. But I may be wrong! I am looking forward to a further article that shows how the brainf#ck JIT can be retargeted to ARM without a full rewrite of the JIT code.

From JIT code that generates assembly tied to a specific architecture and register allocation, plus a code generation process encoded as a preprocessor step instead of a library, I can only deduce that optimizations aren't the focus of this work. But I may be wrong! Perhaps the preprocessor is syntactic sugar over a library that builds the code representation as a data structure, and there are ways to programmatically manipulate this data structure to implement optimizations. Looking forward to a further article with more details!

I'm not suggesting you necessarily use LLVM, but LLVM is the closest thing to an assembly generator library I am aware of. To the best of my knowledge, you'd have a harder time extracting the code generator of, for example, V8 as a standalone library.

haberman · on Jan 2, 2013

It is true that this approach requires a separate code generator for every architecture. That is not the same as saying that "retargeting will be hard" (which makes it sound like DynASM somehow gets in your way).
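To make the distinction concrete, here is a rough sketch in Python (not DynASM itself). It assumes a tiny IR of (op, amount) pairs, and the function names, register choices, and instruction strings are all made up for illustration. The point is only that the front end and the IR are shared, while each architecture gets its own small emitter.

```python
# Hypothetical per-target back ends. Each one maps the same IR op to
# assembly text; nothing here is DynASM's actual API or output.

def emit_x86_64(op, n):
    # Assumption: rbx holds the data pointer.
    if op == "add":
        return [f"add byte [rbx], {n}"] if n >= 0 else [f"sub byte [rbx], {-n}"]
    if op == "move":
        return [f"add rbx, {n}"] if n >= 0 else [f"sub rbx, {-n}"]
    raise ValueError(f"unknown op: {op}")


def emit_arm(op, n):
    # Assumption: r4 holds the data pointer and n is small enough to be
    # encoded as an immediate. ARM is a load/store architecture, so
    # changing a byte in memory takes a load, an arithmetic op, and a store.
    if op == "add":
        arith = f"add r0, r0, #{n}" if n >= 0 else f"sub r0, r0, #{-n}"
        return ["ldrb r0, [r4]", arith, "strb r0, [r4]"]
    if op == "move":
        return [f"add r4, r4, #{n}"] if n >= 0 else [f"sub r4, r4, #{-n}"]
    raise ValueError(f"unknown op: {op}")


BACKENDS = {"x86-64": emit_x86_64, "arm": emit_arm}


def compile_ir(ir, target):
    """Lower a list of (op, amount) pairs with the chosen target's emitter."""
    emit = BACKENDS[target]
    lines = []
    for op, n in ir:
        lines.extend(emit(op, n))
    return lines


if __name__ == "__main__":
    program = [("add", 3), ("move", -1)]
    for target in BACKENDS:
        print(target, compile_ir(program, target))
```

Adding a target in this sketch means writing one more emitter function and registering it; the code that produces the IR does not change.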

Yes, as I said before, if you have a problem that maps cleanly onto LLVM and you don't mind the weight that LLVM brings along, by all means use it! But you shouldn't think of LLVM as an "assembly generator library." That implies that it is far more general-purpose than it actually is. DynASM is actually an assembly generator library. LLVM is an IR, a set of optimization passes for that IR, and a set of target-specific code generators for that IR. The key point is "for that IR."

DynASM is a tool that you can count on when no existing IR (LLVM, .NET, etc.) fits your needs. It's a lower-level tool -- LLVM could conceivably use DynASM to perform its own target-specific instruction encoding. DynASM is a small, focused tool that does one thing and does it well. LLVM is more of a toolbox that tries to get the 99% case right for its target audience. As a result, it represents a lot more compromises and changes in more fundamental ways over time (for example, it recently completely rewrote its register allocator).

> Perhaps the preprocessor is syntactic sugar over a library that builds the code representation as a data structure and there are ways to programmatically manipulate this data structure to implement optimizations.

No, definitely not. The idea is that you perform optimizations before the code generation step. I didn't do this in the article because these were just simple "Hello, World" examples, but maybe I should write a follow-up article that illustrates how optimization fits into this framework.
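As a rough idea of what "optimize before code generation" could look like for the Brainf*ck example, here is a minimal sketch in Python. It is not code from the article: the `fold_runs` name and the (op, amount) IR are assumptions for illustration. It collapses runs of `+`/`-` and `>`/`<` into single operations, so the code generator would emit one instruction per run instead of one per character.

```python
def fold_runs(source):
    """Collapse runs of +/- and >/< into single (op, amount) pairs.

    Brackets and I/O commands pass through unchanged with amount 0.
    """
    kinds = {
        "+": ("add", 1),
        "-": ("add", -1),
        ">": ("move", 1),
        "<": ("move", -1),
    }
    ir = []
    for ch in source:
        if ch in kinds:
            op, delta = kinds[ch]
            if ir and ir[-1][0] == op:
                # Merge with the previous op of the same kind.
                total = ir.pop()[1] + delta
                if total != 0:
                    ir.append((op, total))
            else:
                ir.append((op, delta))
        elif ch in "[].,":
            ir.append((ch, 0))
        # Any other character is a comment in Brainf*ck and is ignored.
    return ir


if __name__ == "__main__":
    print(fold_runs("+++>>-[-]"))
```

The code generation step then walks this list rather than the raw source text, which is where a tool like DynASM would come in.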