
How to reliably influence generated code at near machine level using GHC?

While this may sound like a theoretical question, suppose I decide to invest in building a mission-critical application written in Haskell. A year later I find that I absolutely need to improve the performance of one very narrow bottleneck, and that this requires optimizing memory access close to raw machine capabilities.

Some assumptions:

  • It isn't a real-time system: occasional latency spikes are tolerable (from interrupts, thread scheduling irregularities, occasional GC etc.)
  • It isn't a numeric problem: data layout and cache-friendly access patterns matter most (avoiding pointer chasing, reducing conditional jumps etc.)
  • Code may be tied to a specific GHC release (but no forking)
  • The performance goal requires in-place modification of pre-allocated off-heap arrays, taking alignment into account (C strings, bit-packed fields etc.)
  • Data is statically bounded in arrays, and allocations are rarely if ever needed

What mechanisms does GHC offer to perform this kind of optimization? By "reliably" I mean that if a source change causes the code to stop performing well, this can be corrected in the source code without rewriting it in assembly.

  • Is it already possible using GHC-specific extensions and libraries?
  • Would custom FFI help avoid C calling convention overhead?
  • Could a special purpose compiler plugin do it through a restricted source DSL?
  • Could a source code generator from a "high-level" assembly (LLVM?) be a solution?

It sounds like you're looking for unboxed arrays. "Unboxed" in Haskell-land means "has no runtime heap representation". You can usually find out whether some part of your code compiles to an unboxed loop (a loop that performs no allocation) by looking at its Core representation (Core is a very Haskell-like language and is the first intermediate stage in compilation). So, e.g., you might see Int# in the Core output, which means an integer that has no heap representation (it will typically live in a register).

When optimizing Haskell code we regularly look at Core and expect to be able to fix performance regressions by changing the source code (e.g. adding a strictness annotation, or restructuring a function so that it can be inlined). This isn't always fun, but it is fairly stable, especially if you pin your compiler version.
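As a minimal sketch of the kind of source-level fix meant here (the function name sumTo is made up for illustration), a strict accumulator loop might look like this:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Sum of 1..n. The bang patterns force both loop variables on every
-- iteration, so no chain of thunks builds up in the accumulator.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc !i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)
-- Ask GHC to inline the wrapper at call sites (the loop `go` itself
-- is recursive and stays a local function).
{-# INLINE sumTo #-}
```

Compiling with something like `ghc -O2 -ddump-simpl -dsuppress-all` prints the Core. With optimization on, you would typically expect the loop to become a worker function over Int# arguments, but the exact output depends on the GHC version, so treat that as something to check rather than a guarantee.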

Back to unboxed arrays: GHC exposes a lot of low-level primops in GHC.Prim. In particular, it sounds like you want mutable unboxed arrays (MutableByteArray). The primitive package exposes these primops behind a slightly safer, friendlier API and is what you should use (and depend on if writing your own library).

There are many other libraries that implement unboxed arrays on top of MutableByteArray, such as vector. The point is that operations on that structure generate no garbage and likely compile down to fairly predictable machine instructions.
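To give a feel for what sits underneath those libraries, here is a rough sketch that uses the raw primops directly through GHC.Exts, so it needs only base. The function name sumSquares is hypothetical, and the element size of 8 bytes assumes a 64-bit platform. In real code you would normally reach for the wrappers in primitive (newByteArray, readByteArray, writeByteArray) rather than write this by hand.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- Allocate a mutable byte array holding n machine Ints, fill it in
-- place with the squares 0^2 .. (n-1)^2, then sum the contents.
-- ASSUMPTION: Int is 8 bytes wide (64-bit platform).
sumSquares :: Int -> IO Int
sumSquares (I# n) = IO (\s0 ->
  case newByteArray# (n *# 8#) s0 of
    (# s1, mba #) ->
      case fill mba 0# s1 of
        s2 -> case go mba 0# 0# s2 of
          (# s3, acc #) -> (# s3, I# acc #))
  where
    -- Write i*i at element index i; only the state token is threaded.
    fill mba i s
      | isTrue# (i >=# n) = s
      | otherwise =
          case writeIntArray# mba i (i *# i) s of
            s' -> fill mba (i +# 1#) s'
    -- Read every element back and accumulate in an unboxed Int#.
    go mba i acc s
      | isTrue# (i >=# n) = (# s, acc #)
      | otherwise =
          case readIntArray# mba i s of
            (# s', x #) -> go mba (i +# 1#) (acc +# x) s'
```

The only heap allocations written in the source are the byte array itself and the final boxed Int; the two loops work purely on unboxed values and the array.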

You might also like to check out this technique if you're doing numeric work and want to use a particular instruction or implement some loop directly in assembly.

GHC also has a very powerful FFI, and you can look into writing portions of your program in C and interoperating with them; Haskell supports pinned arrays, among other structures, for this purpose.
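As an illustrative sketch of that interop (the name pinnedStrlen and the 16-byte buffer size are made up for the example), the following writes a C string into a pinned buffer and passes it to libc's strlen. The import is marked unsafe, which makes the call cheaper than a safe one, on the condition that the C function returns quickly and never calls back into Haskell.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Data.Word (Word8)
import Foreign.C.String (CString)
import Foreign.C.Types (CSize (..))
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Ptr (castPtr)
import Foreign.Storable (pokeByteOff)

-- strlen from the C standard library, imported as an "unsafe" call.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Write the bytes "abc\0" into a pinned buffer (one the GC will not
-- move) and ask C for the string length.
pinnedStrlen :: IO Int
pinnedStrlen = do
  fp <- mallocForeignPtrBytes 16 :: IO (ForeignPtr Word8)
  withForeignPtr fp $ \p -> do
    mapM_ (\(off, b) -> pokeByteOff p off b) (zip [0 ..] bytes)
    n <- c_strlen (castPtr p)
    pure (fromIntegral n)
  where
    bytes = [97, 98, 99, 0] :: [Word8] -- 'a', 'b', 'c', NUL
```

mallocForeignPtrBytes gives pinned memory that is still managed by GHC's garbage collector; for memory that lives entirely outside the Haskell heap, mallocBytes and free from Foreign.Marshal.Alloc are the manual alternative.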

If you need more control than those give you, then Haskell is likely the wrong language. It's impossible to tell from your description whether that is the case for your problem. (Your requirements seem contradictory: you need to be able to write a carefully cache-tuned algorithm, but arbitrary GC pauses are okay?)

One last note: you can't rely on GHC's native code generator to perform the low-level strength-reduction optimizations that, e.g., GCC performs (GHC's NCG will probably never know about bit-twiddling hacks, autovectorization, and so on). You can try the LLVM backend instead (the -fllvm flag), but a speedup in your program is by no means guaranteed.

