Thursday, February 17, 2011

binary by the numbers

New home

I've taken the liberty of forking the darcs repo of binary and putting it on GitHub. It contains both the latest released version and the new experimental continuation-based Get monad, the latter in a branch called cps.

git clone git://github.com/kolmodin/binary

Performance

It's interesting to run the benchmark of binary on different architectures and with different versions of GHC. Although there has recently been work within the community on fast writing (blaze-builder comes to mind), I've mostly been working on how to read things fast.

The classic binary implementation of the Get monad is a state monad, while the new experimental version is continuation based, so the two are fundamentally different. They also perform differently. To produce the numbers below I ran the benchmark suite of binary. It reads either Word8, Word16, Word32 or Word64 in a (hopefully) tight loop and then reports how fast it could do it. For example, see this graph of read performance in a 32-bit environment:

The good news is that GHC 7.0.1 always performs better than GHC 6.12.3. Also, the experimental cps branch (the wide green line) is faster than the classic master branch.

Things seem to be going well in 32-bit land. Let's have a look at a 64-bit environment:

This gives a different picture. GHC 7.0.1 still performs better than GHC 6.12.3, but we can also see that the cps branch can't keep up with the state monad based master branch (in contrast to when compiling for 32-bit). Future work will include figuring out why, and how to fix it.
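To make the difference between the two designs a bit more concrete, here is a minimal sketch of a state monad style parser next to a continuation based one. This is not the code from either branch of binary: the names GetS, GetC, word8S, word8C, sumBytes, evalS and evalC are made up for illustration, the input is a strict ByteString, and running out of input simply calls error.

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.ByteString as B
import Data.Word (Word8)

-- State monad style (hypothetical GetS): a parser maps the remaining
-- input to a result paired with whatever input is left.
newtype GetS a = GetS { runGetS :: B.ByteString -> (a, B.ByteString) }

instance Functor GetS where
  fmap f (GetS g) = GetS (\s -> let (a, s') = g s in (f a, s'))

instance Applicative GetS where
  pure a = GetS (\s -> (a, s))
  GetS f <*> GetS g = GetS (\s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s''))

instance Monad GetS where
  GetS g >>= k = GetS (\s -> let (a, s') = g s in runGetS (k a) s')

word8S :: GetS Word8
word8S = GetS (\s -> case B.uncons s of
  Just (w, rest) -> (w, rest)
  Nothing        -> error "word8S: not enough input")

-- Continuation based style (hypothetical GetC): instead of returning a
-- pair, a parser hands the remaining input and its result to a
-- success continuation.
newtype GetC a = GetC
  { runGetC :: forall r. B.ByteString -> (B.ByteString -> a -> r) -> r }

instance Functor GetC where
  fmap f (GetC g) = GetC (\s k -> g s (\s' a -> k s' (f a)))

instance Applicative GetC where
  pure a = GetC (\s k -> k s a)
  GetC f <*> GetC g =
    GetC (\s k -> f s (\s' h -> g s' (\s'' a -> k s'' (h a))))

instance Monad GetC where
  GetC g >>= f = GetC (\s k -> g s (\s' a -> runGetC (f a) s' k))

word8C :: GetC Word8
word8C = GetC (\s k -> case B.uncons s of
  Just (w, rest) -> k rest w
  Nothing        -> error "word8C: not enough input")

-- The same loop works for either representation: read n bytes and sum them.
sumBytes :: Monad m => m Word8 -> Int -> m Int
sumBytes getByte n = go n 0
  where
    go 0 acc = return acc
    go i acc = do
      w <- getByte
      let acc' = acc + fromIntegral w
      acc' `seq` go (i - 1) acc'

-- Run a parser and keep only the result.
evalS :: GetS a -> B.ByteString -> a
evalS p = fst . runGetS p

evalC :: GetC a -> B.ByteString -> a
evalC p s = runGetC p s (\_ a -> a)
```

In GHCi, evalS (sumBytes word8S 4) (B.pack [1,2,3,4]) and evalC (sumBytes word8C 4) (B.pack [1,2,3,4]) both give 10. The two versions compute the same thing; what differs is the shape of the code GHC has to optimise, tuples being built and taken apart in one case and closures being passed along in the other.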

Let's have a look at how binary performs at writing too:

Benchmark Environment

The tests have been performed on a Sandy Bridge CPU using GHC's native backend. I wanted to try the LLVM backend too, but unfortunately LLVM crashes when compiling the benchmark executable.
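If you want to try something similar on your own machine, the read loops described under Performance could look roughly like the sketch below. It is not the benchmark suite from the repo: sumWith and input are names invented here, the big-endian getters are an arbitrary choice, the words are summed only so that every read is actually used, and the timing code is left out.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Binary.Get
  (Get, runGet, getWord8, getWord16be, getWord32be, getWord64be)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word64)

-- Read n words with the given primitive and sum them (hypothetical helper).
-- The strict accumulator keeps the loop from building up thunks.
sumWith :: Integral w => Get w -> Int -> Get Word64
sumWith getW n = go n 0
  where
    go :: Int -> Word64 -> Get Word64
    go 0 !acc = return acc
    go i !acc = do
      w <- getW
      go (i - 1) (acc + fromIntegral w)

-- Test input (hypothetical helper): the given number of bytes, each 1.
input :: Int -> L.ByteString
input bytes = L.replicate (fromIntegral bytes) 1
```

For example, runGet (sumWith getWord32be 2) (input 8) reads the eight bytes as two Word32 values, while runGet (sumWith getWord8 8) (input 8) reads them one byte at a time. Timing each variant over a large input gives a throughput figure per word size.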

9 comments:

Johan Tibell said...

Since you're working on CPS, you might be interested in a GHC 7 bug Bryan just reported: http://hackage.haskell.org/trac/ghc/ticket/4965

Lennart Kolmodin said...

The bug report titled "60% performance regression in continuation-heavy code between 6.12 and 7"?
Yes, that would interest me :) Thanks!

Sergey said...

<offtopic>
The line plots show that writing Word45 is as fast as writing Word23 on Linux32. (Bar charts would be more suitable here.)
</offtopic>

Johan Tibell said...

GHC HQ might also be interested to hear about the large 32 vs 64-bit difference.

Unknown said...

Another relevant ticket: http://hackage.haskell.org/trac/ghc/ticket/4978

David T said...

Do you know if the crash with the LLVM backend still happens in GHC 7.0.3? If so, could you please submit a bug report to GHC?

Lennart Kolmodin said...

Good idea. I should do the tests again and report if the troubles still occur.

IIRC the result has also differed between machines, maybe because of different LLVM versions. As I remember it, I sometimes had trouble with the exact same code using the same version of GHC, but on different distros. If I can repeat the inconsistency I'll mention that too.

David T said...

That's good (although troubling to hear). I got a bug report a while back which I wasn't able to reproduce, where the only difference between my setup and the reporter's was Ubuntu vs Arch Linux. I'm hoping to have time soon to get some virtual machines up and investigate these failures further.

Lennart Kolmodin said...

Ah, yes. That bug report looks very similar to what I got, "Cannot yet select..." something something. I'll reproduce it and give you the details when I've got access to that machine again. I had trouble with Fedora 14 while Gentoo worked as expected.