Borys Bradel's Blog
Making Java programs work on LogTM
Tags: programming November 21, 2009
Part of my PhD required that I figure out how to get Java programs to execute on LogTM, which is a hardware transactional memory system that has a simulator on top of Simics. Let's just say that achieving that feat is not easy. The following is a somewhat scattered account of what needs to be avoided or done to get things to work. The JVM being used is GCJ.
The following will cause LogTM to pretty much go into an infinite loop:
- Java Native Interface (JNI) calls,
- synchronization,
- multiplies,
- divides,
- calls to virtual methods,
- writing repeatedly to the same memory location by more than one processor, and
- nested transactions.
Of course, there is also the issue of false aliasing at the cache-line level: for example, the fields of two adjacent objects, or the length and first few values of an array, may share a cache line.
I had to make sure that in my framework
- all data accessed in transactions had already been accessed at initialization time,
- all written memory locations were on separate cache lines from other data, and
- all written memory locations were on separate cache lines from all internal data (e.g. information about the size of an array).
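The three rules above can be sketched roughly as follows. This is only an illustration, not the framework's actual code: the class name PaddedSlots, the 64-byte cache line, and the 8-byte long are all assumptions, and Java itself makes no promise about how an array is aligned in memory. The index is computed with a shift rather than a multiply, since multiplies are on the problem list above.

```java
// Sketch only: the names and the 64-byte line size are assumptions, not LogTM or GCJ values.
final class PaddedSlots {
    static final int STRIDE_SHIFT = 3;            // 8 longs of 8 bytes = one assumed 64-byte line
    static final int STRIDE = 1 << STRIDE_SHIFT;

    private final long[] data;
    private final int count;

    PaddedSlots(int count) {
        this.count = count;
        // One stride of padding before the first slot keeps it away from the array's
        // internal data (such as its length); the extra stride after the last slot
        // keeps it away from whatever object happens to follow the array.
        data = new long[(count + 2) << STRIDE_SHIFT];
        // Touch every element at initialization time, before any transaction runs.
        for (int i = 0; i < data.length; i++) {
            data[i] = 0L;
        }
    }

    // Slots are a full stride apart, so two slots never fall on the same assumed line.
    private static int index(int slot) {
        return (slot + 1) << STRIDE_SHIFT;
    }

    long get(int slot) { return data[index(slot)]; }

    void set(int slot, long value) { data[index(slot)] = value; }

    int size() { return count; }
}
```

Each thread would then write only to its own slot, for example `slots.set(threadId, value)`, where `threadId` is a hypothetical per-thread index.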
Finally, in the benchmark code I tried placing every variable on its own cache line. Interestingly, that caused so many cache misses that performance was worse than with the conflicts.
Everything except for removing the multiplies and divides was essential to getting the benchmark to work correctly.
To figure all that out, I had to match the stream of assembly instructions in the simulator log against the disassembly produced by the mdb and gdb debuggers, and then match that against Java instructions and variables. The only information I had was approximately which line of the Java source an assembly instruction could correspond to (give or take, since the code was optimized at -O3). The items on the list above were discovered and fixed one at a time, which took quite a long time.
On top of what I've written above, I had to
- make sure that the entire working set fit within the cache (I increased the cache size), and
- replace all instanceof tests, abstract methods, and interfaces with tests against an instance enumeration variable, non-abstract methods, and parent classes respectively. The originals caused good pointers to turn null, which was truly bizarre behaviour.
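To make the second replacement concrete, here is a minimal sketch of what such a rewrite might look like. All of the names (Shape, Circle, Square, ShapeChecks, the KIND_ constants) are made up for illustration and do not come from the benchmark. An integer tag stands in for the instance enumeration variable, and the parent class carries a concrete default body where an abstract method or an interface would normally go. Since virtual calls are also on the problem list above, the sketch includes a static helper that branches on the tag instead of dispatching.

```java
// Hypothetical classes; only the pattern matters, not the names.
class Shape {
    static final int KIND_CIRCLE = 1;
    static final int KIND_SQUARE = 2;

    final int kind;                    // compared instead of using instanceof

    Shape(int kind) { this.kind = kind; }

    // Concrete default body where an abstract method would normally be declared.
    int sides() { return -1; }
}

final class Circle extends Shape {
    Circle() { super(KIND_CIRCLE); }
    int sides() { return 0; }
}

final class Square extends Shape {
    Square() { super(KIND_SQUARE); }
    int sides() { return 4; }
}

final class ShapeChecks {
    // Replaces "s instanceof Square".
    static boolean isSquare(Shape s) {
        return s.kind == Shape.KIND_SQUARE;
    }

    // Branches on the tag, so no virtual dispatch is needed at the call site.
    static int sidesOf(Shape s) {
        if (s.kind == Shape.KIND_SQUARE) return 4;
        if (s.kind == Shape.KIND_CIRCLE) return 0;
        return -1;
    }
}
```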
Finally, it seems that certain native calls cause some internal initialization functions to be called the first time they are executed. The method that caused the problem was Math.log, which is just a wrapper for a native method. The first execution must therefore happen outside of transactions, so I made sure that the method was called at the start of each thread. Also, to avoid JNI, I created my own wrapper method that calls the log() function through CNI.
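The warm-up call at the start of each thread could look something like the sketch below. The class name WarmedThread and the work() hook are hypothetical, and the CNI wrapper is not shown because it is written in C++ rather than Java. The class extends Thread directly instead of implementing Runnable, in keeping with the point about interfaces above.

```java
// Hypothetical thread class; work() stands in for the code that later runs in transactions.
class WarmedThread extends Thread {
    // Storing the result in a volatile field keeps the warm-up call from being optimized away.
    volatile double warmupResult;

    public void run() {
        // The first call happens here, before any transaction begins, so any one-time
        // initialization behind Math.log has already run by the time work() starts.
        warmupResult = Math.log(2.0);
        work();
    }

    // Placeholder for the transactional part of the thread.
    void work() { }
}
```

A benchmark would subclass WarmedThread, override work(), and start the thread as usual.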
Copyright © 2009 Borys Bradel. All rights reserved. This post is only my possibly incorrect opinion.