[cacao] hotspot under the microscope
Edwin Steiner
edwin.steiner at gmx.net
Mon Nov 13 21:35:50 CET 2006
Hi all!
As the Hotspot VM is now free software (may Sun live long and prosper!
;) I built myself a debugging hotspot at once to play with it.
Building is very easy, but I found out that the disassembler does not
seem to be supplied. So I quickly hacked in the cacao disassembler, and
it works! :)
Some interesting results:
*)
In _201_compress, there is this << 16 >>> 16 stuff in Code_Table.of.
Look what hotspot does with that:
0xb4d5f97a: 8b 69 08          mov    0x8(%ecx),%ebp
0xb4d5f97d: 8b 4d 08          mov    0x8(%ebp),%ecx         ; implicit exception: dispatches to 0xb4d5f9a9
0xb4d5f980: 3b d1             cmp    %ecx,%edx
0xb4d5f982: 73 10             jae    0x00000000b4d5f994     ;*saload
                                                            ; - spec.benchmarks._201_compress.Code_Table::of at 5
0xb4d5f984: 0f b7 44 55 0c    movzwl 0xc(%ebp,%edx,2),%eax  ;*iushr
                                                            ; - spec.benchmarks._201_compress.Code_Table::of at 11
See: The load, the ISHL, and the IUSHR all combined into a single movzwl
instruction. I wonder if they just did that for compress, or if this is
a common Java idiom...
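To see why a single zero-extending load is enough, here is a minimal sketch. It assumes the idiom in Code_Table.of is roughly a short array load followed by << 16 >>> 16; the class and method names are made up for illustration and are not the SPEC source. Shifting the sign-extended short left by 16 and then unsigned-right by 16 keeps exactly the low 16 bits, which is what movzwl produces straight from memory.

```java
public class ZeroExtendDemo {
    // Hypothetical stand-in for the pattern: the short is promoted to int
    // (sign-extended), then the two shifts clear the upper 16 bits again.
    static int ofWithShifts(short[] tab, int i) {
        return tab[i] << 16 >>> 16;
    }

    // The same result written as a mask, i.e. a zero-extension of the short.
    static int ofWithMask(short[] tab, int i) {
        return tab[i] & 0xFFFF;
    }

    public static void main(String[] args) {
        short[] tab = { 0, 1, -1, Short.MIN_VALUE, Short.MAX_VALUE };
        for (int i = 0; i < tab.length; i++) {
            System.out.println(tab[i] + " -> " + ofWithShifts(tab, i)
                    + " / " + ofWithMask(tab, i));
        }
    }
}
```

Both methods return the same value for every short, so a compiler is free to emit one zero-extending 16-bit load for either form.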
*)
They have a strange polling at RETURNs, which I don't yet understand.
Probably it only makes sense after the code is patched...
*)
They do a lot of CISC things. For example, you see things like obj.field +=
value compiled to "add %value, (mem)".
*)
String.hashCode has lots of loop unrolling. It seems to be a native
function, but I'm not entirely sure.
Looks like this, repeated all over (2 iterations shown):
0xb4d60978: 03 4c 24 28       add    0x28(%esp),%ecx
0xb4d6097c: 8b e9             mov    %ecx,%ebp
0xb4d6097e: c1 e5 05          shl    $0x5,%ebp
0xb4d60981: 2b e9             sub    %ecx,%ebp
0xb4d60983: 03 6c 24 2c       add    0x2c(%esp),%ebp
0xb4d60987: 8b cd             mov    %ebp,%ecx
0xb4d60989: c1 e1 05          shl    $0x5,%ecx
0xb4d6098c: 2b cd             sub    %ebp,%ecx
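The shl $0x5 / sub pair in each iteration is a multiply by 31 written as (h << 5) - h, which matches the documented String.hashCode recurrence h = 31*h + c. A minimal sketch of that equivalence follows; the class and method names are hypothetical, and the values at 0x28(%esp) and 0x2c(%esp) are presumably the characters being added, which is my reading and is not confirmed by the listing.

```java
public class HashStepDemo {
    // Straightforward form of the documented recurrence h = 31*h + c.
    static int hashMul(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same recurrence with the multiply strength-reduced to shift and
    // subtract, mirroring the shl/sub pair in the disassembly.
    static int hashShift(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "cacao";
        System.out.println(hashMul(s) == s.hashCode());
        System.out.println(hashShift(s) == s.hashCode());
    }
}
```

Integer overflow wraps the same way in both forms, so the two methods agree with String.hashCode for any input.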
*)
What's very interesting is their object header. It has only two words.
They swap out the header for locked objects, similar to what ElectricalFire
VM does (I posted links in the wiki about that earlier). Apart from a few
bits for GC and locking, the one header word is mostly the hash value.
The other word is the vftbl, of course.
They use a biased locking algorithm that does not need atomic operations
for the one thread it is biased towards.
-Edwin