This perl5 jitter is super-simple and modeled after B::CC.

The compiled perl5 optree is a linked list that lies in memory in non-execution
order, so walking it means wide-spread jumps. Additionally, the calls to the ops
are indirect. The jitter lays out the run-time calls linearly, in linked-list
"exec" order, so that the CPU can prefetch the next instructions, and it inlines
some simple ops. op_next targets (returned by false conditions) are favored over
op_other and other targets.
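
The difference between the run loop and the linear call sequence can be
sketched in plain C. This is only a toy model under stated assumptions: the OP
struct, pp_incr, build_chain, runops_loop and run_unrolled are hypothetical
stand-ins, not the real perl headers and not the code the jitter emits.

```c
#include <stddef.h>

/* Toy stand-in for perl's OP: only the two fields this sketch needs. */
typedef struct op OP;
struct op {
    OP *op_next;              /* next op in execution order */
    OP *(*op_ppaddr)(void);   /* function implementing this op */
};

OP *PL_op;                    /* current op, modeled as a plain global */
int counter;                  /* side effect so a run can be observed */

/* A trivial pp function: do some work, return the next op to run. */
OP *pp_incr(void) { counter++; return PL_op->op_next; }

/* Run loop: one indirect call per op, chasing op_next pointers. */
void runops_loop(OP *start) {
    PL_op = start;
    while ((PL_op = PL_op->op_ppaddr()) != NULL) { }
}

/* The jitted shape, written as C: direct calls laid out in exec order. */
void run_unrolled(OP *start) {
    PL_op = start;
    PL_op = pp_incr();
    PL_op = pp_incr();
    PL_op = pp_incr();
}

/* Hypothetical helper: a three-op chain stored in reverse memory order,
 * so exec order (ops[2], ops[1], ops[0]) differs from memory order. */
OP *build_chain(OP ops[3]) {
    ops[0].op_next = NULL;     ops[0].op_ppaddr = pp_incr;  /* runs last  */
    ops[1].op_next = &ops[0];  ops[1].op_ppaddr = pp_incr;
    ops[2].op_next = &ops[1];  ops[2].op_ppaddr = pp_incr;  /* runs first */
    return &ops[2];
}
```

Both functions visit the same ops in the same order; the unrolled form simply
has no loop and no indirect call, which is the property the jitter is after.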

IT DOES NOT WORK YET!
So far it only works for simple non-threaded functions. No subs, no maybranch ops.

A faster jitted execution path without the runops loop, selected with -MJit or,
later when stable, with perl -j.

All ops are unrolled in execution order for the CPU cache; instruction
prefetching is the main advantage of this approach. The perl5 runloop has
no chance to get cached at all.
For perl < 5.13 the ASYNC check is only done when necessary.

For now only implemented for x86 and amd64/x86_64 with certain 
hardcoded my_perl offsets when threaded.

C pseudocode

x86 not-threaded, PL_op in eax, PL_sig_pending temp in ecx

prolog:
	55                   	push   %ebp
	89 e5                	mov    %esp,%ebp
	53                   	push   %ebx
call:
	e8 xx xx xx xx		call   $PL_op->op_ppaddr #relative
save_plop:
	a3 xx xx xx xx       	mov    %eax,$PL_op

dispatch_getsig:
	8b 0d xx xx xx xx	mov    $PL_sig_pending,%ecx
dispatch:
	85 c9                	test   %ecx,%ecx
	74 05                	je     nextcall		# skip the 5-byte call
	e8 xx xx xx xx		call   Perl_despatch_signals #relative
epilog:
	b8 00 00 00 00       	mov    $0x0,%eax 	# clean PL_op
	5b                   	pop    %ebx
	5d               	pop    %ebp
	c3                   	ret
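
How such a byte sequence could be written into a code buffer can be sketched
in C. The opcode bytes are taken from the listing above; everything else is an
assumption made for illustration: the CodeBuf struct, the emit_* helpers, and
the use of 32-bit load addresses (base) are hypothetical and not the names used
in the XS code. The sketch only fills a buffer, it does not execute it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical code buffer: base is the address the code will run at. */
typedef struct {
    uint8_t *buf;
    size_t   cap;
    size_t   pos;
    uint32_t base;
} CodeBuf;

void emit_bytes(CodeBuf *cb, const uint8_t *src, size_t n) {
    assert(cb->pos + n <= cb->cap);      /* never write past the buffer */
    memcpy(cb->buf + cb->pos, src, n);
    cb->pos += n;
}

/* x86 immediates and displacements are stored little-endian. */
void emit_le32(CodeBuf *cb, uint32_t v) {
    uint8_t b[4] = { (uint8_t)v, (uint8_t)(v >> 8),
                     (uint8_t)(v >> 16), (uint8_t)(v >> 24) };
    emit_bytes(cb, b, 4);
}

/* prolog: push %ebp; mov %esp,%ebp; push %ebx */
void emit_prolog(CodeBuf *cb) {
    static const uint8_t code[] = { 0x55, 0x89, 0xe5, 0x53 };
    emit_bytes(cb, code, sizeof code);
}

/* call: e8 rel32, displacement relative to the instruction after the call */
void emit_call(CodeBuf *cb, uint32_t target) {
    static const uint8_t opc = 0xe8;
    uint32_t next = cb->base + (uint32_t)cb->pos + 5;
    emit_bytes(cb, &opc, 1);
    emit_le32(cb, target - next);        /* wraps mod 2^32, as on x86 */
}

/* save_plop: a3 abs32, mov %eax to the absolute address of PL_op */
void emit_save_plop(CodeBuf *cb, uint32_t plop_addr) {
    static const uint8_t opc = 0xa3;
    emit_bytes(cb, &opc, 1);
    emit_le32(cb, plop_addr);
}

/* epilog: mov $0x0,%eax; pop %ebx; pop %ebp; ret */
void emit_epilog(CodeBuf *cb) {
    static const uint8_t code[] = { 0xb8, 0, 0, 0, 0, 0x5b, 0x5d, 0xc3 };
    emit_bytes(cb, code, sizeof code);
}
```

The point worth noting is the relative call: the displacement depends on where
the call instruction itself ends up, so it has to be computed at emit time.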

If an op may branch (maybranch in Opcodes-0.04), also create the call of the
other op, compare PL_op before and after the call, and branch to the label of
the other op.
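
One possible reading of this branch handling, written as C instead of
assembler. It is a sketch under assumptions, not the implemented scheme: the OP
struct, cond, result, pp_cond, pp_then, pp_else, build_cond and run_jitted are
all hypothetical, and the comparison against op_next is just one way to detect
that the op branched.

```c
#include <stddef.h>

/* Toy stand-in for a branching OP with two possible successors. */
typedef struct op OP;
struct op {
    OP *op_next;              /* taken when the condition is false */
    OP *op_other;             /* taken when the condition is true  */
    OP *(*op_ppaddr)(void);
};

OP *PL_op;
int cond;                     /* toy condition tested by pp_cond */
int result;                   /* records which branch ran */

OP *pp_cond(void) { return cond ? PL_op->op_other : PL_op->op_next; }
OP *pp_then(void) { result = 1; return PL_op->op_next; }
OP *pp_else(void) { result = 2; return PL_op->op_next; }

/* Jitted shape as C: the op_next path is laid out first (fall-through),
 * the op_other path sits behind a label. */
void run_jitted(OP *cond_op) {
    OP *before;
    PL_op = cond_op;
    before = PL_op;                  /* PL_op before the call */
    PL_op = pp_cond();               /* PL_op after the call  */
    if (PL_op != before->op_next)    /* the op branched */
        goto other;
    PL_op = pp_else();               /* favored op_next path */
    goto done;
other:
    PL_op = pp_then();               /* op_other path */
done:
    return;
}

/* Hypothetical helper: ops[0] branches to ops[2] (true) or ops[1] (false). */
OP *build_cond(OP ops[3]) {
    ops[0].op_next = &ops[1]; ops[0].op_other = &ops[2]; ops[0].op_ppaddr = pp_cond;
    ops[1].op_next = NULL;    ops[1].op_other = NULL;    ops[1].op_ppaddr = pp_else;
    ops[2].op_next = NULL;    ops[2].op_other = NULL;    ops[2].op_ppaddr = pp_then;
    return &ops[0];
}
```

Keeping the op_next path as the fall-through matches the preference for
op_next targets described earlier.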

Problems
Far calls to the pp ops break code prefetching, so we have to inline as much as
possible, similar to B::CC. Only nextstate, enter, and skipping null ops are easy
to jit. The best jitter would be a B::CC-to-assembler backend, but this is hard
to get right.

Porting
I created the asm with cc_main and cc_main_nt; see the Makefile for the objdump
and cc_harness rules for gcc assembly.

ASM links

http://www.lxhp.in-berlin.de/lhplinks.html
http://blogs.msdn.com/freik/archive/2005/03/17/398200.aspx
http://msdn.microsoft.com/en-us/library/7kcdt6fy.aspx
http://asm.sourceforge.net//resources.html
http://www.intel.com/design/itanium/manuals/iiasdmanual.htm
http://www.heyrick.co.uk/assembler/qfinder.html

HL jitters

parrot
luajit
psyco / pypy
tracemonkey
ruby
clisp

JIT libs

lightning - C macros only
libjit - C lib
llvm - compiler framework + lib