Here are the 32-bit re-runs with the experimental siever (on a Phenom940):
side	lp	siever	yield1	sec/rel1	yield2	sec/rel2
alg	32	15e	6237	2.29344	6238	(1.01389 sec/rel)
rat	32	15e	8559	1.99026	8558	(0.87265 sec/rel)

alg	32	16e	13320	2.79733	13322	(1.29829 sec/rel)
rat	32	16e	17536	2.57357	17537	(1.16108 sec/rel)
The times per relation are simply proportional (both the CPU and the code are different).
The number of relations also differs by a few; this is a known effect.
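
For anyone who wants to check the two remarks above against the numbers, here is a quick Python sketch. The values are copied from the table; the names `rows` and `compare` are just mine for illustration, and I am only assuming that the first yield/time pair and the second (parenthesised) pair on each line belong to the two runs being compared.

```python
# Values copied from the table above:
# (side, siever, yield1, first sec/rel, yield2, second sec/rel)
rows = [
    ("alg", "15e", 6237, 2.29344, 6238, 1.01389),
    ("rat", "15e", 8559, 1.99026, 8558, 0.87265),
    ("alg", "16e", 13320, 2.79733, 13322, 1.29829),
    ("rat", "16e", 17536, 2.57357, 17537, 1.16108),
]

def compare(rows):
    """Return (side, siever, time ratio, yield difference) for each table line."""
    out = []
    for side, siever, y1, t1, y2, t2 in rows:
        out.append((side, siever, t1 / t2, y2 - y1))
    return out

results = compare(rows)
for side, siever, ratio, dy in results:
    print(f"{side} {siever}: time ratio {ratio:.3f}, yield difference {dy:+d}")
```

The ratio stays in a narrow band (roughly 2.15 to 2.28) on all four lines, and the yields differ by at most 2 relations.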
The memory footprint, though, for 15e and 16e was
586m (1031m virt.) for 15e
682m (1446m virt.) for 16e

Not as bad as with the old memory allocations. Not 4g/1g!

Note that I have not quite figured out all of the new Kleinjung code (lasieve5); there are many changes there... and there are 15e, 15f, 15g siever variants!
