Thanks for the testing, Zak!
Even with binaries that aren't optimized for the CPU, the Core2 is 40% (Step1) to over 60% (Step2) faster than a P4 Prescott, at least in this first test.
I wonder how the Core2 performs in relation to an Athlon64. I guess that currently the A64 will pull away (due to assembly support), but the Core2 could be a real competitor once the software is optimized.