Thread: Intel Xeon PHI?
2021-07-20, 04:03   #205
paulunderwood

Sep 2002
Database er0rr

53·73 Posts

Quote:
 Originally Posted by kriesel
 Xeon Phi does pretty well with 4 workers, one designated for P-1 on 12 GB of MCDRAM.
It's a tricky balancing act: more throughput versus the chance of finding a stage 2 factor. Here is the (edited) benchmark I ran:

Code:
Prime95 64-bit version 30.3, RdtscTiming=1
Timings for 6048K FFT length (64 cores, 1 worker):   Throughput: 218.85 iter/sec.
Timings for 6048K FFT length (64 cores, 2 workers):  Throughput: 432.09 iter/sec.
Timings for 6048K FFT length (64 cores, 4 workers):  Throughput: 602.49 iter/sec.
Timings for 6048K FFT length (64 cores, 8 workers):  Throughput: 654.40 iter/sec.
Timings for 6048K FFT length (64 cores, 16 workers): Throughput: 666.12 iter/sec.
Timings for 6048K FFT length (64 cores, 32 workers): Throughput: 693.20 iter/sec.
Timings for 6048K FFT length (64 cores, 64 workers): Throughput: 717.88 iter/sec.
I get ~10% more throughput by running 16 workers instead of 4, but then I don't get to do stage 2 P-1. Also, I am not prepared to wait ~4 months to turn around 64 candidates! Plus, the current HDD in the Phi is not very big.
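For anyone who wants to redo the arithmetic, here is a rough sketch in Python using only the throughput numbers from the benchmark above. The exponent size (`ASSUMED_EXPONENT`) is a hypothetical value picked for illustration, not something stated in the post, and the estimate assumes one iteration per bit of the exponent and that each worker gets an equal share of the total throughput. The function names are made up for this example.

```python
# Throughput figures (iter/sec) copied from the benchmark above,
# keyed by number of workers (64 cores, 6048K FFT length).
THROUGHPUT = {
    1: 218.85,
    2: 432.09,
    4: 602.49,
    8: 654.40,
    16: 666.12,
    32: 693.20,
    64: 717.88,
}

# ASSUMPTION: hypothetical exponent size, not given in the post.
# A primality test needs roughly one iteration per bit of the exponent.
ASSUMED_EXPONENT = 110_000_000

SECONDS_PER_DAY = 86_400


def gain_percent(workers, baseline):
    """Percentage throughput gain of `workers` over `baseline` workers."""
    return (THROUGHPUT[workers] / THROUGHPUT[baseline] - 1.0) * 100.0


def days_per_candidate(workers, iterations=ASSUMED_EXPONENT):
    """Estimated days for one worker to finish one candidate.

    ASSUMPTION: total throughput is split evenly between workers.
    """
    per_worker_rate = THROUGHPUT[workers] / workers  # iter/sec per worker
    return iterations / per_worker_rate / SECONDS_PER_DAY


if __name__ == "__main__":
    for w in THROUGHPUT:
        print(f"{w:2d} workers: {gain_percent(w, 4):+6.1f}% vs 4 workers, "
              f"~{days_per_candidate(w):6.1f} days per candidate")
```

With the assumed exponent, the 64-worker row comes out a little under four months per batch of 64 candidates, while 4 workers turn each candidate around in under ten days. Changing `ASSUMED_EXPONENT` scales the day counts linearly.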

Edit: I plan to install a 500 GB disk and run 64 workers. With mprime's nice runtime auto-tuning I should get 64 candidates done in ~96 days. Presumably I just have to alter some settings, and I can then continue with the work I have done so far as well as take on new work.

Last fiddled with by paulunderwood on 2021-07-20 at 06:21