Thread: Intel Xeon PHI?
2020-11-30, 20:00   #136
ewmayer
Sep 2002
República de California

3·7²·79 Posts

Originally Posted by kriesel
Mine also has the unusual power supply dimensions Ernst describes, and the physical mounting is unimpressive, involving 2 screws at one end and a zip tie at the other that still leaves it a bit wiggly.
Yeah, that was a rather shoddy job of PSU securement. Does your system also have the power plug which likes to disconnect at the slightest jiggling of the case?

Does the new Mlucas P-1 code support only Fermats, or also Mersennes?
When will the new code be available for others to use?
It will support both. As I noted before, there's very little difference between p-1 for the two kinds, assuming one has already got the specialized FFT-modmul routines for the 2 different moduli in place. I got basic p-1 Stage 1 code working last week; this week I'm working on properly integrating that code into the production-mode front end and modifying the savefile-restart mechanism for p-1. Once that's in place, firing up a p-1 Stage 1 for F33 on the KNL should be a relatively trivial matter, then I'll let that run and run while I work on Stage 2 code. No specific timeframe I can give at present. I hope to have all the p-1 work done by EOY, then move on to the other major new feature for v20, PRP-proof support. I will likely make the v20-with-p-1-only-added code available for build&test while I work on PRP-proof support.
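
To illustrate why the two cases differ so little, here is a toy p-1 Stage 1 sketch in Python. It is not Mlucas code: it uses Python's built-in big-integer pow() in place of FFT-modmul, and the names pm1_stage1, primes_upto and the `extra` parameter are made up for this example. The only modulus-specific piece is the known factor form that gets folded into the exponent: 2kp+1 for factors of 2^p-1, and k·2^(m+2)+1 for factors of F_m (m ≥ 2).

```python
from math import gcd

def primes_upto(n):
    """Simple sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def pm1_stage1(N, B1, extra=1, base=3):
    """Toy p-1 Stage 1: raise base to (extra * product of prime powers <= B1)
    modulo N, then take gcd(result - 1, N). 'extra' carries the part of q-1
    that is known in advance from the form of the factors q."""
    x = pow(base, extra, N)
    for q in primes_upto(B1):
        qk = q
        while qk * q <= B1:      # largest power of q not exceeding B1
            qk *= q
        x = pow(x, qk, N)
    return gcd(x - 1, N)

# Mersenne case: factors of 2^p - 1 have the form 2*k*p + 1.
p = 67
g_mersenne = pm1_stage1(2**p - 1, B1=3000, extra=2 * p)

# Fermat case: factors of F_m = 2^(2^m) + 1 have the form k*2^(m+2) + 1.
m = 5
g_fermat = pm1_stage1(2**(2**m) + 1, B1=5, extra=2**(m + 2))
```

In both calls the result is divisible by a known small factor (193707721 for M67, 641 for F5), because q-1 for those factors divides the Stage 1 exponent. The same routine serves both moduli; only `extra` and the modular multiply underneath change.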

Mlucas on CentOS: 170ms/iter x four 16-thread instances of 64M fft length corresponds to 4 x 1000 / 170 = 23.53 iters/sec throughput on 64 of the 68 cores. (At what average clock rate?)
Here is the first entry in my /proc/cpuinfo system file:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 87
model name	: Intel(R) Xeon Phi(TM) CPU 7250 @ 1.40GHz
stepping	: 1
microcode	: 0x1b0
cpu MHz		: 1501.193
cache size	: 1024 KB
Or were you perhaps referring to possible auto-downclocking-under-load? If that is a possibility, I'll have to dig out how to get the "live GHz" numbers under Linux.
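
One simple way to get at the per-core numbers, as a minimal sketch: the "cpu MHz" field shown above is reported separately for each logical processor in /proc/cpuinfo, and on x86 Linux it generally tracks a recent clock sample rather than the nominal rating, so re-reading the file while the job runs gives a rough "live" figure. The function names here are hypothetical, and I have not verified how the KNL behaves under load.

```python
def cpu_mhz_values(cpuinfo_text):
    """Return the 'cpu MHz' value of every processor entry in the given text."""
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            values.append(float(value))
    return values

def average_live_mhz(path="/proc/cpuinfo"):
    """Read the file once and average the reported clock over all logical CPUs."""
    with open(path) as f:
        values = cpu_mhz_values(f.read())
    return sum(values) / len(values) if values else None
```

Calling average_live_mhz() every few seconds during a run, and comparing against the idle value, should show whether the clock drops under load.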

Prime95 on Windows 10 at the same 64M fft length benchmarked as 25.37 iters/sec throughput on all 68 cores. Straight line interpolating down to 64 cores would give 22.15 iters/sec, which is probably a bit pessimistic. The indicated throughput of Mlucas vs. prime95 is within 8%, one way or the other.
Thanks, closer than I'd hoped. My box is still a ways away from that total throughput, as I'm currently running F30@64M on 64 cores, wasting perhaps half the max. achievable FLOPS. But I expect to better that soon.
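
For reference, the throughput arithmetic quoted above can be sketched as follows. The helper name is made up for this example, and the comparison is against the unscaled 68-core Prime95 figure, with no interpolation to 64 cores.

```python
def total_throughput(ms_per_iter, instances):
    """Aggregate iterations per second for several identical instances."""
    return instances * 1000.0 / ms_per_iter

mlucas = total_throughput(170, 4)   # four 16-thread instances at 170 ms/iter
prime95 = 25.37                     # quoted 68-core Prime95 figure, iters/sec

# Relative gap, measured against the Mlucas figure.
gap = abs(prime95 - mlucas) / mlucas
print(f"Mlucas: {mlucas:.2f} iters/sec, gap vs. Prime95: {gap:.1%}")
```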

Do you have any watts-at-wall numbers for your system, idle and under load? All my wattmeters are currently hooked up to GPU-hosting systems which I don't want to unplug.

Last fiddled with by ewmayer on 2020-11-30 at 20:01