mersenneforum.org AVX512 hardware recommendations?

2020-05-25, 22:04   #12
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·3²·127 Posts

Quote:
 Originally Posted by ewmayer Found a "like new" system matching Ken's specs above for $255 and free-shipping on Amazon - ordered, will likely ditch the Win10 install for a clean Ubuntu 19.10 one, or perhaps co-install the latter. It will be interesting to compare the throughput with that of an AVX2 build on my venerable dual-core Broadwell NUC, which is ~1/2" lower-profile due to it using an M2 module versus the SSD on the just-ordered one. (But a 1TB SSD is a nice chunk of new storage - hell, that is worth over $100 by itself.) Question: Does it make sense to use the Radeon 540 GPU on these for either TF or LL/PRP testing? Getting some decent GIMPS work from that would be a nice bonus.
The one I posted about has a 1TB 2.5" spinning HDD, not an SSD.
https://www.techpowerup.com/gpu-specs/radeon-540.c3419 gives some idea of gpu specs. Slower than a GTX480, and much less power draw.

2020-05-26, 02:54   #13
PhilF

Feb 2005
Colorado

523 Posts

I don't know if this is interesting to you guys or not, but I have occasionally been able to get Skylake CPUs on colab on an unpaid account. I'm pretty sure it supports AVX512.

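
For anyone who wants to confirm AVX-512 support on a borrowed Linux instance rather than guess from the CPU name, here is a minimal sketch in C. It assumes a Linux-style /proc/cpuinfo whose "flags" line lists avx512f (AVX-512 Foundation) on capable CPUs; the function names line_has_flag and cpuinfo_has_flag are made up for illustration and are not part of any GIMPS program.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-delimited word in `line`, else 0. */
int line_has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    if (len == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or right after a separator... */
        int starts_word = (p == line) || p[-1] == ' ' || p[-1] == '\t' || p[-1] == ':';
        /* ...and must end at a separator or at the end of the line. */
        char after = p[len];
        int ends_word = (after == '\0' || after == ' ' || after == '\t' || after == '\n');

        if (starts_word && ends_word)
            return 1;
        p += len;
    }
    return 0;
}

/* Scan a cpuinfo-style file for a "flags" line containing `flag`.
   Returns 1 if found, 0 if not found, -1 if the file cannot be opened. */
int cpuinfo_has_flag(const char *path, const char *flag)
{
    char line[8192];
    FILE *fp = fopen(path, "r");
    int found = 0;

    if (fp == NULL)
        return -1;
    while (!found && fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "flags", 5) == 0 && line_has_flag(line, flag))
            found = 1;
    }
    fclose(fp);
    return found;
}
```

Calling cpuinfo_has_flag("/proc/cpuinfo", "avx512f") from a small main() would then report whether the instance you were handed can run an AVX-512 build. The whole-word match is deliberate, so that a longer flag name that merely starts with the same letters is not mistaken for the one you asked about.
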
2020-05-26, 18:49   #14
ewmayer
∂²ω=0

Sep 2002
República de California

9,791 Posts

Quote:
 Originally Posted by PhilF I don't know if this is interesting to you guys or not, but I have occasionally been able to get Skylake CPUs on colab on an unpaid account. I'm pretty sure it supports AVX512.
Thanks - that would've possibly been interesting to me until yesterday, but now I have a cute li'l NUC on the way, no account/internet hassle needed. :)

In retrospect, realizing that I have a need for an AVX512 build system now that the GIMPS KNL is gone, I should've considered getting a low-end Skylake-X cpu/mobo for my new multi-GPU build. Haven't tried to cost things out in detail, but I suspect the $275 (incl. tax) I paid for the NUC could've instead paid for an upgrade of my bargain dual-core Celeron cpu/mobo ($120) to a Skylake-X quad - perhaps someone can tell me if $400 can get one such a quad cpu/mobo bundle. But, water under the bridge. And it'll be really interesting to head-to-head compare the throughput and wattage of the new NUC to my old (and still crunching steadily away) AVX2 NUC which Mike/Xyzzy built for me way back when.

BTW, I linked to the system I found because the seller apparently has multiple ones for sale, in case anyone else wants one.

Last fiddled with by ewmayer on 2020-05-27 at 19:30

2020-05-26, 19:20   #15
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·3²·127 Posts

Quote:
 Originally Posted by PhilF I don't know if this is interesting to you guys or not, but I have occasionally been able to get Skylake CPUs on colab on an unpaid account. I'm pretty sure it supports AVX512.
This is a fine idea for those who would like to try before buying, and can afford to wait for one to show up. Those who don't have the time or inclination to hit the cpu or gpu roulette jackpot and need reliable access of long duration go shopping.

2020-05-26, 23:25   #16
mackerel

Feb 2016
UK

389₁₀ Posts

Quote:
 Originally Posted by ewmayer In retrospect, realizing that I have a need for an AVX512 build system now that the GIMPS KNL is gone, I should've considered getting a low-end Skylake-X cpu/mobo for my new multi-GPU build. Haven't tried to cost things out in detail, but I suspect the $275 (incl. tax) I paid for the NUC could've instead paid for an upgrade of my bargain dual-core Celeron cpu/mobo ($120) to a Skylake-X quad - perhaps someone can tell me if $400 can get one such a quad cpu/mobo bundle.
I'm not aware of a quad core Skylake-X CPU. The range started with the 6 core 7800X, which I used to have. At the time it was current, they were often found cheaper than the consumer 6 core 8700k (not considering mobo cost). The quad core parts on X299 platform were Kaby Lake-X and did not have AVX-512. As for pricing, that'll depend on your local used market. The mobos aren't exactly cheap either.

2020-05-27, 11:56   #17
Xyzzy

"Mike"
Aug 2002

7,699 Posts

Quote:
 Originally Posted by ewmayer …now that the GIMPS KNL is gone…
Gone?

2020-05-27, 14:06   #18
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·3²·127 Posts

Quote:
 Originally Posted by Xyzzy Gone?
KNL as a whole is gone in a sense;
"Intel announced they were discontinuing Knights Landing in summer 2018."
https://en.wikipedia.org/wiki/Xeon_Phi#Knights_Landing

2020-05-27, 19:10   #19
ewmayer
∂²ω=0

Sep 2002
República de California

23077₈ Posts

Quote:
 Originally Posted by mackerel I'm not aware of a quad core Skylake-X CPU. The range started with the 6 core 7800X, which I used to have. At the time it was current, they were often found cheaper than the consumer 6 core 8700k (not considering mobo cost). The quad core parts on X299 platform were Kaby Lake-X and did not have AVX-512. As for pricing, that'll depend on your local used market. The mobos aren't exactly cheap either.
Ah, thanks - cheapest I found on eBay (without looking terribly hard) used hex-core matching your model was $420, gotta figure a couple hundred $ more for mobo+memory. So the NUC is definitely the cheapest way to get AVX-512, and may even prove to be not-entirely uncompetitive with the Big-Boy-rigs on a $/FLOP basis. With just 2 cores there seems to be less risk of idling due to data-bus contention. I plan to just focus on builds of my own code, but Ken can post Prime95 numbers on his NUC once he gets it, for comparison to the hex-core systems.

Quote:
 Originally Posted by Xyzzy Gone?
David Stanfill, who kindly volunteered to physically host the GIMPS crowdfunded KNL, went AWOL last year, and my multiple attempts to reach him via various routes proved futile. Thankfully, Ryan Propper offered to step in and resume the pair of F30 Pepin tests I had running on DS' gear (one @64M FFT on the KNL using an AVX-512 build of Mlucas, one @60M FFT on a 32-core Xeon server he had using an AVX2 build). Feel free to try to locate David, anyone - the KNL belongs to GIMPS, it would be nice to have it available for the purpose for which GIMPS members bought it.

Last fiddled with by ewmayer on 2020-05-27 at 19:11

2020-05-28, 09:38   #20
ldesnogu

Jan 2008
France

1000010000₂ Posts

Quote:
 Originally Posted by ewmayer With just 2 cores there seems to be less risk of idling due to data-bus contention.
Only two cores and half the FMA if I remember correctly

2020-05-31, 16:56   #21
kriesel

"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

2²·3²·127 Posts
i3-8121U and RX540 performance on Win 10 in my NUC

Quote:
 Originally Posted by ewmayer Question: Does it make sense to use the Radeon 540 GPU on these for either TF or LL/PRP testing? Getting some decent GIMPS work from that would be a nice bonus.
The RX540 in my refurb NUC gets ~112GHzD/day in TF, 10.3msec/iter in 56M gpuowl v6.11-292 PRP, while the i3-8121U does 7.8 msec/iter in prime95 29.8b6. (There was some severe system instability in the first day, but that has ceased, perhaps because the room is a few C less hot). We'll see how it fares when hotter weather returns soon.

An extensive prime95 benchmarking tabulation of the i3-8121u is posted as an attachment of https://www.mersenneforum.org/showpo...19&postcount=5
I'd be very interested in comments from George or Ernst on the variations in the later plots, especially pages 5 and 6 as currently formatted.
All the preceding was obtained with stock configuration (no overclocking, voltage adjustments, etc.)
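
To put msec/iter figures like these in perspective, here is a rough sketch converting a per-iteration time into an estimated test duration. It assumes an LL or PRP test of exponent p needs about p iterations and that the per-iteration time stays constant for the whole run; the name days_for_test is illustrative only.

```c
#include <stdint.h>

/* Estimated wall-clock days needed for `iterations` iterations
   when each iteration takes `ms_per_iter` milliseconds. */
double days_for_test(double ms_per_iter, uint64_t iterations)
{
    const double ms_per_day = 1000.0 * 60.0 * 60.0 * 24.0;

    return ms_per_iter * (double)iterations / ms_per_day;
}
```

For a hypothetical exponent of 100,000,000 (chosen only as a round number, not the size at which the timings above were taken), days_for_test(7.8, 100000000) comes to about 9.0 days and days_for_test(10.3, 100000000) to about 11.9 days. Per-iteration time depends on FFT length, so real estimates should use a timing measured at the exponent in question.
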

Last fiddled with by kriesel on 2020-05-31 at 17:17

2020-05-31, 19:50   #22
ewmayer
∂²ω=0

Sep 2002
República de California

263F₁₆ Posts

Quote:
 Originally Posted by kriesel The RX540 in my refurb NUC gets ~112GHzD/day in TF, 10.3msec/iter in 56M gpuowl v6.11-292 PRP, while the i3-8121U does 7.8 msec/iter in prime95 29.8b6. (There was some severe system instability in the first day, but that has ceased, perhaps because the room is a few C less hot). We'll see how it fares when hotter weather returns soon.
That seems quite promising in terms of getting useful work out of both CPU and GPU ... did you observe the TDP for these 3 states?

1. System powered up but otherwise idle;
2. Prime95 running in max-throughput configuration;
3. Both Prime95 and gpuowl running in max-throughput configuration.

Re. heat, did you try popping the plastic top panel like I suggested? My i3-NUC has yet to arrive, but if the chassis is designed similarly to my Broadwell NUC, there's a flat sheet-metal panel underneath the plastic cap which can serve as a radiator, and is also a tempting target for affixing a heatsink, possibly with fan.

Quote:
 An extensive prime95 benchmarking tabulation of the i3-8121u is posted as an attachment of https://www.mersenneforum.org/showpo...19&postcount=5 I'd be very interested in comments from George or Ernst on the variations in the later plots, especially pages 5 and 6 as currently formatted. All the preceding was obtained with stock configuration (no overclocking, voltage adjustments, etc.)
My, you've been busy. :) Not sure what I'm supposed to be seeing in the plots on pages 5 and 6 - in 5 you do best-fits using a monomial which gives an x^(-1.08...) best-fit behavior; I would be interested in seeing how that compares to a best-fit of a straight line to the data scaled as (iters/sec)*(n log n), i.e. the expected throughput based on FFT opcount.

The Bits/word plot on page 6, are the '+' data points just reflecting the internal bits/word thresholds of the program, and you found a log-based formula which gives a nice fit to those? That could be useful. Again by way of an alternate fit, I'd be interested in seeing how the data compare to the formula I use in Mlucas, in the code-box below, with Wbits being the output you want. Note that AsympConst is a tunable parameter which moves the entire resulting curve up (allowing larger p at each N) or down. It enters the formula with a minus sign, so if e.g. my 0.6 setting gives a curve whose shape matches your data points but which is a bit below them, try 0.5, 0.4, etc. In regression form we'd be wanting the regression to tell us the best-fit value of said parameter.
Code:
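#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for the Mlucas integer typedefs, so the snippet compiles on its own. */
typedef uint32_t uint32;
typedef uint64_t uint64;

/* Given FFT length N, return the largest exponent p that length can handle. */
uint64 given_N_get_maxP(uint32 N)
{
    const double Bmant = 53;         /* significand bits of an IEEE double */
    const double AsympConst = 0.6;   /* tunable; larger values lower Wbits */
    const double ln2inv = 1.0/log(2.0);
    double ln_N, lnln_N, l2_N, lnl2_N, l2l2_N, lnlnln_N, l2lnln_N;
    double Wbits, maxExp2;

    ln_N     = log(1.0*N);           /* ln N */
    lnln_N   = log(ln_N);            /* ln ln N */
    l2_N     = ln2inv*ln_N;          /* log2 N */
    lnl2_N   = log(l2_N);            /* ln log2 N */
    l2l2_N   = ln2inv*lnl2_N;        /* log2 log2 N */
    lnlnln_N = log(lnln_N);          /* ln ln ln N */
    l2lnln_N = ln2inv*lnlnln_N;      /* log2 ln ln N */

    /* Average bits per FFT word */
    Wbits = 0.5*( Bmant - AsympConst - 0.5*(l2_N + l2l2_N) - 1.5*(l2lnln_N) );
    maxExp2 = Wbits*N;
    /*
    fprintf(stderr,"N = %8u K  maxP = %10u\n", N>>10, (uint32)maxExp2);
    */
    return (uint64)maxExp2;
}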
Sounds like you're having fun, in your own distinctive Krieselian "data ... must have ... more data" fashion. :)

Last fiddled with by ewmayer on 2020-05-31 at 19:51
