mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Hardware (https://www.mersenneforum.org/forumdisplay.php?f=9)
-   -   AlderLake anyone? (https://www.mersenneforum.org/showthread.php?t=27112)

Prime95 2021-08-31 00:53

AlderLake anyone?
 
Does anyone here have access to an Alder Lake system?

I know prime95 will have some difficulties with the big.little architecture. I'd like to code up some upgrades and have someone test them. If under NDA, I think we can discuss bugs without discussing the architecture specifics.

JWNoctis 2021-08-31 07:27

Off topic, but... it's been a while, and I'm kind of surprised that nobody here has brought up the fact that, for whatever reason, Alder Lake apparently won't have AVX-512 support in consumer/client-level processors, including the entire Core lineup.

To my understanding, that would have a non-trivial impact on Prime95 performance for those who still intend to crunch on their personal machines and are planning to buy one.

It would be interesting to see some comparisons for sure.

elmor 2021-09-03 06:47

Hello, I can assist you with testing. You can contact me on the email address I registered with.

kriesel 2021-09-03 12:23

[QUOTE=JWNoctis;586909]for whatever reason, Alder Lake apparently won't have AVX-512 support in consumer/client-level processors, including the entire Core lineup.

Which to my understanding would have a non-trivial impact on performance for Prime95[/QUOTE]It also means that for mprime / prime95, the top of the mersenne.org exponent space will be out of reach. AVX2/FMA3 fft lengths implemented max out at 920.8M.
[url]https://en.wikipedia.org/wiki/Alder_Lake_(microprocessor)[/url]

Zhangrc 2021-09-03 12:53

[QUOTE=kriesel;587147]
[url]https://en.wikipedia.org/wiki/Alder_Lake_(microprocessor[/url])[/QUOTE]

Off topic, but the right bracket outside the URL tag causes confusion.

You always tell people to check spelling and grammar for every post.

But we must accept that even the best of us make typos and mistakes.

The link should be: (A futile attempt to make this post more relevant :sad:)

[url]https://en.wikipedia.org/wiki/Alder_Lake_(microprocessor)[/url]

axn 2021-09-03 13:52

[QUOTE=kriesel;587147]AVX2/FMA3 fft lengths implemented max out at 920.8M. [/QUOTE]

It is trivial to extend FFT lengths. But only masochists and morons would test these high exponents with today's technology.

kruoli 2021-09-03 13:55

[QUOTE=Zhangrc;587148]The link should be: (A futile attempt to make this post more relevant :sad:)

[url]https://en.wikipedia.org/wiki/Alder_Lake_%28microprocessor%29[/url][/QUOTE]

Here you go. :smile: Replace brackets with their URL escape equivalent.
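
The escaping described here can be reproduced with a minimal sketch, assuming Python's standard `urllib.parse.quote`. The helper name `escape_wiki_title` is made up for illustration.

```python
from urllib.parse import quote


def escape_wiki_title(title: str) -> str:
    """Percent-encode a page title so it survives naive URL parsers.

    quote() leaves letters, digits and "_.-~" untouched and percent-encodes
    everything else, so "(" becomes %28 and ")" becomes %29.
    """
    return quote(title, safe="")


base = "https://en.wikipedia.org/wiki/"
url = base + escape_wiki_title("Alder_Lake_(microprocessor)")
print(url)  # https://en.wikipedia.org/wiki/Alder_Lake_%28microprocessor%29
```

With no literal brackets left in the address, a misplaced closing bracket next to the [url] tag can no longer be mistaken for part of the link.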

kriesel 2021-09-03 14:11

[QUOTE=Zhangrc;587148]You [STRIKE]always[/STRIKE] tell people to check spelling and grammar
[/QUOTE]My bad, I thought a copy/paste of a URL that had already worked for me would be no problem, but apparently it got clobbered.
[QUOTE=axn;587153]It is trivial to extend FFT lengths.[/QUOTE]Then it should be easy for you to help resolve the known server issue: with its SSE2 instruction set, the server cannot handle PRP proof files for exponents >595M, which require FFT lengths longer than 32M. Such exponents can be fully PRP tested in ~7 weeks (up to ~4-5 months for ~1G) on Radeon VIIs, including proof file generation, for various test/QA purposes, and benchmarked much more quickly to check run-time scaling empirically.
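
The link between exponent size and FFT length can be sketched as a rough calculation: an exponent p spread over an FFT of N words puts about p/N bits in each word. The snippet below is only a back-of-the-envelope illustration. It assumes "32M" means 32 * 2^20 words and "595M" means roughly 595 million; the function name is hypothetical and is not taken from any of the programs discussed.

```python
def bits_per_word(exponent: int, fft_length: int) -> float:
    """Average number of bits of 2^exponent - 1 carried by each FFT word."""
    return exponent / fft_length


M = 2**20  # assumption: FFT lengths quoted in "M" are multiples of 2^20 words

# Figures quoted in the post above: exponents around 595M at a 32M FFT.
print(round(bits_per_word(595_000_000, 32 * M), 2))  # 17.73
```

Larger exponents at the same FFT length push this average higher, which is why a longer FFT is needed past some point. The exact cutoff depends on the implementation's rounding-error behaviour and is not derived here.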

axn 2021-09-03 14:55

[QUOTE=kriesel;587157]Then it should be easy for you [/QUOTE]
Me? No. George, yes. But last I checked he's neither a masochist nor a moron.

[QUOTE=kriesel;587157]PRP proof files for exponents >595M requiring longer than 32M fft length. Such exponents can be fully PRP tested in ~ 7 weeks (to ~ 4-5 months for ~1G) on Radeon VIIs including proof file generation, for various test/QA purposes, and benchmarked much quicker for checking run time scaling empirically.[/QUOTE]

None of these are valid activities in 2021 (IMO). Maybe in 2041, perhaps?

kriesel 2021-09-03 15:36

[QUOTE=axn;587158]George, yes. But last I checked he's neither a masochist nor a moron.[/QUOTE]I don't believe Ernst is either, yet ffts up to 512M are now available in Mlucas. (Mainly I think due to Ernst's plan to test F33.) Nor Mihai either, yet ffts up to 120M length and sometimes higher have been available in gpuowl for years. For that matter, CUDALucas had 128M fft length several years ago.

I test while I still can. I'm of an age that, assuming I'm still above the sod then, my abilities may have declined considerably in another 20 years. Such testing has already identified the server's current limitation, and some client software issues.

Production running is typically wavefronts only. Experimenting with testing widely keeps it interesting. Finding issues early gives George et al. maximum time to perhaps address them, before they become an issue at the wavefront, and before their abilities decline due to aging, or whatever calamity might befall them.
[QUOTE]None of these are valid activities in 2021 (IMO). Maybe in 2041, perhaps?[/QUOTE]But we digress. Looking forward to seeing what Alder Lake can bring to the party.
Welcome to the forum Elmor.

mackerel 2021-09-05 08:21

[QUOTE=JWNoctis;586909]Off topic, but...it's been a while, and I'm kind of surprised that nobody here had brought up the fact that, for whatever reason, Alder Lake apparently won't have AVX-512 support in consumer/client-level processors, including the entire Core lineup.

Which to my understanding would have a non-trivial impact on performance for Prime95, for those who still intend to crunch on their personal machine and planning to buy one of those.[/QUOTE]

A possible explanation I've seen is that both core types need to share a common feature set. As threads get moved around, you can't end up with a situation where code needs AVX-512 but the E cores can't provide it. The solution for now is to disable AVX-512 wherever E cores are present. Unfortunately, they have also said you can't get AVX-512 back by disabling the E cores and running only the P cores. I had hoped something smarter could be done at the OS level, where, for example, code needing AVX-512 would only be run on P cores and not touch the E cores at all.
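
The "common feature set" idea can be sketched as a set intersection: software that may be scheduled on any core can only rely on instructions that every core supports. This is a toy illustration only. The core names, the flag sets and both function names are made up for the example and do not describe how prime95 or any OS scheduler actually works.

```python
def common_features(per_core_flags: dict[str, set[str]]) -> set[str]:
    """Return the instruction-set flags supported by every core."""
    flag_sets = list(per_core_flags.values())
    return set.intersection(*flag_sets) if flag_sets else set()


def pick_code_path(flags: set[str]) -> str:
    """Choose the widest vector code path that the given flags allow."""
    if "avx512f" in flags:
        return "AVX-512"
    if {"avx2", "fma"} <= flags:
        return "FMA3"
    return "SSE2"


# Hypothetical hybrid chip: one P core with AVX-512, one E core without it.
cores = {
    "P0": {"sse2", "avx2", "fma", "avx512f"},
    "E0": {"sse2", "avx2", "fma"},
}

print(pick_code_path(common_features(cores)))  # FMA3
print(pick_code_path(cores["P0"]))             # AVX-512
```

The second call shows the alternative hoped for above: if threads were restricted to P cores, the wider path would be usable on those cores alone.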

For consumer-level CPUs, with bigger FFTs we're well into the region where RAM bandwidth is the limiting factor. In that scenario, the loss of AVX-512 has less of an impact, and faster DDR5 speeds would probably result in a net performance increase.
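
A rough way to see why bigger FFTs end up limited by RAM is to compare the size of the FFT data with the cache size. The sketch below assumes one 8-byte double per FFT word and ignores auxiliary tables such as weights and twiddle factors. The 30 MiB cache size is a placeholder and not a figure for any specific CPU.

```python
BYTES_PER_DOUBLE = 8
MIB = 2**20


def fft_data_bytes(fft_length: int) -> int:
    """Size of the main FFT data array, assuming one double per word."""
    return fft_length * BYTES_PER_DOUBLE


def fits_in_cache(fft_length: int, cache_bytes: int) -> bool:
    """True if the FFT data alone fits in a cache of the given size."""
    return fft_data_bytes(fft_length) <= cache_bytes


cache = 30 * MIB  # hypothetical last-level cache size

for length in (1 * MIB, 32 * MIB):
    size_mib = fft_data_bytes(length) // MIB
    print(f"{length // MIB}M FFT: {size_mib} MiB, fits in cache: {fits_in_cache(length, cache)}")
```

Once the data no longer fits, every pass has to stream it from main memory, so memory speed matters more than vector width.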

Where the loss of AVX-512 will be felt more is for those like me who focus on smaller FFTs that are not limited by ram speeds. The open question remains, just what sort of FMA performance can the E cores provide? It might be a case of quantity over quality to provide a possible net positive in performance.


All times are UTC.