mersenneforum.org

mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

James Heinrich 2020-10-20 12:34

Tangential to this subject, I'll mention for completeness: the credit given by PrimeNet and the credit displayed on mersenne.ca will likely differ when a factor is found (and the bit level is not completed), since PrimeNet assumes the factor was found with Prime95 (or something using an equivalent number of classes), while mersenne.ca assumes it was found with mfaktc. On average it all balances out, but the credit for a specific factor may differ slightly.
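
As a rough illustration of why the two sites can disagree, here is a minimal sketch that prorates a bit level's credit by the fraction of classes assumed to have been processed when the factor turned up. The function name, the linear proration rule, and the class counts in the usage lines are all assumptions for illustration; the actual formulas used by PrimeNet and mersenne.ca are not shown in this thread.

```python
def prorated_credit(full_credit, classes_done, classes_total):
    """Credit for a partly searched bit level.

    Assumes credit scales linearly with the number of classes processed.
    This is an illustrative rule, not either site's real formula.
    """
    if not 0 < classes_done <= classes_total:
        raise ValueError("classes_done must be between 1 and classes_total")
    return full_credit * classes_done / classes_total


# Hypothetical numbers: the same factor, found about 10% of the way into
# the bit level, accounted for with a coarse and a fine class split.
coarse = prorated_credit(100.0, 2, 16)    # factor falls in the 2nd of 16 classes
fine = prorated_credit(100.0, 96, 960)    # factor falls in the 96th of 960 classes
print(coarse, fine)
```

With a coarse split the work is rounded up to a larger share of the bit level, so the two accountings of one factor can differ even though both follow the same rule.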

axn 2020-10-20 12:39

[QUOTE=James Heinrich;560405]since primenet assumes the factor was found with prime95 [/QUOTE]

Can't it be made to assume otherwise?

James Heinrich 2020-10-20 12:41

[QUOTE=axn;560408]Can't it be made to assume otherwise?[/QUOTE]It likely can, but I'll need to tread carefully with that code. I'll see how possible it is (especially since my previous fix for full-bitrange factors doesn't appear to have worked as intended).

LaurV 2020-10-21 06:52

[QUOTE=kriesel;560404]edit: But interestingly,<link> shows full credit[/QUOTE]
[offtopic]
The link is pointless: that's your private page, which nobody else can see. Therefore, please provide us your password so we can have a look.
[/offtopic]
[edit: that was a joke, don't PM me your password :razz:]

kriesel 2020-10-21 07:59

2 Attachment(s)
Compare the credits given for other recent 74-75 bit factors found with the most recent one, and the 75-76 credit with the proper 67.1 GHzD credit for completing the bit level. Note that had they been performed with StopAfterFactor=2 (finish the class), reduced credit would have been appropriate, and the listings here would include asterisks. These TF runs were all done with StopAfterFactor=1 (finish the bit level), so there are no asterisks.
The difference can be a credit loss of over 90%.
It can be rather significant at an 80-86 bit final level on large exponents.
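
To make the StopAfterFactor distinction concrete, here is a minimal sketch of how much of a bit level has actually been searched when a factor is found under each setting. The function name, the position argument, and the class count of 960 in the usage lines are assumptions for illustration only; the setting values 1 and 2 are used as described above, and any other value is treated as the whole level being searched.

```python
def bit_level_fraction_searched(stop_after_factor, factor_class_position, classes_total):
    """Fraction of the bit level searched by the time the run stops.

    stop_after_factor == 2: stop once the class containing the factor is done,
    so only the classes up to that one have been searched.
    Any other value: the run finishes the bit level, so all of it is searched.
    factor_class_position is 1-based (hypothetical bookkeeping).
    """
    if not 0 < factor_class_position <= classes_total:
        raise ValueError("factor_class_position must be between 1 and classes_total")
    if stop_after_factor == 2:
        return factor_class_position / classes_total
    return 1.0


# Hypothetical run: factor found in the 96th of 960 classes.
full = bit_level_fraction_searched(1, 96, 960)     # whole bit level searched
partial = bit_level_fraction_searched(2, 96, 960)  # only a tenth searched
print(full, partial)
```

Under this reading, a run with StopAfterFactor=1 has done the full bit level's work, so crediting it as if it had stopped early would produce exactly the kind of large shortfall described above.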

Neutron3529 2020-10-21 08:26

1 Attachment(s)
[QUOTE=James Heinrich;558009]A NF result is probably what I'm looking for. Ideally I'd want to know the clockspeed the GPU was running at during the run as well, but a completed NF run is a good start.
[URL]https://www.mersenne.ca/mfaktc.php#benchmark[/URL][/QUOTE]
I bought an RTX 3090.
Here are my results:
[ATTACH]23580[/ATTACH]
(only the last result is uploaded)

I got ~5500 GHz-d/day rather than ~5200.
The exact GPU I bought is `GeForce RTX 3090 VENTUS 3X 24G OC`, which can easily reach 67% fan speed and a temperature of 75C (with P2 348W / 350W).
A normal GPU should run at ~70C. Thus I do not recommend buying that GPU, even though it is faster.
I will test gpuowl after the current work finishes.

Neutron3529 2020-10-21 10:01

[QUOTE=moebius;558434]Please make a short gpuowl benchmark with the exponent 77936867, so that we can directly compare the values of the graphics cards, thank you.
[URL="https://mersenneforum.org/showthread.php?p=558317#post558317"]https://mersenneforum.org/showthread.php?p=558317#post558317
[/URL][/QUOTE]
You're welcome.

I'm posting the first 300k iterations.
One strange thing is that my GPU does not reach the 350W power limit (but it does reach 1965 MHz, which is ~200 MHz higher than in mfaktc).

Viliam Furik 2020-10-21 12:07

[QUOTE=Neutron3529;560502]I bought an RTX 3090.
Here are my results:
[ATTACH]23580[/ATTACH]
(only the last result is uploaded)

I got ~5500 GHz-d/day rather than ~5200.
The exact GPU I bought is `GeForce RTX 3090 VENTUS 3X 24G OC`, which can easily reach 67% fan speed and a temperature of 75C (with P2 348W / 350W).
A normal GPU should run at ~70C. Thus I do not recommend buying that GPU, even though it is faster.
I will test gpuowl after the current work finishes.[/QUOTE]

Why such a poor result? I get about 4900 GHzD/D with the same work. It should be at least 10000 GHzD/D for the 3090, no? It has more than double the FP32 throughput.

It is most probably one of these two reasons:
1. The shared INT32 and FP32 cores don't play nicely with mfaktc - either incompatible code or the cores not fulfilling their promise
2. Memory bottleneck

Either way, I am not satisfied with the result.

Neutron3529 2020-10-21 13:27

[QUOTE=Viliam Furik;560519]Why such a poor result?[/QUOTE]
I tried a machine learning program (mxnet) and found that switching the CUDA architecture from sm_80 to sm_86 makes no difference, although it should give a 2x boost.

Maybe the current CUDA implementation does not really work for sm_86; maybe CUDA 11.2 would help.

James Heinrich 2020-10-21 14:32

[QUOTE=Neutron3529;560502]I bought an RTX 3090. I got ~5500 GHz-d/day[/QUOTE]I'd be curious to know what kind of combined throughput you get when running two instances of mfaktc simultaneously.

James Heinrich 2020-10-21 15:04

[QUOTE=kriesel;560500]These TF were all done with StopAfterFactor=1 (finish the bit level)[/QUOTE]I've looked at the code again and I must be missing something, because I think it should be working as I intended (but clearly it isn't). It's also difficult to test because that section of code only gets processed when a new factor is submitted (my logic works fine in my test environment, but something different is happening on the server). I have added a couple of debug lines that might help me track down the problem. If you see them the next time you (collective "you", anyone reading this) submit a factor, please email me either a copy-paste or a screenshot of the output.


All times are UTC.