mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GpuOwl (https://www.mersenneforum.org/forumdisplay.php?f=171)
-   -   gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

preda 2020-03-17 10:01

[QUOTE=Prime95;539882]Anyone get rocm 3.1 to work on Ubuntu 19.04? I've tried 3 times without success.[/QUOTE]

I would expect Ubuntu 19.04 to be pretty similar to 19.10 from ROCm's POV (the kernel version would matter more). Which step is failing?

I mentioned here [url]https://github.com/RadeonOpenCompute/ROCm/issues/977[/url] that I had to install libncurses5 too.

Prime95 2020-03-17 10:47

Using not yet committed code:

Rocm 2.10, sclk 4, mem 1200, FFT 5M; 662us/it.
Running 2 instances: 604us/it (200W measured by rocm-smi)

I love this GPU.

axn 2020-03-17 12:45

[QUOTE=Prime95;539912]Using not yet committed code:

Rocm 2.10, sclk 4, mem 1200, FFT 5M; 662us/it.
Running 2 instances: 604us/it (200W measured by rocm-smi)

I love this GPU.[/QUOTE]

How many GHzD/d does that translate to? 400-ish? :shock:

EDIT:- Probably more like high 400, low 500!

Prime95 2020-03-17 16:14

[QUOTE=axn;539927]How many GHzD/d does that translate to? 400-ish? :shock:

EDIT:- Probably more like high 400, low 500![/QUOTE]

510 PRP-GHzD/d

kriesel 2020-03-17 16:35

[QUOTE=Prime95;539955]510 PRP-GHzD/d[/QUOTE]where did you buy that beauty, and whose brand is it? (Just completed the RMA/refund process on my second one.)

Prime95 2020-03-17 16:58

[QUOTE=kriesel;539958]where did you buy that beauty, and whose brand is it? (Just completed the RMA/refund process on my second one.)[/QUOTE]

All GPUs except one range from 602us to 615us. The one outlier is 630us, running in an i7-860 system (not a Sandy Bridge as I originally reported), which does not have PCIe 3.0.

Prime95 2020-03-17 17:03

[QUOTE=Prime95;539955]510 PRP-GHzD/d[/QUOTE]

20 GPUs = 1 Curtis Cooper

Uncwilly 2020-03-17 19:37

[QUOTE=Prime95;539960]20 GPUs = 1 Curtis Cooper[/QUOTE]
Is that the next unit like a P90 year?

ewmayer 2020-03-17 19:45

[QUOTE=Prime95;539912]Using not yet committed code:

Rocm 2.10, sclk 4, mem 1200, FFT 5M; 662us/it.
Running 2 instances: 604us/it (200W measured by rocm-smi)

I love this GPU.[/QUOTE]

Nice - how much % gain is that over current commit, and did you also do timings @5632K?

And, have you tried running > 2 instances to see if there is any further marginal throughput gain to be had that way?

[b]Edit:[/b] Just tried the latter experiment - but not using George's uncommitted code, obviously - on my own machine. Here are the timing/throughput figures for 1-3 workers, all @5632K FFT, sclk = 5:

1: 754 us/iter => 1362 iter/sec
2: 1405 us/iter => 1423 iter/sec
3: 2174 us/iter => 1380 iter/sec

So, deterioration above 2 workers.
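
For anyone who wants to redo this comparison with their own timings, here is a minimal sketch of the arithmetic, assuming each posted time is the per-worker us/iter and that all workers run at the same rate. The function and variable names are made up for illustration. Recomputing from the posted timings gives about 1326 iter/sec for one worker rather than 1362, so one of those two figures may contain a typo (734 us/iter would give 1362); the ranking is unchanged either way.

```python
# Per-worker timings in microseconds per iteration, copied from the post above.
timings_us = {1: 754, 2: 1405, 3: 2174}


def aggregate_throughput(workers: int, us_per_iter: float) -> float:
    """Total iterations per second across all workers.

    Assumes every worker runs at the same per-iteration time.
    """
    return workers * 1_000_000 / us_per_iter


# Aggregate throughput for each worker count.
throughput = {w: aggregate_throughput(w, t) for w, t in timings_us.items()}

# Worker count with the highest total throughput.
best = max(throughput, key=throughput.get)

for w, rate in throughput.items():
    print(f"{w} worker(s): {rate:.0f} iter/sec")
print(f"best: {best} worker(s)")
```

With these inputs the maximum falls at 2 workers, which matches the conclusion above.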

PhilF 2020-03-17 21:53

[QUOTE=ewmayer;539973]Nice - how much % gain is that over current commit, and did you also do timings @5632K?

And, have you tried running > 2 instances to see if there is any further marginal throughput gain to be had that way?

[b]Edit:[/b] Just tried the latter experiment - but not using George's uncommitted code, obviously - on my own machine. Here are the timing/throughput figures for 1-3 workers, all @5632K FFT, sclk = 5:

1: 754 us/iter => 1362 iter/sec
2: 1405 us/iter => 1423 iter/sec
3: 2174 us/iter => 1380 iter/sec

So, deterioration above 2 workers.[/QUOTE]
5632K is the FFT size I can relate to also.

5632K, sclk=5, 1000MHz memclk, 185W as measured by rocm, one worker, older version of gpuowl:

872 us/iter

ewmayer 2020-03-17 22:16

[QUOTE=PhilF;539983]5632K is the FFT size I can relate to also.

5632K, sclk=5, 1000MHz memclk, 185W as measured by rocm, one worker, older version of gpuowl:

872 us/iter[/QUOTE]

Your mem-downclock is likely the reason you both run slower and at significantly lower power than I, at the same sclk and 1-worker setting. But why not fire up a second worker?


All times are UTC.