mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   LL with OpenCL (https://www.mersenneforum.org/showthread.php?t=18297)

LaurV 2013-10-02 13:32

Second clLucas test [URL="http://www.mersenne.org/report_exponent/?exp_lo=30415969&exp_hi=&B1=Get+status"]finished with success[/URL]. I think this is a wonder, and either clLucas is more stable than CUDALucas, or (and it hurts me to say it! :smile:) AMD cards are more stable than NVIDIA cards. I say that because the computer suffered over 100 restarts and blue screens in this period (not related to the cl-stuff nor to the video card, but to another chain of things this computer is now part of). Honestly I expected a mismatch, but I let it run...

I will dare now to start the third... (@kracker :yucky:)

nucleon 2013-10-06 12:22

I have a 7990 (375W), 2x GPUs @ 1000MHz clock / 1500MHz mem.

Using M30583963 as a test with FFT=2M I get iteration times around 3.9ms on both GPUs.

GPU0
[CODE]Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8968 ms/iter, ETA 33:00:13)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8989 ms/iter, ETA 33:00:39)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8959 ms/iter, ETA 32:58:27)
[/CODE]
GPU1
[CODE]Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8872 ms/iter, ETA 32:55:20)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8944 ms/iter, ETA 32:58:20)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8989 ms/iter, ETA 32:59:58)
[/CODE]
By contrast, on my Titan I get:
[CODE]Iteration 470000 M( 30583963 )C, 0x64208421d1c227ee, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7324 ms/iter, ETA 14:29:22)
Iteration 480000 M( 30583963 )C, 0x29d2686b8bb60915, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:18 real, 1.7334 ms/iter, ETA 14:29:36)
Iteration 490000 M( 30583963 )C, 0x3308faaa69d0eef7, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7254 ms/iter, ETA 14:25:18)
[/CODE]

-- Craig
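As a rough cross-check of the numbers above, the ETA in these progress lines looks like it is simply (exponent minus iterations done) times the current ms/iter. The sketch below assumes that, and assumes the line format is exactly what the samples in this post show; `eta_seconds` and `hms` are made-up helper names, not part of clLucas or CUDALucas.

```python
import re

# Matches the progress lines quoted in this post (format assumed from those samples only).
LINE_RE = re.compile(
    r"Iteration\s+(?P<iter>\d+)\s+M\(\s*(?P<exp>\d+)\s*\)C,.*?"
    r"(?P<ms>\d+\.\d+)\s+ms/iter"
)

def eta_seconds(line):
    """Estimate remaining seconds as (exponent - iterations done) * ms/iter."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("unrecognised progress line")
    remaining_iters = int(m["exp"]) - int(m["iter"])
    return remaining_iters * float(m["ms"]) / 1000.0

def hms(seconds):
    """Format a number of seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

if __name__ == "__main__":
    sample = ("Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, "
              "clLucas v1.01 err = 0.001099 (0:39 real, 3.8959 ms/iter, ETA 32:58:27)")
    print(hms(eta_seconds(sample)))
```

For the sample line this lands within a minute of the ETA the program printed, so the gap between the two cards is really just the gap in ms/iter.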

LaurV 2013-10-06 12:49

That is "normal". Your Titan uses a much shorter (therefore faster) FFT for this exponent. That shorter FFT is not optimized in cl-FFT. People are still working to convince clLucas to deal with non-power-of-two FFT sizes (think of when CUDALucas went from v1.48 to 1.69 and then later to 2.0, last year). For now, the 1M9 FFT is much slower in clLucas; the same test would take 80 hours on your card. Therefore the 2M09 FFT (2097152) is used, which is a power of two. Try testing an exponent on your Titan which uses a comparable FFT size (e.g. a 38M exponent). Then the comparison will be more accurate. Hence [URL="http://www.mersenneforum.org/showthread.php?p=355232#post355232"]posts like this[/URL].

(edit: by the way, my 580 is about 3% faster with 2097152 compared to 1835008; you can try it on your Titan, and you may get a speedup even for that 30M exponent you are using for testing)
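To make the power-of-two point concrete, here is a small sketch that factors the two FFT lengths from the logs above. It only shows which lengths are pure powers of two; it says nothing about how clFFT or cuFFT actually pick their kernels. `factor_small` and `is_power_of_two` are names made up for this illustration.

```python
def factor_small(n, primes=(2, 3, 5, 7)):
    """Split n over a few small primes; return ({prime: exponent}, leftover cofactor)."""
    exps = {}
    for p in primes:
        while n % p == 0:
            n //= p
            exps[p] = exps.get(p, 0) + 1
    return exps, n

def is_power_of_two(n):
    """True when n is 2**k for some k >= 0."""
    return n > 0 and n & (n - 1) == 0

if __name__ == "__main__":
    # FFT lengths taken from the clLucas and CUDALucas log lines in this thread.
    for length in (2097152, 1835008):
        print(length, factor_small(length), is_power_of_two(length))
```

So 2097152 is 2^21, while 1835008 is 7 * 2^18, i.e. it needs a radix-7 step somewhere, which is the case clLucas does not handle well yet.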

kracker 2013-10-06 16:40

[QUOTE=nucleon;355405]I have 7990 (375W) 2xGPUs @1000MHz clock/1500MHz mem.

Using M30583963 as a test with FFT=2M I get iteration times around 3.9ms on both GPUs.

GPU0
[CODE]Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8968 ms/iter, ETA 33:00:13)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8989 ms/iter, ETA 33:00:39)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8959 ms/iter, ETA 32:58:27)
[/CODE]
GPU1
[CODE]Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8872 ms/iter, ETA 32:55:20)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8944 ms/iter, ETA 32:58:20)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8989 ms/iter, ETA 32:59:58)
[/CODE]
Contrasting to my titan I get:
[CODE]Iteration 470000 M( 30583963 )C, 0x64208421d1c227ee, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7324 ms/iter, ETA 14:29:22)
Iteration 480000 M( 30583963 )C, 0x29d2686b8bb60915, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:18 real, 1.7334 ms/iter, ETA 14:29:36)
Iteration 490000 M( 30583963 )C, 0x3308faaa69d0eef7, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7254 ms/iter, ETA 14:25:18)
[/CODE]
-- Craig[/QUOTE]

Try using a greater -c and -aggressive (if you aren't already) and see if that changes anything.

@LaurV: I'll ask again, try pushing your memory clock up! :razz:

nucleon 2013-10-07 01:04

Aaah, the power of 2 FFT issue.

Ok, next time I get a chance to experiment I'll play with something more suitable.

-- Craig

Robish 2013-10-08 15:07

[QUOTE=kracker;354404]4 DC's finished here :smile:

[code]
M( [URL="http://mersenne.org/report_exponent/?exp_lo=30766511&exp_hi=&B1=Get+status"]30766511[/URL] )C, 0x1ff14c8237b5e935, n = 2097152, clLucas v1.00
M( [URL="http://mersenne.org/report_exponent/?exp_lo=30822937&exp_hi=&B1=Get+status"]30822937[/URL] )C, 0x1c656da41a256c21, n = 2097152, clLucas v1.01
M( [URL="http://mersenne.org/report_exponent/?exp_lo=30888499&exp_hi=&B1=Get+status"]30888499[/URL] )C, 0xc296d9ac47d90339, n = 2097152, clLucas v1.01
M( [URL="http://mersenne.org/report_exponent/?exp_lo=30976273&exp_hi=&B1=Get+status"]30976273[/URL] )C, 0x6c0367ea40d74647, n = 2097152, clLucas v1.01
[/code][/QUOTE]

My first CLLucas ;-)

M( 58191149 )C, 0x9108992abb23c5d1, n = 4194304, clLucas v1.01

10.5 days on 7870 with aggressive

More on the way....

kracker 2013-10-08 15:39

[QUOTE=Robish;355611]My first CLLucas ;-)

M( 58191149 )C, 0x9108992abb23c5d1, n = 4194304, clLucas v1.01

10.5 days on 7870 with aggressive

More on the way....[/QUOTE]

Nice. :smile: I may put that on Prime95 for DCing, just for fun...

Robish 2013-10-08 16:53

[QUOTE=kracker;355613]Nice. :smile: I may put that on Prime95 for DCing, just for fun...[/QUOTE]

Cool,

Next due in 14hrs, one in 24hrs and one in 36hrs.

Just bigger numbers take longer..... :-)

On a different subject, have many of you tried 100-million-digit attempts?

Trying with CUDALucas at the mo.

1st attempt was reading 4500 hrs (190 days) on 20971520 (recommended by someone ;-( ), but after following this thread I tried 2097152 and (guess what) it dropped to 19 days on a GTX 690.

Should have results in two days time...

Just wondering if many were trying them yet?

Cheers

Rob.

Prime95 2013-10-08 17:52

[QUOTE=Robish;355619]
On a different subject, have many of you tried 100-million-digit attempts?

1st attempt was reading 4500 hrs (190 days) on 20971520 (recommended by someone ;-( ), but after following this thread I tried 2097152 and (guess what) it dropped to 19 days on a GTX 690.

Should have results in two days time...[/QUOTE]

You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.
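A back-of-the-envelope sketch of why the FFT length matters here: the exponent's bits get spread over the FFT words, so each word carries roughly p / n bits. The 19 bits per word ceiling used below is only a rough illustrative assumption (the real limit depends on the program and the FFT size), and reading "18 million" as 18 * 2^20 is also an assumption; the function names are made up for this example.

```python
import math

DOUBLE_MANTISSA_BITS = 53      # IEEE 754 double precision
ASSUMED_MAX_BITS = 19.0        # rough rule of thumb, NOT a figure from any program

def min_exponent_for_digits(digits):
    """Smallest p such that 2**p - 1 has at least `digits` decimal digits."""
    # 2**p - 1 has floor(p * log10(2)) + 1 decimal digits.
    return math.ceil((digits - 1) / math.log10(2))

def bits_per_word(exponent, fft_length):
    """Average number of bits of the residue stored in each FFT word."""
    return exponent / fft_length

if __name__ == "__main__":
    p = min_exponent_for_digits(100_000_000)
    # 2097152 and 20971520 are the lengths mentioned in the thread;
    # 18 * 2**20 is one possible reading of "18 million".
    for n in (2097152, 18 * 2**20, 20971520):
        b = bits_per_word(p, n)
        verdict = "plausible" if b <= ASSUMED_MAX_BITS else "too many bits per word"
        print(f"n = {n}: {b:.2f} bits/word ({verdict})")
```

With a 2M FFT each word would have to carry well over 100 bits, more than a double's 53-bit mantissa can even store, so rounding errors swamp the result. For comparison, the 30.5M exponent earlier in the thread works out to about 16.7 bits per word at n = 1835008.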

kracker 2013-10-08 18:33

[QUOTE=Prime95;355620]You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.[/QUOTE]
I'm surprised CULu allowed it?

Robish 2013-10-08 19:46

[QUOTE=Prime95;355620]You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.[/QUOTE]


Really? Can you explain why, or give a link to somewhere/a thread that covers this, please? It all looks like it's going fine at the moment anyway......??

