mersenneforum.org  

Old 2013-10-02, 13:32   #221
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

7^2×197 Posts
Default

Second clLucas test finished with success. I think this is a wonder, and either clLucas is more stable than CUDALucas, or (and it hurts me to say it!) AMD cards are more stable than Nvidia cards. I say that because the computer suffered over 100 restarts and blue screens in this period (not related to the cl-stuff or the video card, but to another chain of things this computer is now part of). Honestly, I expected a mismatch, but I let it run...

I will dare now to start the third... (@kracker )
Old 2013-10-06, 12:22   #222
nucleon
 
Mar 2003
Melbourne

203_16 Posts
Default

I have a 7990 (375W), 2x GPUs @ 1000MHz clock/1500MHz mem.

Using M30583963 as a test with FFT=2M I get iteration times around 3.9ms on both GPUs.

GPU0
Code:
Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8968 ms/iter, ETA 33:00:13)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8989 ms/iter, ETA 33:00:39)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.001099 (0:39 real, 3.8959 ms/iter, ETA 32:58:27)
GPU1
Code:
Iteration 90000 M( 30583963 )C, 0x1bbfddfc1ddbe19f, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8872 ms/iter, ETA 32:55:20)
Iteration 100000 M( 30583963 )C, 0xde1a5280cc5ab5cc, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8944 ms/iter, ETA 32:58:20)
Iteration 110000 M( 30583963 )C, 0x5cfaabf2bb0e0487, n = 2097152, clLucas v1.01 err = 0.0009766 (0:39 real, 3.8989 ms/iter, ETA 32:59:58)
In contrast, on my Titan I get:
Code:
Iteration 470000 M( 30583963 )C, 0x64208421d1c227ee, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7324 ms/iter, ETA 14:29:22)
Iteration 480000 M( 30583963 )C, 0x29d2686b8bb60915, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:18 real, 1.7334 ms/iter, ETA 14:29:36)
Iteration 490000 M( 30583963 )C, 0x3308faaa69d0eef7, n = 1835008, CUDALucas v2.03 err = 0.0283 (0:17 real, 1.7254 ms/iter, ETA 14:25:18)
-- Craig
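The ETA column in these logs can be roughly reproduced from the ms/iter figure. The sketch below assumes the test runs for about as many iterations as the exponent and that the rate stays constant; the function names are made up for illustration and are not part of clLucas or CUDALucas.

```python
def eta_seconds(exponent, iteration, ms_per_iter):
    """Estimate remaining seconds, assuming roughly `exponent` iterations in total."""
    remaining = exponent - iteration
    return remaining * ms_per_iter / 1000.0


def format_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Last log line of each card quoted above (iteration, ms/iter).
    print("7990 :", format_hms(eta_seconds(30583963, 110000, 3.8959)))
    print("Titan:", format_hms(eta_seconds(30583963, 490000, 1.7254)))
```

Both estimates land within a minute of the logged ETAs; the small difference is presumably down to rounding of the displayed ms/iter value.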
Old 2013-10-06, 12:49   #223
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

7^2·197 Posts
Default

That is "normal". Your Titan uses a much shorter (therefore faster) FFT for this exponent. That shorter FFT is not optimized in cl-FFT. People are still working to convince clLucas to deal with non-power-of-two FFT sizes (think of when cudaLucas switched from v1.48 to 1.69, then later to 2.0, last year). For now, the 1M9 FFT is much slower in clLucas: the same test would take 80 hours on your card. Therefore the 2M09 FFT, which is a power of two, is used. Try testing an exponent on your Titan which uses a comparable FFT size (e.g. a 38M exponent). Then the comparison will be more accurate. Therefore posts like this.

(edit: by the way, my 580 is about 3% faster with 2097152 compared to 1835008; you can try it on your Titan, and you may get a speedup even for that 30M exponent you are using for testing)
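To see which of the FFT lengths in these logs are powers of two, one can strip out the factors of 2. This is only an illustrative sketch; the helper names are hypothetical and nothing here comes from clLucas or CUDALucas themselves.

```python
def split_power_of_two(n):
    """Return (odd_part, k) such that n == odd_part * 2**k."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return n, k


def is_power_of_two(n):
    """True when n has no odd factor other than 1."""
    return n > 0 and n & (n - 1) == 0


if __name__ == "__main__":
    for length in (2097152, 1835008, 4194304):
        odd, k = split_power_of_two(length)
        print(f"{length} = {odd} * 2^{k}, power of two: {is_power_of_two(length)}")
```

So 2097152 is 2^21, while the 1835008 length used on the Titan is 7 * 2^18, which is the kind of size clLucas does not yet handle well.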

Last fiddled with by LaurV on 2013-10-06 at 13:01
Old 2013-10-06, 16:40   #224
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2^3·271 Posts
Default

Quote:
Originally Posted by nucleon View Post
I have a 7990 (375W), 2x GPUs @ 1000MHz clock/1500MHz mem.

Using M30583963 as a test with FFT=2M I get iteration times around 3.9ms on both GPUs.

Try using a greater -c and -aggressive (if not already) and see if that changes anything.

@LaurV: I'll ask again, try pushing your memory clock up!
Old 2013-10-07, 01:04   #225
nucleon
 
Mar 2003
Melbourne

1000000011_2 Posts
Default

Aaah, the power of 2 FFT issue.

Ok, next time I get a chance to experiment I'll play with something more suitable.

-- Craig
Old 2013-10-08, 15:07   #226
Robish
 
"Rob Gahan"
Aug 2013
Ireland

100100_2 Posts
Smile

Quote:
Originally Posted by kracker View Post
4 DC's finished here

Code:
M( 30766511 )C, 0x1ff14c8237b5e935, n = 2097152, clLucas v1.00
M( 30822937 )C, 0x1c656da41a256c21, n = 2097152, clLucas v1.01
M( 30888499 )C, 0xc296d9ac47d90339, n = 2097152, clLucas v1.01
M( 30976273 )C, 0x6c0367ea40d74647, n = 2097152, clLucas v1.01
My first CLLucas ;-)

M( 58191149 )C, 0x9108992abb23c5d1, n = 4194304, clLucas v1.01

10.5 days on 7870 with aggressive

More on the way....
Old 2013-10-08, 15:39   #227
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2168_10 Posts
Default

Quote:
Originally Posted by Robish View Post
My first CLLucas ;-)

M( 58191149 )C, 0x9108992abb23c5d1, n = 4194304, clLucas v1.01

10.5 days on 7870 with aggressive

More on the way....
Nice. I may put that on Prime95 for DCing, just for fun...
Old 2013-10-08, 16:53   #228
Robish
 
"Rob Gahan"
Aug 2013
Ireland

100100_2 Posts
Smile

Quote:
Originally Posted by kracker View Post
Nice. I may put that on Prime95 for DCing, just for fun...
Cool,

Next due in 14hrs, one in 24hrs and one in 36hrs.

Just bigger numbers take longer..... :-)

On a different subject, have many of you tried 100 million digit attempts?

Trying with Cudalucas at the mo.

1st attempt was reading 4500 hrs (190 days) on 20971520 (recommended by someone ;-( ) but after following this thread I tried 2097152 and (guess what) it dropped to 19 days on a GTX 690.

Should have results in two days time...

Just wondering if many were trying them yet?

Cheers

Rob.
Old 2013-10-08, 17:52   #229
Prime95
P90 years forever!
 
Aug 2002
Yeehaw, FL

7,537 Posts
Default

Quote:
Originally Posted by Robish View Post
On a different subject, have many of you tried 100 million digit attempts?

1st attempt was reading 4500 hrs (190 days) on 20971520 (recommended by someone ;-( ) but after following this thread I tried 2097152 and (guess what) it dropped to 19 days on a GTX 690.

Should have results in two days time...
You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.
Old 2013-10-08, 18:33   #230
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

878_16 Posts
Default

Quote:
Originally Posted by Prime95 View Post
You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.
I'm surprised CULu allowed it?
Old 2013-10-08, 19:46   #231
Robish
 
"Rob Gahan"
Aug 2013
Ireland

2^2×3^2 Posts
Unhappy

Quote:
Originally Posted by Prime95 View Post
You cannot test a 100M digit number with an FFT length of 2 million. You may as well throw that result away. You will need an FFT length of at least 18 million.

Really? Can you explain why, or link to somewhere (a thread) that covers this, please? It all looks like it's going fine at the moment anyway......??
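A rough way to see Prime95's point is to look at how many bits of the number each FFT element has to carry. The sketch below is only an illustration, not how CUDALucas or clLucas pick FFT lengths: the helper names are made up, "18 million" is taken to mean 18 * 2^20, and the ceiling of about 18 bits per word is an assumption inferred from the figures in this thread (the real limit depends on the implementation and the FFT size).

```python
import math

# Assumed ceiling, inferred from this thread's numbers; not taken from any program.
ASSUMED_MAX_BITS_PER_WORD = 18.0


def min_exponent_for_digits(digits):
    """Smallest p (not necessarily prime) for which 2**p - 1 has at least `digits` digits."""
    return math.ceil((digits - 1) / math.log10(2))


def bits_per_word(exponent, fft_length):
    """Average number of bits of the Mersenne number stored per FFT element."""
    return exponent / fft_length


def looks_feasible(exponent, fft_length):
    """True when the load per element stays under the assumed ceiling."""
    return bits_per_word(exponent, fft_length) <= ASSUMED_MAX_BITS_PER_WORD


if __name__ == "__main__":
    p = min_exponent_for_digits(100_000_000)
    for n in (2_097_152, 18 * 2**20, 20_971_520):
        print(f"p={p} n={n}: {bits_per_word(p, n):.2f} bits/word, "
              f"feasible={looks_feasible(p, n)}")
    # For comparison, the 30M double-check from earlier in the thread.
    print(f"{bits_per_word(30583963, 2_097_152):.2f} bits/word at n=2097152")
```

With a 2097152-element FFT each element would have to hold well over 100 bits of a 100M digit number, far more than a double-precision value can represent exactly, so the rounding step cannot recover the correct integers. The 30M exponents tested earlier in the thread need only about 14.6 bits per element at the same length.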

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.