mersenneforum.org  

Old 2013-10-17, 18:33   #254
Robish
 
"Rob Gahan"
Aug 2013
Ireland

2²·3² Posts

Quote:
Originally Posted by LaurV View Post
BTW, a supermod could mask the residue two posts above... Just in case...

@Robish: please do not post full residues of un-verified LL tests, some "credit hunters" will be tempted to "verify" them using your residue, and 20 years later we may find out we missed a prime, in case that residue is wrong.

Ok will (not) do ;-)

Last fiddled with by Robish on 2013-10-17 at 18:34
Old 2013-10-18, 15:23   #255
nucleon
 
Mar 2003
Melbourne

5·103 Posts

Quote:
Originally Posted by nucleon View Post
Aaah, the power of 2 FFT issue.

Ok, next time I get a chance to experiment I'll play with something more suitable.

-- Craig
Ok, only a fortnight late. 7990@1GHz/1.5GHz mem clk

GPU0:
Code:
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8668 ms/iter, ETA 40:48:19)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8636 ms/iter, ETA 40:45:38)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8766 ms/iter, ETA 40:53:16)

GPU1:

Code:
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, clLucas v1.01 err = 0.1406 (0:38 real, 3.8664 ms/iter, ETA 40:48:05)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8693 ms/iter, ETA 40:49:15)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8728 ms/iter, ETA 40:50:51)
vs Titan:
Code:
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, CUDALucas v2.03 err = 0.1953 (0:18 real, 1.8289 ms/iter, ETA 19:17:59)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, CUDALucas v2.03 err = 0.1953 (0:18 real, 1.8124 ms/iter, ETA 19:07:16)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, CUDALucas v2.03 err = 0.2031 (0:18 real, 1.8168 ms/iter, ETA 19:09:45)
The Titan is more efficient (250 W vs 375 W), but I guess it is still early days for the OpenCL code.

-- Craig
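The efficiency comparison above can be put into numbers with a rough sketch. It assumes each board draws exactly its rated power (250 W for the Titan, 375 W for the HD 7990), which is an assumption and not a measurement, and that both GPUs of the 7990 run their own test at the same time. The function name and parameters are made up for illustration.

```python
# Energy per LL iteration, assuming each board draws its rated power
# (250 W Titan, 375 W HD 7990). Rated power is an assumption here,
# not a measured wall draw.

def joules_per_iteration(board_watts, ms_per_iter, gpus_per_board=1):
    """Energy per iteration when every GPU on the board runs its own test."""
    iters_per_second = gpus_per_board * 1000.0 / ms_per_iter
    return board_watts / iters_per_second

# Timings taken from the iteration-10000 lines posted above.
titan = joules_per_iteration(250, 1.8289)
hd7990 = joules_per_iteration(375, 3.8668, gpus_per_board=2)

print(f"Titan:   {titan:.3f} J/iter")
print(f"HD 7990: {hd7990:.3f} J/iter")
```

Under those assumptions the Titan comes out at roughly 0.46 J per iteration and the 7990 at roughly 0.73 J, so the gap is larger than the raw 250 W vs 375 W figures suggest.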
Old 2013-10-18, 17:39   #256
Karl M Johnson
 
Mar 2010

411₁₀ Posts

Quote:
Originally Posted by nucleon View Post
Ok, only a fortnight late. 7990@1GHz/1.5GHz mem clk

GPU0:
Code:
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8668 ms/iter, ETA 40:48:19)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8636 ms/iter, ETA 40:45:38)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, clLucas v1.01 err = 0.1406 (0:39 real, 3.8766 ms/iter, ETA 40:53:16)
Based on that data, the R9 290X may very roughly yield between 3.514 and 2.811 ms per LL iteration for that exponent with that FFT length (at default and boost clocks, respectively).
Old 2013-10-18, 19:46   #257
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Perhaps. In gaming it is neck and neck with the Titan, based on early benchmarks from Tom's and AnandTech, but compute is what will matter... Also, it has a 512-bit memory bus.
Old 2013-10-18, 20:00   #258
Manpowre
 
"Svein Johansen"
May 2013
Norway

3×67 Posts

I get:

GTX 590 board:
Code:
Starting M38000009 fft length = 2097152
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, CUDALucas v2.03
 err = 0.1914 (0:40 real, 3.9348 ms/iter, ETA 41:31:21)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, CUDALucas v2.03
 err = 0.1914 (0:38 real, 3.8383 ms/iter, ETA 40:29:37)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, CUDALucas v2.03
 err = 0.1914 (0:39 real, 3.8376 ms/iter, ETA 40:28:33)
GTX Titan:
Code:
Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, CUDALucas v2.03
 err = 0.2090 (0:19 real, 1.8580 ms/iter, ETA 19:36:26)
Iteration 20000 M( 38000009 )C, 0xd91fe21f272e5099, n = 2097152, CUDALucas v2.03
 err = 0.2090 (0:17 real, 1.7758 ms/iter, ETA 18:44:04)
Iteration 30000 M( 38000009 )C, 0x7ec1302ed5173c26, n = 2097152, CUDALucas v2.03
 err = 0.2090 (0:18 real, 1.7750 ms/iter, ETA 18:43:17)
Iteration 40000 M( 38000009 )C, 0x91312c6488821eac, n = 2097152, CUDALucas v2.03
 err = 0.2090 (0:18 real, 1.7756 ms/iter, ETA 18:43:23)
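Since the same status-line layout shows up from both clLucas and CUDALucas in this thread, here is a small sketch for parsing those lines, checking that two cards agree on the interim residue, and reproducing the ETA. The regular expression is written only against the lines pasted here, so other program versions may need a different pattern. The ETA rule (remaining iterations taken as exponent minus current iteration, times ms/iter) is inferred from the logged numbers, not from the program source. All function names are hypothetical.

```python
import re

# Pattern for the status lines pasted in this thread (clLucas v1.01 and
# CUDALucas v2.03). Other versions may print a different layout.
LINE_RE = re.compile(
    r"Iteration (?P<iteration>\d+) M\( (?P<exponent>\d+) \)C, "
    r"(?P<residue>0x[0-9a-f]{16}), n = (?P<fft>\d+), "
    r"(?P<program>\S+) v(?P<version>\S+) err = (?P<err>[\d.]+) "
    r"\((?P<real>[\d:]+) real, (?P<ms>[\d.]+) ms/iter, ETA (?P<eta>[\d:]+)\)"
)

def parse_status(text):
    """Parse one status line; wrapped lines are flattened first."""
    flat = " ".join(text.split())
    match = LINE_RE.search(flat)
    if match is None:
        raise ValueError("not a recognised status line")
    status = match.groupdict()
    for key in ("iteration", "exponent", "fft"):
        status[key] = int(status[key])
    for key in ("err", "ms"):
        status[key] = float(status[key])
    return status

def hms_to_seconds(stamp):
    """Convert 'h:mm:ss' (or 'm:ss') to seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total

def estimated_eta_seconds(status):
    """Inferred rule: (exponent - iteration) iterations left at ms/iter each."""
    remaining = status["exponent"] - status["iteration"]
    return remaining * status["ms"] / 1000.0

hd7990 = parse_status(
    "Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, "
    "clLucas v1.01 err = 0.1406 (0:39 real, 3.8668 ms/iter, ETA 40:48:19)"
)
gtx590 = parse_status(
    "Iteration 10000 M( 38000009 )C, 0xfd9116e3760e4571, n = 2097152, CUDALucas v2.03\n"
    " err = 0.1914 (0:40 real, 3.9348 ms/iter, ETA 41:31:21)"
)

print("residues agree:", hd7990["residue"] == gtx590["residue"])
for name, status in (("HD 7990", hd7990), ("GTX 590", gtx590)):
    drift = estimated_eta_seconds(status) - hms_to_seconds(status["eta"])
    print(f"{name}: recomputed ETA differs from logged ETA by {drift:+.1f} s")
```

The recomputed ETAs land within a few seconds of the logged ones. The small drift is expected because the printed ms/iter value is rounded.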
Old 2013-10-18, 20:17   #259
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2168₁₀ Posts

Interesting. So I guess 590 and 7990 almost tie.

Just out of curiosity, what are the Titan timings without the DP switch?
Old 2013-10-18, 20:27   #260
Manpowre
 
"Svein Johansen"
May 2013
Norway

3×67 Posts

Quote:
Originally Posted by kracker View Post
Interesting. So I guess 590 and 7990 almost tie.

Just out of curiosity, what are the Titan timings without the DP switch?
How do you get those kinds of results without the DP switch?
My switch is on; I just logged into the node to check the setting.

Sorry for hijacking the CL thread, but when I calculate, I get 58 GHz-days per day of LL testing with one Titan at these timings on 73M exponents. That matches the statistics for that card, maybe a little better, but then I tweaked a little.

Last fiddled with by Manpowre on 2013-10-18 at 20:36
Old 2013-10-19, 07:18   #261
Manpowre
 
"Svein Johansen"
May 2013
Norway

3·67 Posts

I updated the Titan thread with some timings for the Titan; we don't need to discuss that here in the LL thread :)
Old 2013-10-20, 15:32   #262
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Some timings for no reason really
2M FFT
HD-
7750: 15 ms
7770: 12 ms
7850: 8? ms
7870: 7 ms
7950: 4.2 ms
7970: 3.7 ms
7990: 3.9 ms(x2)
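To make those per-iteration timings easier to compare, they can be converted into an approximate wall time for one full LL test. This sketch assumes the exponent is 38000009 (the one benchmarked earlier in the thread; the list itself only says 2M FFT) and that a test takes about as many iterations as the exponent. The "8?" entry for the 7850 is taken as 8 ms. Names are illustrative only.

```python
# 2M-FFT timings from the list above, in ms per iteration.
# The 7850 figure was posted as "8?", so treat it as uncertain.
TIMINGS_MS = {
    "7750": 15.0,
    "7770": 12.0,
    "7850": 8.0,
    "7870": 7.0,
    "7950": 4.2,
    "7970": 3.7,
    "7990 (per GPU)": 3.9,
}

# Assumption: the exponent benchmarked earlier in the thread.
EXPONENT = 38000009

def full_test_hours(ms_per_iter, exponent=EXPONENT):
    """Approximate wall time for one LL test: about `exponent` iterations."""
    return exponent * ms_per_iter / 1000.0 / 3600.0

for card, ms in TIMINGS_MS.items():
    print(f"HD {card}: {full_test_hours(ms):6.1f} h")
```

Note that the 7990 runs two such tests in parallel, one per GPU, so its throughput per board is double what the per-GPU time suggests.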
Old 2013-10-20, 17:02   #263
Manpowre
 
"Svein Johansen"
May 2013
Norway

201₁₀ Posts

Quote:
Originally Posted by kracker View Post
Some timings for no reason really
2M FFT
HD-
7750: 15 ms
7770: 12 ms
7850: 8? ms
7870: 7 ms
7950: 4.2 ms
7970: 3.7 ms
7990: 3.9 ms(x2)
Very interesting, actually.
It kind of shows us where to expect the R9 290X. I guess 2.5-2.8 ms with the same FFT length test?
Old 2013-10-21, 13:34   #264
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

7²×197 Posts

Quote:
Originally Posted by kracker View Post
Some timings for no reason really
2M FFT
HD-
7750: 15 ms
7770: 12 ms
7850: 8? ms
7870: 7 ms
7950: 4.2 ms
7970: 3.7 ms
7990: 3.9 ms(x2)
Confirmed for the 7990 and 7970, with about 5% longer for me. As far as I know, you push the memory clock, which I can't do: the cards don't seem to be that stable, they are air cooled, and it is still hot here during the day, when nobody is home and no air conditioning is running.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.