mersenneforum.org  

2013-11-18, 22:46   #2003
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

473₈ Posts

Quote:
Originally Posted by flashjh View Post
The new code is compiled and the Windows binaries (release/debug) are posted on SourceForge.

@owftheevil: The -memtest functions, but something isn't right with the iterations. For example 56 1000 1 on my 580 says ETA 12181:18:07

I posted a working memtest.zip to sourceforge

EDIT: Please only use 2.05 Beta .exe files for testing the code. It is not ready for production use yet. Thanks!
That does seem a bit slow.

Usage:

Code:
./CUDALucas -memtest k n
where k * 25 MB of memory are tested, and n * 10000 iterations are done for each of 5 data types at each of the k positions. So with k = 56 and n = 1000, you are reading 75 MB and writing 25 MB 2.8 billion times. Only ~39 GB/s bandwidth on the reads. I'll take a look.
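The arithmetic in that usage note can be checked with a short Python sketch. This is not CUDALucas code; `memtest_workload` and the constants are hypothetical names for the figures given above (25 MB per position, 5 data types, n * 10000 iterations, 75 MB read and 25 MB written per iteration).

```python
# Hypothetical constants taken from the usage description above.
CHUNK_MB = 25             # each of the k positions covers 25 MB
DATA_TYPES = 5            # 5 data types are tested at each position
ITERATIONS_PER_N = 10000  # the second parameter is multiplied by 10000
READ_MB_PER_ITER = 75     # MB read per iteration
WRITE_MB_PER_ITER = 25    # MB written per iteration


def memtest_workload(k, n):
    """Return the size of the work implied by `-memtest k n`."""
    total_iterations = k * DATA_TYPES * n * ITERATIONS_PER_N
    return {
        "memory_tested_mb": k * CHUNK_MB,
        "total_iterations": total_iterations,
        "read_mb": total_iterations * READ_MB_PER_ITER,
        "written_mb": total_iterations * WRITE_MB_PER_ITER,
    }


if __name__ == "__main__":
    work = memtest_workload(56, 1000)
    print(work["memory_tested_mb"], "MB tested")
    print(work["total_iterations"], "iterations in total")
```

For k = 56 and n = 1000 this gives 1400 MB under test and 2.8 billion iterations, matching the figure quoted above.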

2013-11-18, 23:21   #2004
flashjh
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

That same test before only took a few seconds.

2013-11-19, 08:39   #2005
Manpowre
 
"Svein Johansen"
May 2013
Norway

3·67 Posts

Quote:
Originally Posted by owftheevil View Post
That does seem a bit slow.

Usage:

Code:
./CUDALucas -memtest k n
where k * 25 MB of memory are tested, and n * 10000 iterations are done for each of 5 data types at each of the k positions. So with k = 56 and n = 1000, you are reading 75 MB and writing 25 MB 2.8 billion times. Only ~39 GB/s bandwidth on the reads. I'll take a look.
Hmm, so it is x 10k iterations. That's different from memtest, right?
The second parameter was not multiplied by 10k there?

2013-11-19, 15:35   #2006
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

100111011₂ Posts

Looked at the ETA code for memtest last night. Didn't find anything wrong, but changed the formula to smooth out the results. It's working as expected on a 570 and a 560 Ti. New code is up at SourceForge.

Code:
./CUDALucas -memtest 35 10
gives an ETA of just over 4 hours on the 560 Ti.

Code:
./CUDALucas -memtest 28 2000
gives an ETA of just over 1200 hours on the 570 while it is simultaneously running stage 1 of CUDAPm1.
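The thread does not show the new ETA formula. One common way to smooth an ETA is an exponential moving average of the time per iteration, sketched below in Python. This is an assumption for illustration, not necessarily what CUDALucas does; `SmoothedEta` and `alpha` are hypothetical names.

```python
class SmoothedEta:
    """ETA estimate based on an exponential moving average (EMA)
    of seconds per iteration. Illustrative only."""

    def __init__(self, total_iterations, alpha=0.2):
        self.total = total_iterations
        self.alpha = alpha   # weight given to the newest timing sample
        self.rate = None     # smoothed seconds per iteration
        self.done = 0        # iterations completed so far

    def update(self, batch_iterations, batch_seconds):
        """Record one timed batch and return the remaining seconds."""
        sample = batch_seconds / batch_iterations
        if self.rate is None:
            # The first sample seeds the average.
            self.rate = sample
        else:
            self.rate = self.alpha * sample + (1 - self.alpha) * self.rate
        self.done += batch_iterations
        return (self.total - self.done) * self.rate
```

A smaller `alpha` reacts more slowly to a single slow or fast batch, which keeps the displayed ETA from jumping around between screen updates.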

2013-11-19, 15:38   #2007
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

13B₁₆ Posts

Those of you having the fft too big problem while running the self check, could you please post the *fft.txt files, at least up to the line with fft 32 in it? I need to make sure I understand what the problem is.

2013-11-19, 16:13   #2008
flashjh
"Jerry"
Nov 2011
Vancouver, WA

2143₈ Posts

Quote:
Originally Posted by owftheevil View Post
gives an ETA of just over 1200 hours on the 570 while it is simultaneously running stage 1 of CUDAPm1.
Maybe I missed some discussion about the memtest -- is it meant to run for 50 days in conjunction with CUDALucas or CUDAPm1?

Last fiddled with by flashjh on 2013-11-19 at 16:13

2013-11-19, 16:21   #2009
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

Quote:
Originally Posted by flashjh View Post
Maybe I missed some discussion about the memtest -- is it meant to run for 50 days in conjunction with CUDALucas or CUDAPm1?
No, I was just trying to guess at what might have given you such a large ETA.

Could you please try running CUDALucas with

Code:
-memtest 56 1
Add -d 1 if you want it to run on device 1.

2013-11-19, 23:08   #2010
flashjh
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

Recompiled from r43.

-memtest 56 1:
Code:
 
 Initializing memory test using 1400MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.36%, Read 4.73GB/s, Write 1.58GB/s, ETA 11:59:48)
-memtest 35 1:
Code:
Initializing memory test using 875MB of memory on device 0
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.06%, Read 117.14GB/s, Write 39.05GB/s, ETA 3:02:15)
Position 0, Data Type 0, Iteration 20000, Errors: 0, completed 0.11%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:12)
Position 0, Data Type 0, Iteration 30000, Errors: 0, completed 0.17%, Read 117.09GB/s, Write 39.03GB/s, ETA 3:02:06)
Position 0, Data Type 0, Iteration 40000, Errors: 0, completed 0.23%, Read 117.08GB/s, Write 39.03GB/s, ETA 3:02:00)
Position 0, Data Type 0, Iteration 50000, Errors: 0, completed 0.29%, Read 117.07GB/s, Write 39.02GB/s, ETA 3:01:55)
Maybe I'm asking the wrong question -- On the original memtest you wrote, -memtest 56 1 only took a few seconds. Did you rewrite the code to take 12 hours on purpose? Is the test 'updated' to run the way you think it needs to be written for a proper test?

Last fiddled with by flashjh on 2013-11-19 at 23:13

2013-11-20, 14:47   #2011
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

3²·5·7 Posts

Quote:
Originally Posted by flashjh View Post
Maybe I'm asking the wrong question -- On the original memtest you wrote, -memtest 56 1 only took a few seconds. Did you rewrite the code to take 12 hours on purpose? Is the test 'updated' to run the way you think it needs to be written for a proper test?

Yes, kind of. Too few iterations, like 1000, will miss errors in marginal cases, so I made sure enough iterations are done on each part of the memory chunk. However, it's not supposed to last as long on those settings as it does. Also, something's wrong with your output. With 1 as the parameter for iterations, it should not be repeating Data Type 0. And for some reason, it's not reading or writing very fast with 56 for the size of the memory chunk it's testing.

Thanks for posting this. Now I have something to look at.

Edit: How much real time is it taking between screen updates in those two cases?

Edit 2: I have a new version up with a diagnostic line. Could you please try the same thing with the new version when you get a chance?

Last fiddled with by owftheevil on 2013-11-20 at 15:37

2013-11-20, 22:49   #2012
flashjh
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

r46: -memtest 56 1
Code:
C:\CUDA\CuLu\test>CUDALucas_205Betar46 -memtest 56 1
 ------- DEVICE 0 -------
name                GeForce GTX 580
 
Initializing memory test using 1400MB of memory on device 0...
Input: size = 56, iterations = 1
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.36%, Read 4.63GB/s, Write 1.54GB/s, ETA 12:14:50)
Position 0, Data Type 1, Iteration 20000, Errors: 0, completed 0.71%, Read 4.64GB/s, Write 1.55GB/s, ETA 12:12:05)
Position 0, Data Type 2, Iteration 30000, Errors: 0, completed 1.07%, Read 4.61GB/s, Write 1.54GB/s, ETA 12:10:50)
Position 0, Data Type 3, Iteration 40000, Errors: 0, completed 1.43%, Read 4.62GB/s, Write 1.54GB/s, ETA 12:08:32)
Position 0, Data Type 4, Iteration 50000, Errors: 0, completed 1.79%, Read 4.63GB/s, Write 1.54GB/s, ETA 12:05:44)
Observations:

Before, the GPU would stay at 100% usage; now every few seconds it drops to between 20% and 80% and then goes back to 100%.

I timed the last group. CUDALucas says 2:48 elapsed, real time was 2:37.9

-memtest 35 1
Code:
C:\CUDA\CuLu\test>CUDALucas_205Betar46 -memtest 35 1
 ------- DEVICE 0 -------
name                GeForce GTX 580
 
Initializing memory test using 875MB of memory on device 0...
Input: size = 35, iterations = 1
Beginning test.
Position 0, Data Type 0, Iteration 10000, Errors: 0, completed 0.57%, Read 125.03GB/s, Write 41.68GB/s, ETA 16:59)
Position 0, Data Type 1, Iteration 20000, Errors: 0, completed 1.14%, Read 124.94GB/s, Write 41.65GB/s, ETA 16:53)
Position 0, Data Type 2, Iteration 30000, Errors: 0, completed 1.71%, Read 124.69GB/s, Write 41.56GB/s, ETA 16:48)
Position 0, Data Type 3, Iteration 40000, Errors: 0, completed 2.29%, Read 125.06GB/s, Write 41.69GB/s, ETA 16:42)
Position 0, Data Type 4, Iteration 50000, Errors: 0, completed 2.86%, Read 124.73GB/s, Write 41.58GB/s, ETA 16:36)
Position 1, Data Type 0, Iteration 60000, Errors: 0, completed 3.43%, Read 124.89GB/s, Write 41.63GB/s, ETA 16:31)
Observations:

Usage stays at 100%

CUDALucas Time: 5 sec, Timed: 5.8 sec
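
As a rough cross-check of the ETAs in these two logs, here is a Python sketch. It assumes each iteration reads 75 MB (as described earlier in the thread), that 1 MB means 10^6 bytes, and that the run stays at the reported read rate. The program's actual ETA formula is not shown in the thread, so the function names here are hypothetical.

```python
def eta_seconds(k, n, read_gb_per_s, done_iterations=0):
    """Estimate remaining seconds for `-memtest k n` at a steady read rate.

    Assumptions (not taken from the CUDALucas source): 5 data types,
    n * 10000 iterations per data type per position, 75 MB read per
    iteration, and decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes).
    """
    total_iterations = k * 5 * n * 10000
    remaining = total_iterations - done_iterations
    bytes_to_read = remaining * 75e6
    return bytes_to_read / (read_gb_per_s * 1e9)


def format_hms(seconds):
    """Format a duration in seconds as H:MM:SS."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    # Read rates taken from the first status line of each r46 log above.
    print("35 1:", format_hms(eta_seconds(35, 1, 125.03, 10000)))
    print("56 1:", format_hms(eta_seconds(56, 1, 4.63, 10000)))
```

With the r46 figures this gives roughly 17 minutes for -memtest 35 1 and about 12.5 hours for -memtest 56 1, close to the ETAs printed above. That suggests the ETA itself is consistent with the measured bandwidth, and the slow read rate at size 56 is the real problem.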

Last fiddled with by flashjh on 2013-11-20 at 22:55

2013-11-21, 14:48   #2013
owftheevil
"Carl Darby"
Oct 2012
Spring Mountains, Nevada

473₈ Posts

I was able to get into Windows last night to run some tests. I'm seeing the same thing you are. On a 570 with 1250MB of memory,

-memtest 41 1

runs normally. From 42 up to 46 it's very slow, like what you see with 56, and at 47 it can't allocate all the memory and throws a CUDA error.

On Linux, everything is as expected. Up to 47, it runs at full speed with no problems; at 48 it can't allocate the memory.
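
Since each unit of the first parameter covers 25 MB, the sizes in this report translate to memory as in the Python sketch below. `memtest_size_mb` and `largest_k` are hypothetical helper names, and the 1250 MB figure is the card total quoted above, not the amount actually free to allocate.

```python
CHUNK_MB = 25  # MB tested per unit of the first -memtest parameter


def memtest_size_mb(k):
    """Memory, in MB, that `-memtest k n` tries to allocate."""
    return k * CHUNK_MB


def largest_k(available_mb):
    """Largest first parameter that fits in `available_mb` of memory."""
    return available_mb // CHUNK_MB


if __name__ == "__main__":
    for k in (41, 42, 46, 47, 48):
        print(f"-memtest {k} 1 -> {memtest_size_mb(k)} MB")
    print("upper bound for a 1250 MB card:", largest_k(1250))
```

The slow range on Windows therefore starts at 1050 MB, and allocation fails at 1175 MB on Windows and 1200 MB on Linux. Both limits are below the bound computed from the card total, presumably because part of the card's memory is already in use.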

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.