mersenneforum.org  

Old 2013-05-03, 09:36   #188
firejuggler
 
 
Apr 2010
Over the rainbow

2·1,303 Posts
Default

Grab the Windows binary; there is an ini file in it that might help you.
A higher FFT length, perhaps?

Last fiddled with by firejuggler on 2013-05-03 at 09:36
Old 2013-05-03, 10:23   #189
Karl M Johnson
 
 
Mar 2010

3×137 Posts
Default

With e=12, d=2310 and nrp=480, the largest exponent that can be checked with the current binary is 14,155,777.
The next exponent, 14,155,807, can't proceed to stage 2.

The actual vRAM usage of CUDAPm1 for exponent 14,155,777 is ~3073MB (measured as a delta in MSI Afterburner), while the approximate usage reported by the program was 3014MB.

The conclusion of this micro-research: if the reported approximate memory usage is >=3139MB, stage 2 will not work, even if your card has a lot more memory than that.

Proof:
http://i.imgur.com/iUpQaMr.png
http://i.imgur.com/W8fqlWQ.png

Last fiddled with by Karl M Johnson on 2013-05-03 at 10:27
Old 2013-05-03, 11:57   #190
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

D5D₁₆ Posts
Default

Quote:
Originally Posted by frmky View Post
If the factor is found in stage 1, should the value of B2 in the output be equal to B1 as in the following:
M55824233 has a factor: 833043841114609831879 (P-1, B1=839, B2=839, e=6, n=3072K CUDAPm1 v0.00)
If so, that's an easy change.
Yes, please, it would be helpful if the results indicated that.
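To make the request concrete, here is a minimal sketch of a result formatter that reports B2 equal to B1 when the factor turns up in stage 1. The helper name `format_pm1_factor`, its parameters, and the hard-coded version string are assumptions for illustration only; this is not taken from the CUDAPm1 source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not from CUDAPm1: writes a P-1 result line into out.
 * If the factor was found in stage 1, stage 2 never ran, so B2 is reported
 * as equal to B1. The factor is passed as a decimal string because it does
 * not fit in a 64-bit integer. Returns the snprintf result. */
int format_pm1_factor(char *out, size_t out_size, unsigned long exponent,
                      const char *factor, unsigned long b1, unsigned long b2,
                      int found_in_stage1, int e, int fft_k)
{
    assert(out != NULL && out_size > 0);
    /* The factor string must consist of decimal digits only. */
    assert(strspn(factor, "0123456789") == strlen(factor));

    if (found_in_stage1)
        b2 = b1;

    /* The version string is hard-coded here purely for illustration. */
    return snprintf(out, out_size,
                    "M%lu has a factor: %s (P-1, B1=%lu, B2=%lu, e=%d, n=%dK CUDAPm1 v0.00)",
                    exponent, factor, b1, b2, e, fft_k);
}
```

Called with the values from the quoted line and `found_in_stage1` set, whatever B2 had been planned is replaced by 839 in the output.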
Old 2013-05-03, 12:06   #191
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

11·311 Posts
Default

Quote:
Originally Posted by frmky View Post
here's the next version to try.
Starting a new run looks better than last time:
Code:
Selected B1=560000, B2=14280000, 3.55% chance of finding a factor
CUDA reports 781M of 1279M GPU memory free.
Using e=6, d=2310, nrp=12
Using approximately 744M GPU memory.
Starting stage 1 P-1, M60817711, B1 = 560000, B2 = 14280000, e = 6, fft length = 3360K
Doing 807829 iterations
I'll let it run and see if it finds the known stage2 factor.
Old 2013-05-03, 14:02   #192
kjaget
 
 
Jun 2005

81₁₆ Posts
Default

Quote:
Originally Posted by frmky View Post
Still don't have the motivation to track down the problem reading text from ini files

Remove the #define sscanf sscanf_s line from parse.c. sscanf_s requires each string variable scanned into to be followed by an extra argument giving the size of its buffer, but the sscanf call in IniGetStr does not pass one. As a result, the sscanf_s bounds check picks up a random uninitialized value off the stack as the length of the destination string, which leads to random failures.

A real fix would be a wrapper, like the sprintf() one, that includes this size parameter in the call to sscanf_s. Or just ignore the safe version of this function, since it is more trouble than it is worth.
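A minimal sketch of that wrapper idea, under stated assumptions: `ini_scan_str` is a hypothetical helper (the real IniGetStr code is not shown in this thread), and the "key=value" line layout is assumed. On MSVC it passes the buffer size that sscanf_s expects; elsewhere it falls back to plain sscanf with an explicit field width, which gives the same bound.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the actual parse.c code: copies the value of a
 * "key=value" line into dest without writing more than dest_size bytes.
 * Returns 1 if a value was read, 0 otherwise. */
int ini_scan_str(const char *line, char *dest, size_t dest_size)
{
    assert(line != NULL && dest != NULL);
    if (dest_size < 2)
        return 0; /* no room for even one character plus the terminator */

#ifdef _MSC_VER
    /* sscanf_s wants the buffer size (as unsigned) immediately after each
     * %s destination; leaving it out is the bug described above. */
    return sscanf_s(line, "%*[^=]=%s", dest, (unsigned)dest_size) == 1;
#else
    /* Portable fallback: build a format such as "%*[^=]=%15s" so that
     * sscanf itself limits how many characters it stores. */
    char fmt[32];
    snprintf(fmt, sizeof fmt, "%%*[^=]=%%%zus", dest_size - 1);
    return sscanf(line, fmt, dest) == 1;
#endif
}
```

With a 4-byte destination, scanning `k=abcdefgh` stores only `abc` plus the terminator instead of overrunning the buffer.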

Last fiddled with by kjaget on 2013-05-03 at 14:04
Old 2013-05-03, 15:45   #193
Stef42
 
Feb 2012
the Netherlands

2×29 Posts
Default

I'm getting a lot of cudaDeviceSynchronize() error 30...
Usually with high B2 values, while only 400-500MB is in use (low exponents).
A possible explanation: http://stackoverflow.com/questions/1...d-kernel-calls

Last fiddled with by Stef42 on 2013-05-03 at 15:46
Old 2013-05-03, 17:34   #194
James Heinrich
 
 
"James Heinrich"
May 2004
ex-Northern Ontario

11×311 Posts
Default

Quote:
Originally Posted by James Heinrich View Post
I'll let it run and see if it finds the known stage2 factor.
It did:
Code:
Stage 2 complete, estimated total time = 2:57:29
Accumulated Product: M60817711, 0x978923630c42303f, n = 3360K, CUDAPm1 v0.00
Starting stage 2 gcd.
M60817711 has a factor: 3493866477323309653137460319 (P-1, B1=560000, B2=14280000, e=6, n=3360K CUDAPm1 v0.00)
4.212GHz-days in 2h57m29s = 34GHz-days/day. A far cry from the ~400GHd/d the GTX570 can push in mfaktc, but also notably faster than can be done on my CPU.
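The throughput figure is just the credited work scaled from the elapsed time to a full day. A small sketch of that arithmetic, with `ghz_days_per_day` as a name made up for this example:

```c
#include <assert.h>

/* Convert an amount of work (in GHz-days) and a wall-clock duration into
 * throughput in GHz-days per day. */
double ghz_days_per_day(double ghz_days, int hours, int minutes, int seconds)
{
    const double seconds_per_day = 86400.0;
    double elapsed = hours * 3600.0 + minutes * 60.0 + seconds;
    assert(elapsed > 0.0);
    return ghz_days * seconds_per_day / elapsed;
}
```

For this run, `ghz_days_per_day(4.212, 2, 57, 29)` works out to roughly 34.2, so the ~400GHd/d mfaktc figure is about 12 times higher.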
Old 2013-05-03, 18:26   #195
frmky
 
 
Jul 2003
So Cal

2²·23² Posts
Default

Quote:
Originally Posted by Karl M Johnson View Post
As a conclusion of this micro research, if you see approx. memory usage of >=3139MB, be sure that stage 2 will not work, even if you have a lot more than that.
Perhaps because it is a 32-bit binary? I'll try creating a 64-bit binary later today.
Old 2013-05-03, 18:27   #196
frmky
 
 
Jul 2003
So Cal

2²×23² Posts
Default

Quote:
Originally Posted by kjaget View Post
Remove the #define sscanf sscanf_s line from parse.c.
Thanks!
Old 2013-05-04, 06:18   #197
frmky
 
 
Jul 2003
So Cal

100001000100₂ Posts
Default

New versions ...
Win32:
https://www.dropbox.com/s/alz4xodjje...2_20130503.zip
x64:
https://www.dropbox.com/s/gbs9pr3ily...4_20130503.zip

The x64 version should allow you to use more than 3GB (or 4GB; I'm not sure which limit applies to GPU RAM) of memory if your card has that much. Also, the GCD at the end will likely be a bit faster, but it doesn't really take that long anyway. As usual, please let me know of any problems.
Old 2013-05-04, 06:21   #198
frmky
 
 
Jul 2003
So Cal

4104₈ Posts
Default

Quote:
Originally Posted by Stef42 View Post
I'm getting a lot of cudaDeviceSynchronize() error 30...
Usually on high B2 value's while only 400-500MB is used (low exponents).
Why this might have happened: http://stackoverflow.com/questions/1...d-kernel-calls
Hmmm. Try the 64-bit version to see if it makes any difference. If it persists, we can try adding cudaDeviceSynchronize() as well, but that seemed to be hit-or-miss in the discussions.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.