mersenneforum.org  

Old 2010-06-05, 18:43   #265
TheJudger
"Oliver"
Mar 2005
Germany

2127₈ Posts

Hi Jason,

Quote:
Originally Posted by jasonp View Post
Maybe you could include the same kernel compiled two different ways, then choose which one to use based on the number of GPU registers as reported by the Nvidia driver.
I would say that this is not needed. The 71-bit kernel uses 16 registers with and without the 16-register limit.
The 75-bit kernel uses 17/16 registers (limit off/on) and the 95-bit kernel uses 18/16. 17 or 18 is basically the same, since registers are allocated in groups of 4.
I ran some tests on my GTX 275 (compute capability 1.3, 16384 registers per multiprocessor) and didn't notice any difference in performance: it runs at the same speed with and without the 16-register limit.
On my 8400GS (compute capability 1.1, 8192 registers per multiprocessor) it runs ~1% faster with the 16-register limit! I think this is because the occupancy is higher.
16 registers * 256 threads per block = 4096 registers per block ==> two blocks can run at the same time on the same multiprocessor!
(With 192 threads per block I could use 20 registers per thread and run 2 blocks at the same time...)
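
The occupancy arithmetic in this post can be sketched in a few lines of Python. This is only a rough, register-only model: the function name and its parameters are made up for illustration, the rounding to groups of 4 follows the statement above, and other limits (shared memory, maximum threads or blocks per multiprocessor) are ignored.

```python
def blocks_per_multiprocessor(regs_per_thread, threads_per_block,
                              regs_per_mp, granularity=4):
    """Register-only estimate of how many blocks fit on one multiprocessor.

    Assumption: registers are the only limiting resource, and the per-thread
    count is rounded up to a multiple of `granularity` (4, as stated in the post).
    """
    # Round the per-thread register count up to the allocation granularity.
    allocated = -(-regs_per_thread // granularity) * granularity
    regs_per_block = allocated * threads_per_block
    return regs_per_mp // regs_per_block


# 8400GS from the post: 8192 registers per multiprocessor.
for regs, threads in [(16, 256), (17, 256), (18, 256), (20, 192)]:
    print(regs, threads, blocks_per_multiprocessor(regs, threads, 8192))
```

Under this model, 17 or 18 registers round up to 20, which leaves room for only one 256-thread block on an 8192-register multiprocessor, while 16 registers (or 20 registers with 192 threads) allow two.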
Old 2010-06-05, 20:14   #266
TheJudger
"Oliver"
Mar 2005
Germany

10001010111₂ Posts

Quote:
Originally Posted by TheJudger View Post
(With 192 threads per block I could use 20 registers per thread and run 2 blocks at the same time...)
That is less than 0.1% faster than 256 threads per block with the 16-register limit on the 8400GS. Definitely not worth the extra work/code.
Old 2010-06-06, 12:09   #267
TheJudger
"Oliver"
Mar 2005
Germany

11×101 Posts

Hi David,

Quote:
Originally Posted by henryzz View Post
Currently at OBD all the available assignments start at 75 bits or higher. Based on testing up to 70 bits, 75-76 will take me ~8.4 hours. I can't often guarantee that my PC will be running that long in one go, but I would like to help out a bit. Is there any chance of making partial bit levels available or adding some sort of save feature?
mfaktc 0.08 has resume capability. Release is planned for the next few days.

Oliver
Old 2010-06-06, 12:25   #268
henryzz
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)

7×29² Posts

Quote:
Originally Posted by TheJudger View Post
Hi David,

mfaktc 0.08 has resume capability. Release is planned for the next few days.

Oliver
Brilliant news.
I wait expectantly.
Old 2010-06-09, 10:15   #269
Karl M Johnson
Mar 2010

3·137 Posts

How do I force Prime95 to benchmark an exponent with fixed bounds like bit_min and bit_max, similar to mfaktc?
What should the worktodo.txt file look like?
Old 2010-06-09, 17:56   #270
ET_
Banned
"Luigi"
Aug 2002
Team Italia

12D3₁₆ Posts

Quote:
Originally Posted by Karl M Johnson View Post
How do I force Prime95 to benchmark an exponent with fixed bounds like bit_min and bit_max, similar to mfaktc?
What should the worktodo.txt file look like?
worktodo.txt:
Factor=bla,exponent,bitmin,bitmax

Luigi
Old 2010-06-09, 21:38   #271
Karl M Johnson
Mar 2010

3·137 Posts

Prime95 doesn't like that bla. I assume feeding it a random hash of the required length will calm it. Here's an example from PrimeNet: hash,49653607,69,0. Now, why is bitmin 69 and bitmax 0? Is bitmax = 0 = infinity?
Old 2010-06-09, 21:45   #272
Mini-Geek
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA

10AB₁₆ Posts

Quote:
Originally Posted by Karl M Johnson View Post
Prime95 doesn't like that bla. I assume feeding it a random hash of the required length will calm it.
If you don't have an assignment key (that's what's supposed to go in place of the "bla"), either leave it out with no leading comma (Factor=exponent,bitmin,bitmax) or put "N/A" (Factor=N/A,exponent,bitmin,bitmax).
Quote:
Originally Posted by Karl M Johnson View Post
Here's an example from PrimeNet: hash,49653607,69,0. Now, why is bitmin 69 and bitmax 0? Is bitmax = 0 = infinity?
Is that example line from an LL test or DC rather than a TF assignment? For Test= and DoubleCheck= lines, the last field (the ",0") isn't bitmax, it's has_been_pminus1ed (1 if the number has had a P-1 run, 0 if it hasn't).
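
The two Factor= forms described above can be sketched as a small Python helper. The function name and arguments are hypothetical, the bit range 69 to 70 is only an illustration, and the layout is taken from this thread rather than checked against the Prime95 documentation.

```python
def factor_line(exponent, bit_min, bit_max, assignment_key=None):
    """Build a worktodo.txt trial-factoring line in the form given in this thread.

    assignment_key is optional: leave it out entirely (no leading comma),
    or pass "N/A" as a placeholder, as suggested in the post above.
    """
    if bit_min >= bit_max:
        raise ValueError("bit_min must be smaller than bit_max")
    fields = [str(exponent), str(bit_min), str(bit_max)]
    if assignment_key is not None:
        # The key, when present, is the first field after "Factor=".
        fields.insert(0, assignment_key)
    return "Factor=" + ",".join(fields)


print(factor_line(49653607, 69, 70))         # no assignment key
print(factor_line(49653607, 69, 70, "N/A"))  # placeholder key
```

Each printed line would go on its own line in worktodo.txt.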
Old 2010-06-10, 02:42   #273
chalsall
If I May
"Chris Halsall"
Sep 2002
Barbados

2·67·73 Posts

Quote:
Originally Posted by Karl M Johnson View Post
Prime95 doesn't like that bla. I assume feeding it a random hash of the required length will calm it. Here's an example from PrimeNet: hash,49653607,69,0. Now, why is bitmin 69 and bitmax 0? Is bitmax = 0 = infinity?
Personally, I use "DEADBEEFDEADBEEFDEADBEEFDEADBEEF" for "bla". It's legitimate hexadecimal.
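
A quick way to check that claim in Python is sketched below. The helper name is made up, and the default length of 32 is simply the length of this placeholder; the thread does not say whether Prime95 insists on exactly 32 hex digits.

```python
import string


def is_hex_key(key, length=32):
    """Return True if key has the given length and consists only of hex digits.

    Assumption: 32 is just the length of the placeholder quoted above.
    """
    return len(key) == length and all(c in string.hexdigits for c in key)


print(is_hex_key("DEADBEEF" * 4))  # the placeholder from the post
print(is_hex_key("bla"))           # the literal "bla" that Prime95 rejected
```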
Old 2010-06-10, 08:52   #274
TheJudger
"Oliver"
Mar 2005
Germany

11×101 Posts

Hi,

find attached mfaktc 0.08.

Highlights:
- 2 new GPU kernels for factors up to 2^75 and 2^95 (above 2^90 isn't tested very well)
- resume capability

For details take a look at Changelog.txt and README.txt.

Thank you Luigi (ET_) and Kevin (kjaget) for testing and comments!

Oliver
Attached Files
File Type: gz mfaktc-0.08.tar.gz (76.4 KB, 120 views)
Old 2010-06-10, 08:54   #275
TheJudger
"Oliver"
Mar 2005
Germany

11·101 Posts

Hi Karl,

Quote:
Originally Posted by Karl M Johnson View Post
Prime95 doesn't like that bla. I assume feeding it a random hash of the required length will calm it. Here's an example from PrimeNet: hash,49653607,69,0. Now, why is bitmin 69 and bitmax 0? Is bitmax = 0 = infinity?
I think you have to override the factor defaults. Prime95 sets the upper limit automatically unless you override it. I would take a look at undoc.txt and search for "factor override" in the forum/web.

Oliver
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.