mersenneforum.org  

Old 2012-03-31, 00:10   #1750
TheJudger
 
 
"Oliver"
Mar 2005
Germany

11×101 Posts

Quote:
Originally Posted by Prime95
Does anyone know if add.cc runs on 168 cores, or does it get restricted to 32 or, even worse, 8 cores?

I don't know yet. I'm still waiting to get my hands on a GTX 680.

Quote:
Originally Posted by Prime95
In general, how does one know which PTX instructions map to actual hardware instructions? If it's emulated, how does one see which instructions are used to emulate the PTX instruction?

This is one of NVIDIA's secrets.

For GK104 they totally crippled int32 performance in a lot of ways.
I guess that I need to code new kernels for CC 3.0, but this may take some time. For mfaktc the reduced number of registers (per core) and the reduced L1/shared memory (per core) are OK, but the crippled int32 performance is really bad.

Oliver
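
For readers unfamiliar with the instruction being discussed: add.cc is the PTX add that also writes the carry flag, and addc consumes that flag, which is how multi-word integer additions are chained. Below is a minimal Python emulation of such a carry chain. It is not mfaktc's actual code; the three-word (96-bit) layout and the helper names add_cc, add96, to_words and from_words are assumptions made for illustration, and the sketch says nothing about how fast the hardware executes these instructions.

```python
MASK32 = 0xFFFFFFFF


def add_cc(a, b, carry_in=0):
    """Emulate a 32-bit add that returns (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add96(x, y):
    """Add two 96-bit values stored as three 32-bit words, least significant first."""
    d0, carry = add_cc(x[0], y[0])         # plays the role of add.cc
    d1, carry = add_cc(x[1], y[1], carry)  # plays the role of addc.cc
    d2, _ = add_cc(x[2], y[2], carry)      # plays the role of addc; final carry dropped
    return [d0, d1, d2]


def to_words(n):
    """Split an integer below 2**96 into three 32-bit words."""
    return [(n >> (32 * i)) & MASK32 for i in range(3)]


def from_words(words):
    """Reassemble an integer from 32-bit words, least significant first."""
    return sum(word << (32 * i) for i, word in enumerate(words))
```

Each step depends on the carry of the previous one, so the three adds form a serial chain per thread; the throughput question raised above is about how many such adds the hardware can issue per clock.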
Old 2012-04-01, 03:43   #1751
nucleon
 
 
Mar 2003
Melbourne

5×103 Posts

And to make the LL vs. TF calculations that much harder... Re: the comment made about RAM selection for video cards.

Can we gather any DC LL stats yet on GPUs? Are they matching more/less often?

-- Craig
Old 2012-04-15, 16:32   #1752
TheJudger
 
 
"Oliver"
Mar 2005
Germany

2127₈ Posts

Hi,

since Kepler "light" aka GK104 sucks at integer code: any idea whether it is feasible to use a couple of 32-bit floats for "small long integers" or not? Primarily I need addition, subtraction and multiplication of integers with ~80/160 bits of data.

Oliver
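
A rough sketch of the idea in the question above, written in Python rather than CUDA: a 32-bit float holds integers exactly only up to 2^24, so limbs have to be small enough that a product plus carries stays below that bound. The 11-bit limb width, the 8-limb (88-bit) operand size and all function names are assumptions chosen for illustration; f32 emulates single-precision rounding via struct. Whether this would beat integer code on GK104 is not something this sketch can answer.

```python
import math
import struct

LIMB_BITS = 11                  # assumed limb width: (2**11 - 1)**2 plus two limbs < 2**24
BASE = float(1 << LIMB_BITS)


def f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_float_limbs(n, count):
    """Split a non-negative integer into `count` limbs stored as floats."""
    mask = (1 << LIMB_BITS) - 1
    return [float((n >> (LIMB_BITS * i)) & mask) for i in range(count)]


def from_float_limbs(limbs):
    """Reassemble an integer from float limbs, least significant first."""
    return sum(int(v) << (LIMB_BITS * i) for i, v in enumerate(limbs))


def mul_float_limbs(a, b):
    """Schoolbook multiplication where every intermediate passes through f32."""
    r = [0.0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0.0
        for j, bj in enumerate(b):
            # product + old limb + carry <= 2**22 - 1, so nothing is rounded away
            t = f32(f32(ai * bj) + f32(r[i + j] + carry))
            hi = float(math.floor(f32(t / BASE)))
            r[i + j] = f32(t - hi * BASE)
            carry = hi
        r[i + len(b)] = carry
    return r
```

With 11-bit limbs an 80-bit operand needs 8 floats and a 160-bit one needs 15, so the limb count (and with it the number of multiplies) is noticeably higher than with 32-bit integer words.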
Old 2012-04-15, 17:06   #1753
Bdot
 
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by TheJudger
Hi,

since Kepler "light" aka GK104 sucks at integer code: any idea whether it is feasible to use a couple of 32-bit floats for "small long integers" or not? Primarily I need addition, subtraction and multiplication of integers with ~80/160 bits of data.

Oliver

My recent 5x15-bit Barrett kernel could probably be a good start, as you would have only 23 mantissa bits to use. Addition and subtraction should be no problem, but multiplication would need some work, as I rely on 32 bits for the result of a 15x15-bit multiplication plus a carry of up to 17 bits in one mad24. Certainly a solvable problem.

But I have no idea if that would be any faster than pure integer ...
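
To make the bit budget above concrete, here is a minimal Python sketch of a multiplication on five 15-bit limbs using an emulated mad24. It is not Bdot's actual kernel: the row-wise schoolbook layout, the emulated mad24 helper and the other function names are assumptions for illustration. A 15x15-bit product is below 2^30, so adding a carry of up to 17 bits still fits in 32 bits; the same product does not fit in a float's mantissa, which is why the multiplication would need rework.

```python
LIMB15_BITS = 15
LIMB15_MASK = (1 << LIMB15_BITS) - 1


def mad24(a, b, c):
    """Emulate an unsigned mad24: 24-bit x 24-bit multiply plus add, low 32 bits kept."""
    return ((a & 0xFFFFFF) * (b & 0xFFFFFF) + c) & 0xFFFFFFFF


def to_limbs15(n, count=5):
    """Split a non-negative integer into `count` 15-bit limbs, least significant first."""
    return [(n >> (LIMB15_BITS * i)) & LIMB15_MASK for i in range(count)]


def from_limbs15(limbs):
    """Reassemble an integer from 15-bit limbs."""
    return sum(v << (LIMB15_BITS * i) for i, v in enumerate(limbs))


def mul_limbs15(a, b):
    """Row-wise schoolbook product; each step is one multiply-add that fits in 32 bits."""
    r = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = mad24(ai, bj, r[i + j] + carry)  # stays below 2**30 in this layout
            r[i + j] = t & LIMB15_MASK
            carry = t >> LIMB15_BITS
        r[i + len(b)] = carry
    return r
```

Five limbs cover 75 bits per operand, and the ten-limb result holds the full 150-bit product before any Barrett reduction step.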
Old 2012-04-16, 13:48   #1754
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

7²·197 Posts

@bcp19: Would you like to post the version of mfaktc that you used to TF M2137, M2267 and M2273 from 60 to 61 bits? I am curious how you split the classes, as it is not enough to modify the limit in the 0.18 source and recompile.

Anyhow, doing those TFs makes no sense: given how much ECM has been done in that area, there should be no factors below 180 bits. One could submit "fake reports" there and get a lot of TF credit (thousands of GHz-days), for example by reporting all exponents as TF-ed to 70 or 80 bits, and it would still have no negative influence on the project. Judging from the percentage of ECM done, there is no factor below 40 to 60 digits (depending on the exponent) in that range of exponents below 5000, which means no factor below 120-180 bits. So theoretically, if I wanted to surpass Xyzzy and Nucleon at TF, I could report all of them to 65 bits without negatively influencing the project as a whole. But this is childish.

So please, could you post the "test" version of mfaktc that you used to do those tests? No harm intended, I am just curious how you solved the problem.
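
As a quick check of the digits-to-bits conversion used above, here is a small Python sketch; the function name bit_range_for_digits is made up for illustration. The 120-180 figure corresponds to roughly 3 bits per decimal digit, while the exact ratio is log2(10), about 3.32 bits per digit, so the true bit sizes come out somewhat larger and the argument only gets stronger.

```python
def bit_range_for_digits(digits):
    """Return (min_bits, max_bits) over all integers with exactly `digits` decimal digits."""
    smallest = 10 ** (digits - 1)
    largest = 10 ** digits - 1
    return smallest.bit_length(), largest.bit_length()


# Bit sizes for the 40-digit and 60-digit bounds mentioned in the post
forty = bit_range_for_digits(40)
sixty = bit_range_for_digits(60)
```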
Old 2012-04-16, 16:25   #1755
bcp19
 
 
Oct 2011

1010100111₂ Posts

Quote:
Originally Posted by LaurV
@bcp19: Would you like to post the version of mfaktc that you used to TF M2137, M2267 and M2273 from 60 to 61 bits? I am curious how you split the classes, as it is not enough to modify the limit in the 0.18 source and recompile.

Anyhow, doing those TFs makes no sense: given how much ECM has been done in that area, there should be no factors below 180 bits. One could submit "fake reports" there and get a lot of TF credit (thousands of GHz-days), for example by reporting all exponents as TF-ed to 70 or 80 bits, and it would still have no negative influence on the project. Judging from the percentage of ECM done, there is no factor below 40 to 60 digits (depending on the exponent) in that range of exponents below 5000, which means no factor below 120-180 bits. So theoretically, if I wanted to surpass Xyzzy and Nucleon at TF, I could report all of them to 65 bits without negatively influencing the project as a whole. But this is childish.

So please, could you post the "test" version of mfaktc that you used to do those tests? No harm intended, I am just curious how you solved the problem.

It's a special version that I asked TheJudger to make for me back in December or January, so I can't tell you what was changed: he did the work, compiled it and sent me the exe. I tested it on around 5000 exponents with known factors, giving him feedback to make a few tweaks, before I started using it full time. It's only able to use about 1/3 of a 450, so the run time is around 4.5 days on those 2k exponents, and it has been running since early Feb doing some 20-40k exps, but it's slow enough that I only tend to check it once or twice a month.
Old 2012-04-17, 06:17   #1756
LaurV
Romulan Interpreter
 
 
Jun 2011
Thailand

7²·197 Posts

That is wonderful! May I be part of the "testing team"? I would like to play with lower exponents too. Please send me some win-64 binaries and tell me what expos to attack so we don't step on each other's toes.

Please see this post here too, related to this problem.
Old 2012-04-17, 17:04   #1757
bcp19
 
 
Oct 2011

7×97 Posts

Quote:
Originally Posted by LaurV
That is wonderful! May I be part of the "testing team"? I would like to play with lower exponents too. Please send me some win-64 binaries and tell me what expos to attack so we don't step on each other's toes.

Please see this post here too, related to this problem.

PM me your e-mail addy and I'll send it out. After the 2-3k run, I'll finish the 20-40k's and then probably work on something between 100k and 1M.
Old 2012-04-17, 17:19   #1758
diamonddave
 
 
Feb 2004

2⁵·5 Posts

Quote:
Originally Posted by bcp19
PM me your e-mail addy and I'll send it out. After the 2-3k run, I'll finish the 20-40k's and then probably work on something between 100k and 1M.

Why don't we make that version available here? I know I would also like to play with it!

http://www.mersenneforum.org/mfaktc/
Old 2012-04-17, 17:39   #1759
bcp19
 
 
Oct 2011

7×97 Posts

Quote:
Originally Posted by diamonddave
Why don't we make that version available here? I know I would also like to play with it!

http://www.mersenneforum.org/mfaktc/

I don't know how to put it on there. Also, there are no DOC files to go with it.

Last fiddled with by bcp19 on 2012-04-17 at 17:48
Old 2012-04-17, 17:46   #1760
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

7²·197 Posts

I think the author is the one who should decide whether to make it public or not. Of course I would like to have it and play with it, but maybe he has a good reason for not making it public. It could still be under test or under development. I tried to modify it myself in the past and found that it is not enough to change the lower-limit constraint; in fact, because there are many more candidates to test when the exponent is small, the modification is not easy to do.
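
To illustrate why small exponents mean many more candidates: any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1, so for a fixed bit range the number of possible k values grows as p shrinks. The Python sketch below counts raw k values before any sieving; it is not how mfaktc enumerates or classes its candidates, the function names are made up for illustration, and the 50,000,000 comparison value is a stand-in for scale only (it is not a prime exponent).

```python
def candidate_k_range(p, bit_lo, bit_hi):
    """Return (k_min, k_max) such that q = 2*k*p + 1 lies in [2**bit_lo, 2**bit_hi)."""
    k_min = -(-((1 << bit_lo) - 1) // (2 * p))  # ceil((2**bit_lo - 1) / (2*p))
    k_max = ((1 << bit_hi) - 2) // (2 * p)      # largest k with q <= 2**bit_hi - 1
    return k_min, k_max


def count_candidates(p, bit_lo, bit_hi):
    """Number of raw k values in the bit range, before any sieving."""
    k_min, k_max = candidate_k_range(p, bit_lo, bit_hi)
    return k_max - k_min + 1


def divides_mersenne(p, q):
    """True if q divides 2**p - 1, checked by modular exponentiation."""
    return pow(2, p, q) == 1


small = count_candidates(2137, 60, 61)        # exponent taken from the thread
large = count_candidates(50_000_000, 60, 61)  # stand-in magnitude, for scale only
```

Under these assumptions the 60-to-61-bit range for M2137 contains over twenty thousand times as many raw k values as the same range for an exponent near 50 million.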
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.