mfaktc: a CUDA program for Mersenne prefactoring (https://www.mersenneforum.org/showthread.php?t=12827)

TheJudger 2012-03-31 00:10

[QUOTE=Prime95;294466]Does anyone know if add.cc runs on 168 cores or does it get restricted to 32 or even worse 8 cores??[/QUOTE]
I don't know yet. I'm still waiting to get my hands on a GTX 680.

[QUOTE=Prime95;294466]In general, how does one know which PTX instructions map to actual hardware instructions? If it's emulated, how does one see which instructions are used to emulate the PTX instruction?[/QUOTE]
This is one of NVIDIA's secrets. :sad:

For GK104 they totally crippled int32 performance in a lot of ways. :furious:
I guess that I need to code new kernels for CC 3.0, but this may take some time. For mfaktc the reduced number of registers (per core) and the reduced L1/shared memory (per core) are OK, but the crippled int32 performance is really bad.

Oliver

nucleon 2012-04-01 03:43

And to make the LL vs. TF calculations that much harder... Re: the comment made about RAM selection for video cards.

Can we gather any DC LL stats yet on GPUs? Are they matching more/less often?

-- Craig

TheJudger 2012-04-15 16:32

Hi,

since Kepler "light" aka GK104 sucks at integer code: any idea whether it is feasible to use a couple of 32-bit floats for "small long integers" or not? Primarily I need addition, subtraction and multiplication of integers with ~80/160 bits of data.

Oliver

Bdot 2012-04-15 17:06

[QUOTE=TheJudger;296462]Hi,

since Kepler "light" aka GK104 sucks at integer code: any idea whether it is feasible to use a couple of 32-bit floats for "small long integers" or not? Primarily I need addition, subtraction and multiplication of integers with ~80/160 bits of data.

Oliver[/QUOTE]

My recent 5x15-bit Barrett kernel could probably be a good start, as you would have only 23 mantissa bits to use. Addition and subtraction should be no problem, but multiplication would need some work, as I rely on 32 bits for the result of a 15x15-bit multiplication plus a carry of up to 17 bits in one mad24. Certainly a solvable problem.

But I have no idea if that would be any faster than pure integer ...
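
To make the mantissa constraint concrete, here is a minimal Python sketch (not mfaktc or Bdot's kernel code) that emulates single-precision rounding with `struct` and checks which limb widths keep a limb-by-limb product exact. It assumes IEEE 754 float32 with a 24-bit significand (23 stored bits plus the implicit leading one); the helper names `f32`, `product_is_exact`, `to_limbs`, `from_limbs` and `mul_float_limbs` are made up for illustration. Under these assumptions 12-bit limbs are the widest that survive a schoolbook multiply with carry, so 15-bit limbs would have to be split or the products handled differently.

```python
import math
import struct

def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def product_is_exact(limb_bits):
    """True if the largest limb x limb product survives float32 rounding."""
    m = (1 << limb_bits) - 1
    return f32(float(m * m)) == float(m * m)

def to_limbs(n, limb_bits, count):
    """Split a non-negative integer into `count` float limbs, least significant first."""
    mask = (1 << limb_bits) - 1
    return [float((n >> (i * limb_bits)) & mask) for i in range(count)]

def from_limbs(limbs, limb_bits):
    """Reassemble an integer from float limbs."""
    return sum(int(v) << (i * limb_bits) for i, v in enumerate(limbs))

def mul_float_limbs(a, b, limb_bits):
    """Schoolbook multiply; every intermediate value is forced through float32."""
    base = float(1 << limb_bits)
    out = [0.0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0.0
        for j, bj in enumerate(b):
            # product + previous digit + carry stays below base**2
            t = f32(f32(ai * bj) + f32(out[i + j] + carry))
            carry = float(math.floor(t / base))
            out[i + j] = f32(t - carry * base)
        out[i + len(b)] = carry
    return out

# An ~80-bit operand needs 7 limbs of 12 bits (84 bits of capacity).
x = (1 << 80) - 12345
y = (1 << 79) + 987654321
product = from_limbs(mul_float_limbs(to_limbs(x, 12, 7), to_limbs(y, 12, 7), 12), 12)
print(product == x * y)
```

Whether such a float-limb multiply would beat integer code on GK104 is a separate question that this sketch does not address; it only shows that the arithmetic can be kept exact.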

LaurV 2012-04-16 13:48

@bcp19: Would you like to post the version of mfaktc that you used to TF M2137, M2267 and M2273 from 60 to 61 bits? I am curious how you split the classes, as it is not enough to modify the limit in the 0.18 source and recompile. Anyhow, doing those TFs makes no sense: given how much ECM has been done in that area, there should be no factors below 180 bits. One could do "fake reports" there and get a lot of TF credit (thousands of GHzDays), like reporting all exponents TF-ed to 70 or 80 bits, and still have no negative influence on the project. Judging from the percentage of ECM done, there is no factor below 40 to 60 digits (depending on the exponent) in that range of exponents below 5000, which means no factor below 120-180 bits. So theoretically, if I wanted to surpass Xyzzy and Nucleon at TF, I could report all of them to 65 bits without negatively influencing the project as a whole. But this is childish. So please, could you post the "test" version of mfaktc that you used to do those tests? No harm intended, I am just curious how you solved the problem.
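
As a rough check on the digits-to-bits conversion above, the following small Python sketch computes the bit length of 10^d exactly. The helper name `digit_bound_in_bits` is made up for illustration. Since log2(10) is about 3.32, the 120-180 bit figure in the post corresponds to about 3 bits per digit, which is a conservative rounding; the bounds come out somewhat higher.

```python
def digit_bound_in_bits(digits):
    """Bit length of 10**digits, i.e. the bit level below which every
    number with at most `digits` decimal digits must lie."""
    return (10 ** digits).bit_length()

for d in (40, 60):
    print(d, "digits ->", digit_bound_in_bits(d), "bits")
```

Either way, a TF run from 60 to 61 bits sits far below the level that the ECM work has already covered.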

bcp19 2012-04-16 16:25

[QUOTE=LaurV;296501]@bcp19: Would you like to post the version of mfaktc that you used to TF M2137, M2267 and M2273 from 60 to 61 bits? I am curious how you split the classes, as it is not enough to modify the limit in the 0.18 source and recompile. Anyhow, doing those TFs makes no sense: given how much ECM has been done in that area, there should be no factors below 180 bits. One could do "fake reports" there and get a lot of TF credit (thousands of GHzDays), like reporting all exponents TF-ed to 70 or 80 bits, and still have no negative influence on the project. Judging from the percentage of ECM done, there is no factor below 40 to 60 digits (depending on the exponent) in that range of exponents below 5000, which means no factor below 120-180 bits. So theoretically, if I wanted to surpass Xyzzy and Nucleon at TF, I could report all of them to 65 bits without negatively influencing the project as a whole. But this is childish. So please, could you post the "test" version of mfaktc that you used to do those tests? No harm intended, I am just curious how you solved the problem.[/QUOTE]

It's a special version that I asked TheJudger to make for me back in December or January, so I am unable to tell you what was changed, as he did the work, compiled it and sent me the exe. I tested it on around 5000 exponents with known factors, giving him feedback to make a few tweaks, before I started using it full time. It's only able to use about 1/3 of a 450, so the run time is around 4.5 days on those 2k exponents, and it has been running since early Feb doing some 20-40k exps, but it's slow enough that I only tend to check it once or twice a month.

LaurV 2012-04-17 06:17

That is wonderful! May I be part of the "testing team"? I would like to play with lower exponents too. Please send me some win-64 binaries and tell me what expos to attack so we don't step on each other's toes.

Please see [URL="http://www.mersenneforum.org/showpost.php?p=296593&postcount=10"]this post here[/URL] too, related to this problem.

bcp19 2012-04-17 17:04

[QUOTE=LaurV;296594]That is wonderful! May I be part of the "testing team"? I would like to play with lower exponents too. Please send me some win-64 binaries and tell me what expos to attack so we don't step on each other's toes.

Please see [URL="http://www.mersenneforum.org/showpost.php?p=296593&postcount=10"]this post here[/URL] too, related to this problem.[/QUOTE]

PM me your e-mail addy and I'll send it out. After the 2-3k run, I'll finish the 20-40k's and then probably work on something between 100k and 1M.

diamonddave 2012-04-17 17:19

[QUOTE=bcp19;296646]PM me your e-mail addy and I'll send it out. After the 2-3k run, I'll finish the 20-40k's and then probably work on something between 100k and 1M.[/QUOTE]

Why don't we make that version available here? I know I would also like to play with it! :smile:

[URL="http://www.mersenneforum.org/mfaktc/"]http://www.mersenneforum.org/mfaktc/[/URL]

bcp19 2012-04-17 17:39

[QUOTE=diamonddave;296647]Why don't we make that version available here? I know I would also like to play with it! :smile:

[URL]http://www.mersenneforum.org/mfaktc/[/URL][/QUOTE]

I don't know how to put it on there. Also, there are no DOC files to go with it.

LaurV 2012-04-17 17:46

I think that the author is the one who can decide whether he wants to make it public or not. Of course I would like to have it, to play with it, but maybe he has a good reason why he didn't make it public. It could still be under test, or under development. I tried in the past to modify it myself, and found that it is not enough to change the lower limit constraint. In fact, because there are many more candidates to test when the expo is small, the modification is not easy to do.
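
To illustrate why small exponents mean many more candidates, here is a minimal Python sketch. It is not how mfaktc is implemented and it ignores sieving and class splitting. It relies on the standard facts that any factor q of 2^p - 1 (p an odd prime) has the form q = 2kp + 1 with q ≡ ±1 (mod 8), so the number of k values in one bit level scales like 2^bits / (2p). The function names `k_range`, `candidate_count` and `trial_factor` are made up for illustration, and 57885161 is used only as a large prime exponent for comparison.

```python
def k_range(p, lo_bits, hi_bits):
    """Smallest and largest k with 2**lo_bits <= 2*k*p + 1 < 2**hi_bits."""
    two_p = 2 * p
    k_min = max(1, -(-((1 << lo_bits) - 1) // two_p))  # ceiling division
    k_max = ((1 << hi_bits) - 2) // two_p
    return k_min, k_max

def candidate_count(p, lo_bits, hi_bits):
    """Number of raw candidates q = 2*k*p + 1 in the bit range, before any sieving."""
    k_min, k_max = k_range(p, lo_bits, hi_bits)
    return max(0, k_max - k_min + 1)

def trial_factor(p, lo_bits, hi_bits):
    """Naive trial factoring of 2**p - 1; only practical for tiny ranges."""
    k_min, k_max = k_range(p, lo_bits, hi_bits)
    found = []
    for k in range(k_min, k_max + 1):
        q = 2 * k * p + 1
        # q divides 2**p - 1 exactly when 2**p mod q == 1
        if q % 8 in (1, 7) and pow(2, p, q) == 1:
            found.append(q)
    return found

print(trial_factor(11, 1, 7))             # divisors of M11 = 2047 below 2**7
print(candidate_count(2137, 60, 61))      # small exponent, 60 to 61 bits
print(candidate_count(57885161, 60, 61))  # large exponent, same bit level
```

For the same bit level, M2137 has roughly 27,000 times as many raw candidates as the large exponent, which is consistent with the long run times reported earlier in the thread.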

