mersenneforum.org  

Old 2012-08-30, 18:25   #1882
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts

Doesn't shorter run time correspond to higher exponents?
Old 2012-08-30, 18:28   #1883
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

Quote:
Originally Posted by kladner
Doesn't shorter run time correspond to higher exponents?

Typically, yes, although doing lower bit levels decreases the run time as well. It helps that the higher exponents are at a lower bit level anyway, so you get "double the savings", so to speak.
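
A rough way to see both effects is to count the raw factor candidates in one bit level. Any factor of 2^p - 1 has the form q = 2kp + 1, so the number of candidates between 2^(b-1) and 2^b halves when the exponent doubles, and doubles with each extra bit. The sketch below is only an illustration: the function name and the sample exponents are made up for this example, and it ignores sieving and everything else mfaktc actually does.

```python
def candidate_count(p: int, bits: int) -> int:
    """Count k such that q = 2*k*p + 1 lies in [2**(bits-1), 2**bits)."""
    lo, hi = 2 ** (bits - 1), 2 ** bits
    k_min = max(1, -(-(lo - 1) // (2 * p)))  # ceil((lo - 1) / (2p))
    k_max = (hi - 2) // (2 * p)              # largest k with 2kp + 1 < hi
    return max(0, k_max - k_min + 1)


if __name__ == "__main__":
    # Hypothetical sample exponents, chosen only to show the scaling.
    for p in (50_000_000, 100_000_000):
        for bits in (69, 70):
            print(p, bits, candidate_count(p, bits))
```

Doubling the exponent at a fixed bit level roughly halves the count, and dropping one bit level halves it again, which is the "double the savings" mentioned above.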
Old 2012-08-30, 18:57   #1884
NormanRKN
 
Jul 2012
Saarland / Germany

68₁₀ Posts

OK, I understand.
Thank you, guys!

Norman
Old 2012-08-31, 04:04   #1885
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

10010110111111₂ Posts

Quote:
Originally Posted by TheJudger
Yes, it is possible!

How about a faster 67.13-bit kernel (whatever, but no more than 67 bits in factors, and faster) that runs on small expos (like the one distributed before to bcp, me, and a few others)? If you ever put it on your todo list, don't forget to PM me a link to it.

[edit: and to be on topic, "less classes" versus normal: the bottom line is that, if you are a nitpicker/pettifogger like me, you have to test and compare both versions on your particular system. The "less classes" version would be better for more ranges on a system with a low-end GPU and a top-class CPU (as Oliver said, it is more CPU-intensive for sieving). When I did the 332M-333M ranges to 70 bits for Uncwilly, I tested the two versions against each other. My system is heavily strangled by CPU power: the i7-2600K can't keep up with all the GPUs I have in it (usually two GTX 580s, occasionally a third or a Tesla), and to max out the GPUs when I run mfaktc, I must run ONLY mfaktc; as soon as I start anything else, like P95, aliqueit, etc., the GPU occupancy goes down. So on my system the "normal" version still performs much better for those expo ranges above 65 bits [edit2: and 0.19 with its lower SievePrimes performs even BETTER]. In fact, the "less classes" version is faster under 65 bits, but it makes no sense to use it, as mfaktc will do (for this range) the 0-68 bits ALL IN ONE chunk, then another two chunks for 69 and 70 bits.

tl;dr: if you plan to use mfaktc intensively, do a bit of tuning first. You may be surprised by what your system can do.
end of edit]

Last fiddled with by LaurV on 2012-08-31 at 04:24
Old 2012-08-31, 11:59   #1886
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

23×149 Posts

Quote:
Originally Posted by LaurV
If you ever put it on your todo list, don't forget to PM me a link to it.

Me too! I'm also one of those who like to wade through small factors.
Old 2012-08-31, 23:19   #1887
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

2×11×67 Posts

Quote:
Originally Posted by James Heinrich
small factors.

You mean small exponents, don't you?
I think you and LaurV are referring to the version that allows testing expos <1M, and yes, I would also like to get such a version.
Old 2012-09-01, 00:31   #1888
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

23×149 Posts

Quote:
Originally Posted by lycorn
You mean small exponents, don't you?
I think you and LaurV are referring to the version that allows testing expos <1M, and yes, I would also like to get such a version.

No, (at least for myself) I'm referring to looking for small factors on large exponents. For example, I've cleared most of the 801M range to 64-bit, 65-bit, etc., and am now working up to 70-bit. Or, even more pronounced, pre-factoring in the OBD range (~3322M) to clear out the super-tiny (<60-bit) factors.
Old 2012-09-01, 07:43   #1889
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

3·3,221 Posts

I was indeed talking about small exponents (see my post), the same version of the software that lycorn mentioned. That software was distributed to a "trusted" lot of crunchers (and I am proud that Oliver put me in that lot), and it was used to look for factors of Mersenne numbers with exponents between 2K and 1M, from 60 to 65 bits. I personally took, from 60 to 63 bits, a series of exponents that had not had much ECM done on them. I have the version of that code based on mfaktc 0.18, and I still use it occasionally when some corner of the GPU is free. It is very slow, first of all because 63 bits means a lot for those small expos (the same amount of work as a 70-76 bit assignment for an LL-front exponent). And since we are not targeting bit levels higher than, say, 65, a lot of improvement could be done there too.
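
To put a number on "63 bits means a lot for those small expos": since the count of candidates q = 2kp + 1 in a bit level scales like 2^bits / p, a bit level on a small exponent can be converted into a bit level with roughly the same candidate count on a larger exponent. This is a back-of-the-envelope sketch only. The function name and the sample exponents are hypothetical, and it ignores sieving and the per-candidate cost, so it will not reproduce real run times.

```python
import math


def equivalent_bits(bits: float, p_small: int, p_large: int) -> float:
    """Bit level on p_large with about the same raw candidate count
    as `bits` on p_small (candidates scale like 2**bits / p)."""
    return bits + math.log2(p_large / p_small)


if __name__ == "__main__":
    # Hypothetical values: a 100K exponent versus a 51.2M exponent.
    print(equivalent_bits(63, 100_000, 51_200_000))
```

For these sample values the ratio of exponents is 512 = 2^9, so 63 bits on the small exponent corresponds to about 72 bits on the large one.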

Unfortunately, the biggest problem in that range is not the bit level but the sieving process. You have to be careful how high you sieve the classes, to avoid eliminating factors: never sieve with primes higher than 2*p. In fact the program never sieves with primes higher than p; this could be improved too, by selecting 2*p if p is 3 (mod 4) or 6*p if p is 1 (mod 4). This limit leaves behind a lot of candidates for exponentiation, and if the programmer is not careful, mfaktc can run into memory trouble. Handling those things is difficult and makes the program slower. Oliver did a lot of work to be able to lower the exponents so much (to 2K, instead of 1M as in the default mfaktc). Using this version for "normal work" is much slower; it is meant only for guys who wanna waste their time looking for factors of Mersenne numbers with small prime exponents. As I said already on the forum, this is somehow "wasting time": due to the amount of ECM done on that range, there should be no undiscovered factors left below 2^100 or so. Our fun lies in raising the "how far factored" level and eventually finding an "ECM miss"... well, this has never happened up to now, but it would make a nice headline!
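
The 2*p and 6*p limits follow from two known facts about factors of 2^p - 1 for an odd prime p: they have the form q = 2kp + 1, and they are 1 or 7 (mod 8). A sieve prime below the smallest admissible q can never itself be a candidate, so it cannot knock out a real factor. The sketch below finds that smallest admissible k; the function names are hypothetical and this is not how mfaktc picks its sieve limit.

```python
def smallest_valid_k(p: int) -> int:
    """Smallest k >= 1 such that q = 2*k*p + 1 is 1 or 7 (mod 8)."""
    k = 1
    while (2 * k * p + 1) % 8 not in (1, 7):
        k += 1
    return k


def safe_sieve_limit(p: int) -> int:
    """Every candidate factor is larger than this value (2*p or 6*p)."""
    return 2 * smallest_valid_k(p) * p


if __name__ == "__main__":
    for p in (5, 11, 13, 23):
        print(p, p % 4, smallest_valid_k(p), safe_sieve_limit(p))
```

For p = 11 (3 mod 4) the first candidate is 2*11 + 1 = 23, which does divide 2^11 - 1 = 2047 = 23 * 89. For p = 5 (1 mod 4) the candidates 11 and 21 fail the mod 8 condition, and the first admissible one is 6*5 + 1 = 31 = 2^5 - 1.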

What James is talking about is a different story. He wants to find all small factors very fast, for high exponents. I already sent him a list of all factors from 0 to 37 bits for expos from 0 to 10G (it took me almost one day to upload it to his server!), and I am working on adding more bits to it, but the uploading process is very slow, and we are talking about many gigabytes of data.

BTW, James, if you are only interested in exponents below 2^32 (4G29), then the current version (normal build) of mfaktc 0.19 can do this and is very fast.

Last fiddled with by LaurV on 2012-09-01 at 07:52
Old 2012-09-01, 14:34   #1890
James Heinrich
 
"James Heinrich"
May 2004
ex-Northern Ontario

23·149 Posts

Quote:
Originally Posted by LaurV
BTW, James, if you are only interested in exponents below 2^32 (4G29), then the current version (normal build) of mfaktc 0.19 can do this and is very fast.

Too fast. Even running 6 instances of mfaktc, and taking exponents up to "only" 2^64, I can't get above about 80% GPU usage, and throughput estimates are wildly all over the place (from 10 to 80 GHz-days/day per instance, jumping like mad), a sure sign of inefficiency (lack of buffer or I don't know what, but certainly not optimal).
Old 2012-09-01, 16:14   #1891
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

22677₈ Posts

@Oliver: a small cosmetic issue in the 0.19 less classes version (I only tested win64): it still displays 4620 classes (like 0/4620, 1/4620 .... 419/4620). Better check that compiler option against the hard-coded screen messages. Otherwise, it seems to work wonderfully well.
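
For readers wondering where the class counts come from, here is a small sketch. It assumes (this is not stated in the thread, so treat it as an assumption) that the normal build splits k into 4620 = 4*3*5*7*11 residue classes and the less classes build into 420 = 4*3*5*7, which is consistent with the counter running up to 419. A class is dropped when every q = 2kp + 1 in it is divisible by one of the small primes or is not 1 or 7 (mod 8). The function name and the sample exponent are chosen only for illustration.

```python
def surviving_classes(p: int, small_primes=(3, 5, 7, 11)):
    """Return (number of classes, classes that can still hold a factor)."""
    num_classes = 4
    for s in small_primes:
        num_classes *= s
    survivors = 0
    for c in range(num_classes):
        q = 2 * c * p + 1  # representative; all k = c (mod num_classes) behave alike
        if q % 8 not in (1, 7):
            continue
        if any(q % s == 0 for s in small_primes):
            continue
        survivors += 1
    return num_classes, survivors


if __name__ == "__main__":
    p = 57885161  # a known Mersenne prime exponent, used only as sample input
    print(surviving_classes(p))              # assumed "normal" class count
    print(surviving_classes(p, (3, 5, 7)))   # assumed "less classes" count
```

Under these assumptions 960 of the 4620 classes survive, and 96 of the 420. The surviving fraction is larger with fewer classes, which leaves more work to the CPU sieve.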

Last fiddled with by LaurV on 2012-09-01 at 16:15
Old 2012-09-01, 17:03   #1892
lycorn
 
"GIMFS"
Sep 2002
Oeiras, Portugal

2×11×67 Posts

Quote:
Originally Posted by James Heinrich
No, (at least for myself) I'm referring to looking for small factors on large exponents.

OK, got it.

@LaurV: Would you be so kind as to send me a Win7 64-bit exe, in case you have one? (Provided Oliver doesn't object.) I am using a GTX560Ti (CC 2.1) and CUDA version 4.2. I would like to give it a go from time to time, just for kicks.
If it's OK with you both, I'll PM you an email address.
Thx
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.