#1882 |
|
"Kieren"
Jul 2011
In My Own Galaxy!
2·3·1,693 Posts |
Doesn't shorter run time correspond to higher exponents?
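For a fixed bit range, that matches how trial factoring works: candidate factors of M_p have the form q = 2kp + 1, so a larger exponent p leaves fewer values of k to test between two bit levels. A rough sketch in Python, counting raw candidates before any sieving (the name `candidate_count` and the example exponents are illustrative assumptions, not anything taken from mfaktc):

```python
def candidate_count(p, bit_lo, bit_hi):
    """Count k >= 1 with 2**bit_lo <= 2*k*p + 1 < 2**bit_hi (no sieving applied)."""
    # q >= 2**bit_lo  <=>  k >= ceil((2**bit_lo - 1) / (2*p))
    k_min = max(1, -(-(2**bit_lo - 1) // (2 * p)))
    # q < 2**bit_hi and q odd  <=>  k <= (2**bit_hi - 2) // (2*p)
    k_max = (2**bit_hi - 2) // (2 * p)
    return max(0, k_max - k_min + 1)


if __name__ == "__main__":
    # Hypothetical exponents: one small, one near the LL wavefront of the time.
    small = candidate_count(1_000_003, 69, 70)
    large = candidate_count(50_000_017, 69, 70)
    print(small, large, small / large)
```

The ratio of the two counts is close to the ratio of the exponents, which is why the same bit level takes less time on a higher exponent.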
|
|
|
|
|
|
#1883 |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
|
|
|
|
|
|
#1884 |
|
Jul 2012
Saarland / Germany
6810 Posts |
OK, I've understood.
Thank you, guys! Norman |
|
|
|
|
|
#1885 |
|
Romulan Interpreter
Jun 2011
Thailand
10010110111111₂ Posts |
How about a faster 67.13 bit kernel (whatever, but not more than 67 bits in the factors, and faster), running for small expos (like the one distributed before to bcp, me, and a few others)? If you ever put it on your todo list, don't forget to PM me a link to it.
[edit: and to be on topic, less classes versus normal: the bottom line is that, if you are a nitpicker/pettifogger like me, you have to test and compare both versions on your particular system. The "less classes" version would be better for more ranges on a system with a low-end GPU and a top-class CPU (as Oliver said, it is more CPU-intensive for sieving). When I did the 332M-333M ranges to 70 bits for Uncwilly, I tested the two versions side by side. My system is heavily strangled by CPU power: the i7-2600k can't keep up with all the GPUs I have in it (usually two GTX 580s, occasionally a third or a Tesla), and to max out the GPUs when I run mfaktc, I must run ONLY mfaktc; as soon as I start anything else, like P95, aliqueit, etc., the GPU occupancy goes down. So on my system the "normal" version still performs much better for those expo ranges above 65 bits [edit2: and 0.19 with its lower SievePrimes performs even BETTER]. In fact, the "less classes" version is faster under 65 bits, but it makes no sense to use it, as mfaktc will do (for this range) the 0-68 bits as one ALL-IN-ONE chunk, then another two chunks for 69 and 70 bits. tl;dr: if you plan to use mfaktc intensively, do a bit of tuning first. You may be surprised by what your system can do. end of edit]
Last fiddled with by LaurV on 2012-08-31 at 04:24 |
|
|
|
|
|
#1886 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23×149 Posts |
|
|
|
|
|
|
#1887 |
|
"GIMFS"
Sep 2002
Oeiras, Portugal
2×11×67 Posts |
|
|
|
|
|
|
#1888 | |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23×149 Posts |
Quote:
|
|
|
|
|
|
|
#1889 |
|
Romulan Interpreter
Jun 2011
Thailand
3·3,221 Posts |
I was indeed talking about small exponents (see my post): the same version of the software that lycorn mentioned. That software was distributed to a "trusted" lot of crunchers (and I am proud that Oliver put me in that lot), and it was used to look for factors of Mersenne numbers with exponents between 2K and 1M, from 60 to 65 bits. I personally took, from 60 to 63 bits, a series of exponents which had not had much ECM done on them. I have the version of that code based on mfaktc 0.18, and I still use it occasionally when some corner of the GPU is free. It is very slow, first of all because 63 bits means a lot for those small expos (the same amount of work as a 70-76 bit assignment for an LL-front exponent); and since we are not targeting bit levels higher than, say, 65, a lot of improvement could be made there too.
Unfortunately, the biggest problem of that range is not the bit level but the sieving process. You have to be careful how high you sieve the classes, to avoid eliminating factors: never sieve with primes higher than 2*p. In fact the program never sieves with primes higher than p; this could be improved too, by selecting 2*p if p is 3 (mod 4) or 6*p if p is 1 (mod 4). This leaves behind a lot of candidates for exponentiation, and if the programmer is not careful, mfaktc can run into memory troubles. Handling those things is difficult and makes the program slower. Oliver did a lot of work to be able to lower the exponents so much (to 2K, instead of 1M as in the default mfaktc). Using this version for "normal work" is much slower; it is meant only for guys who wanna waste their time looking for factors of Mersenne numbers with small prime exponents. (As I already said on the forum, this is somehow "wasting time": due to the amount of ECM done on that range, there should be no factors below 2^100 or so remaining undiscovered. Our fun lies in raising the "how far factored" level and eventually finding an "ECM miss"... well, this has never happened up to now, but it would be a nice headline!)
What James is talking about is a different story. He wants to find all small factors very fast, for high exponents. I already sent him a list of all factors from 0 to 37 bits for expos from 0 to 10G (it took me almost one day to upload it to his server!), and I am working on adding more bits to it, but the uploading process is very slow, and we are talking about many gigabytes of data. BTW, James, if you are only interested in exponents below 2^32 (4G29), then the current version (normal build) of mfaktc 0.19 can do this and is very fast.
Last fiddled with by LaurV on 2012-09-01 at 07:52 |
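The 2*p versus 6*p remark can be checked with a little arithmetic. Any factor q of M_p (p an odd prime) has the form q = 2kp + 1 with q congruent to 1 or 7 (mod 8), so the smallest admissible candidate is 2p + 1 when p is 3 (mod 4) and 6p + 1 when p is 1 (mod 4). A minimal Python sketch of that reasoning follows; the function names are hypothetical, and reading the bound as "every sieve prime stays below the smallest candidate" is an interpretation of the post, not a description of mfaktc's actual sieve code:

```python
def smallest_candidate_k(p):
    """Smallest k >= 1 for which q = 2*k*p + 1 is 1 or 7 (mod 8)."""
    # k = 4 always gives q = 8p + 1 = 1 (mod 8), so the search cannot fail.
    return next(k for k in range(1, 5) if (2 * k * p + 1) % 8 in (1, 7))


def safe_sieve_limit(p):
    """Largest sieve prime bound that stays below every admissible candidate q."""
    return 2 * smallest_candidate_k(p) * p


if __name__ == "__main__":
    for p in (11, 13, 23, 29):  # small prime exponents, for illustration only
        k = smallest_candidate_k(p)
        print(p, p % 4, k, 2 * k * p + 1, safe_sieve_limit(p))
```

For p = 11 (3 mod 4) the smallest candidate is 23, which does divide 2^11 - 1 = 2047, so a sieve that used 23 as a sieving prime there would throw away a real factor.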
|
|
|
|
|
#1890 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23·149 Posts |
Too fast. Even running 6 instances of mfaktc, and taking exponents up to "only" 2^64, I can't get above about 80% GPU usage, and throughput estimates are wildly all over the place (from 10 to 80 GHz-days/day per instance, jumping like mad): a sure sign of inefficiency (lack of buffer or I don't know what, but certainly not optimal).
|
|
|
|
|
|
#1891 |
|
Romulan Interpreter
Jun 2011
Thailand
22677₈ Posts |
@Oliver: a small cosmetic issue in the 0.19 less classes version (I only tested win64): it still displays 4620 classes (like 0/4620, 1/4620 ... 419/4620). Better check that compiler option against the hard-coded screen messages.
Otherwise, it seems to work wonderfully well.
Last fiddled with by LaurV on 2012-09-01 at 16:15 |
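For anyone wondering where the 4620 and 420 come from: 4620 = 4·3·5·7·11 and 420 = 4·3·5·7. Candidates q = 2kp + 1 are grouped by k modulo that number, and a whole class can be discarded when every q in it is divisible by one of the small primes or is not 1 or 7 (mod 8). A rough Python sketch of that class filter, assuming the exponent is coprime to 3, 5, 7 and 11 (the function name and the example exponent are illustrative, and this is not mfaktc's actual code):

```python
def surviving_classes(p, num_classes=4620):
    """Return the residues c = k mod num_classes whose candidates are not all excluded."""
    small_primes = [s for s in (3, 5, 7, 11) if num_classes % s == 0]
    keep = []
    for c in range(num_classes):
        q = 2 * c * p + 1  # representative of the class; q mod 8 and q mod s depend only on c
        if q % 8 not in (1, 7):
            continue  # factors of a Mersenne number must be 1 or 7 (mod 8)
        if any(q % s == 0 for s in small_primes):
            continue  # every candidate in this class is divisible by s
        keep.append(c)
    return keep


if __name__ == "__main__":
    p = 216091  # a prime exponent, used here only as an example
    print(len(surviving_classes(p, 4620)), len(surviving_classes(p, 420)))
```

With the larger modulus 960 of the 4620 classes survive, and with the smaller one 96 of 420 do, so a progress display counting up to 419 while still printing "/4620" is consistent with a leftover hard-coded message.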
|
|
|
|
|
#1892 | |
|
"GIMFS"
Sep 2002
Oeiras, Portugal
2×11×67 Posts |
Quote:
@LaurV: Would you be so kind as to send me a Win7 64-bit exe, in case you have one? (Provided Oliver doesn't object to it.) I am using a GTX560Ti (CC 2.1) and CUDA version 4.2. I would like to give it a go from time to time, just for kicks. If it's OK with you both, I'll PM you an email address. Thx |
|
|
|
|