1 Attachment(s)
Windows 32-bit & 64-bit executables:
Tested. Ctrl+C works now (thanks George! BTW, what lines did you change? I tried to find them and couldn't). If anyone needs a different CUDA version, let me know.
[CODE]F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.24 mfaktc_barrett89_F32_63gs][/CODE] |
[QUOTE=flashjh;312401]Ctrl+C works now (Thanks George! BTW-what lines did you change, I tried to find it and couldn't?)[/QUOTE]
The very end of mfaktc.c -- I commented out the calls to destroy streams and sieve_free. |
[QUOTE=Prime95;312396]Version 0.24
Another minor upgrade.

1) Fixed bug in calculating which Fermat number is divisible by a found factor.
2) If GPUSievePrimes is not set in mmff.ini, then mmff chooses a default value based on each entry in worktodo.txt. It may not choose the optimal GPUSievePrimes value, but it should be in the ballpark. Report to me any instances where it chooses a wildly non-optimal setting.
3) The -st and -st2 mfaktc command line arguments (self-test) are now ignored.
4) Some uninitialized mfaktc CPU sieving pointers are no longer freed at exit. Maybe this will solve the crash some were reporting at exit.

Because the mmff version number is written to save files, finish your current range before doing any upgrading, or try the lightly-tested -nocheck command line argument to force using an old save file.

Sources:[/QUOTE]
One thing I've noticed is that the %g variable is returning huge values (at least it does for me on my computer).
e.g. 19071091533368053000000000000000000000.00 |
[QUOTE=mognuts;312423]One thing I've noticed is that the %g variable is returning huge values (at least it does for me on my computer).
e.g. 19071091533368053000000000000000000000.00[/QUOTE] I need a bit more information to investigate this. |
I tried to run the known Fermat factors again, but I got this error on 2 of them:
[CODE]got assignment: k*2^45+1, k range 111310000000000 to 111320000000000 (92-bit factors)
Starting trial factoring of k*2^45+1 in k range: 111310G to 111320G (92-bit factors)
k_min = 111309999999660
k_max = 111320000000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 29

got assignment: k*2^45+1, k range 111310000000000 to 111320000000000 (92-bit factors)
Starting trial factoring of k*2^45+1 in k range: 111310G to 111320G (92-bit factors)
k_min = 111309999999660
k_max = 111320000000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 41

got assignment: k*2^54+1, k range 81900000000000 to 81911000000000 (101-bit factors)
Starting trial factoring of k*2^54+1 in k range: 81900G to 81911G (101-bit factors)
k_min = 81899999998740
k_max = 81911000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 17

got assignment: k*2^54+1, k range 81900000000000 to 81911000000000 (101-bit factors)
Starting trial factoring of k*2^54+1 in k range: 81900G to 81911G (101-bit factors)
k_min = 81899999998740
k_max = 81911000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 13[/CODE]
I wonder if it's my card, since it's not the same prime in the error every time? |
[QUOTE=ATH;312430]I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]
I wouldn't conclude that. Is this the Windows or Linux build? I reran finding the known Fermat factors last night before uploading the source. Did you set GPUSievePrimes in mmff.ini or try the new auto-select feature? |
It's a Windows 64-bit build.
I think I found the problem: it happens when there are 2 or more assignments in worktodo.txt on the same n but using different GPU kernels. For example

FermatFactor=63,88,89
FermatFactor=63,89,90

uses first "mfaktc_barrett89_F32_63gs", then "mfaktc_barrett96_F32_63gs" for the 2nd line. It's not all kernel transitions, but most of them. Here is the list I started, with the transitions, whether or not the problem occurs, and then an example of the 2 lines in worktodo.txt. There are 62 more transitions to test, which I can do if it's needed.

EDIT: This seems to be an issue with auto-selecting GPUSievePrimes, as it disappears when I set it.

[CODE]mfaktc_barrett89_F0_31gs to mfaktc_barrett96_F0_31gs    ERROR: GPU sieve problems
FermatFactor=31,200000000e9,200000001e9
FermatFactor=31,300000000e9,300000001e9

mfaktc_barrett96_F0_31gs to mfaktc_barrett89_F0_31gs    ERROR: GPU sieve problems
FermatFactor=31,300000000e9,300000001e9
FermatFactor=31,200000000e9,200000001e9

mfaktc_barrett89_F32_63gs to mfaktc_barrett96_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,89,90

mfaktc_barrett89_F32_63gs to mfaktc_barrett108_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,96,97

mfaktc_barrett89_F32_63gs to mfaktc_barrett120_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,40000e9,40001e9

mfaktc_barrett89_F32_63gs to mfaktc_barrett128_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,200000000e9,200000001e9

mfaktc_barrett96_F32_63gs to mfaktc_barrett89_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,89,90
FermatFactor=63,88,89

mfaktc_barrett96_F32_63gs to mfaktc_barrett108_F32_63gs    no error
FermatFactor=63,89,90
FermatFactor=63,96,97

mfaktc_barrett96_F32_63gs to mfaktc_barrett120_F32_63gs    no error
FermatFactor=63,89,90
FermatFactor=63,40000e9,40001e9

mfaktc_barrett96_F32_63gs to mfaktc_barrett128_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,89,90
FermatFactor=63,200000000e9,200000001e9

mfaktc_barrett108_F32_63gs to mfaktc_barrett89_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,96,97
FermatFactor=63,88,89

mfaktc_barrett108_F32_63gs to mfaktc_barrett96_F32_63gs    no error
FermatFactor=63,96,97
FermatFactor=63,95,96

mfaktc_barrett108_F32_63gs to mfaktc_barrett120_F32_63gs    no error
FermatFactor=63,96,97
FermatFactor=63,40000e9,40001e9

mfaktc_barrett108_F32_63gs to mfaktc_barrett128_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,96,97
FermatFactor=63,200000000e9,200000001e9

mfaktc_barrett120_F32_63gs to mfaktc_barrett89_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,40000e9,40001e9
FermatFactor=63,88,89

mfaktc_barrett120_F32_63gs to mfaktc_barrett96_F32_63gs    no error
FermatFactor=63,40000e9,40001e9
FermatFactor=63,89,90

mfaktc_barrett120_F32_63gs to mfaktc_barrett108_F32_63gs    no error
FermatFactor=63,40000e9,40001e9
FermatFactor=63,96,97

mfaktc_barrett120_F32_63gs to mfaktc_barrett128_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,40000e9,40001e9
FermatFactor=63,200000000e9,200000001e9

mfaktc_barrett128_F32_63gs to mfaktc_barrett89_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,88,89

mfaktc_barrett128_F32_63gs to mfaktc_barrett96_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,89,90

mfaktc_barrett128_F32_63gs to mfaktc_barrett108_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,96,97

mfaktc_barrett128_F32_63gs to mfaktc_barrett120_F32_63gs    ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,40000e9,40001e9[/CODE] |
While testing I have also run into the error:
ERROR: Exponentiation falure
(yes, there is a typo in 'failure')

I get it with GPUSievePrimes off (auto-selecting) and this worktodo.txt:
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

But I know I got this error before 0.24 and the auto-select feature; it seems very elusive and hard to track down and reproduce. |
[QUOTE=ATH;312430]I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]
Clarification: the GPU sieve does not give reproducible results. For performance reasons, bits are cleared from the sieve without using atomic operations. Thus, there are race conditions where two threads try to clear different bits in the same byte. This causes us to test a few more trial factors than necessary, but that cost is more than offset by the savings from not using atomic operations. |
[QUOTE=ATH;312451]
I think I found the problem, it happens when there is 2 or more assignment in worktodo.txt on the same n but using different GPU kernels. [/QUOTE] I have a fix for this. If you can reproduce the exponentiation failure with the -v 3 command line argument that might be helpful. I could not reproduce the trouble. If I get timely feedback on the %g problem, I'd like to get that fixed in 0.25 too. |
[QUOTE=ATH;312454]While testing I have also run into the error:
ERROR: Exponentiation falure (yes there is a typo in 'failure')

I get it with GPUSievePrimes off (auto-selecting) and this worktodo.txt:
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

But I know I got this error before 0.24 and the auto-select feature, but it seems very elusive and hard to track down and reproduce.[/QUOTE]
I reported a similar problem to George a week or two ago. There wasn't enough information to determine whether it was flaky hardware, the mmff software, or an NVIDIA runtime bug. I can reproduce the above error, which seems to rule out flaky hardware.

I ran each of the above five assignments 11 times, and I got 3 failures of the "FermatFactor=96,128,129" assignment. [I immediately restarted mmff after each failure.]
[CODE]got assignment: k*2^96+1 bit_min=128 bit_max=129
Starting trial factoring k*2^96+1 from 2^128 to 2^129
k_min = 4294964520
k_max = 8589934592
Using GPU kernel "mfaktc_barrett140_F96_127gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
...
<failure location not recorded>
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.010s | n.a. | 93.39M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...[/CODE]Next, I put the failing assignment ("FermatFactor=96,128,129") in my worktodo.txt file 25 times.
[CODE]...
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.008s | n.a. | 116.74M/s | 349749 | n.a.%
ERROR: Exponentiation falure[/CODE]After two failures, I changed the command line to specify a different GPU.
[CODE]...
1575/4620 | 0.93M | 0.024s | n.a. | 38.91M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.013s | n.a. | 71.84M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.018s | n.a. | 51.88M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.013s | n.a. | 71.84M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.016s | n.a. | 58.37M/s | 349749 | n.a.%
ERROR: Exponentiation falure[/CODE]10 failures out of 46 runs on two different GPUs. Perhaps this is sufficiently reproducible to find the problem.

One more failure with -v3:
[CODE]1573/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
Verifying (2^(2^96)) % 340581321636451875144725492967785103361 = 202753569648208169353731391108513369608
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
Verifying (2^(2^96)) % 340282272560196908974548533520923361281 = 213505026821406843026269288964103298839037
ERROR: Exponentiation falure[/CODE]
Note that the expected result is about 3 digits longer than the modulus. |
[QUOTE=rcv;312470]Note that the expected result is about 3 digits longer than the modulus.[/QUOTE]
and the factor is less than 2^128... |
[QUOTE=Prime95;312468]If you can reproduce the exponentiation failure with the -v 3 command line argument that might be helpful. I could not reproduce the trouble.[/QUOTE]
First one with GPUSievePrimes off (auto-select): [URL="http://www.hoegge.dk/mersenne/falure1.txt"]falure1.txt[/URL]
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

Second one with GPUSievePrimes=650000 (optimal), same worktodo.txt: [URL="http://www.hoegge.dk/mersenne/falure2.txt"]falure2.txt[/URL]

Third one with GPUSievePrimes=100000 (too low, optimal ~ 950k): [URL="http://www.hoegge.dk/mersenne/falure3.txt"]falure3.txt[/URL]
FermatFactor=140,171,172
FermatFactor=151,182,183
FermatFactor=153,184,185
FermatFactor=156,187,188 |
[QUOTE=ATH;312430]I tried to run the known Fermat factors again, but I got this error on 2 of them: [...] I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]
I don't know if this is related... I had the same error trying to run mmff (v2.0) on a cc1.3 card. Did you modify your CUDA drivers/settings?

Luigi |
This may have nothing to do with it, but I noticed new nVidia drivers were available recently (I have not upgraded mine yet). Could they be part of the cause?
|
No, this is the new GPUSievePrimes auto-select feature causing it: if I set it manually the bug disappears, and I didn't have it in version 0.23 on the same known Fermat factors.
|
Here is the worktodo.txt to test the 21 known Fermat factors that are within mmff's "search space", along with how the results.txt should look. I recommend setting PrintMode=1 in mmff.ini when you run this to avoid all the spam, and until version 0.25 is out you need to set GPUSievePrimes to something like 200000 to avoid the auto-select feature. This worktodo.txt takes 1 min 20 sec on a GTX 460.
[CODE]worktodo.txt:

FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=74,100,101
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=135,880e8,881e8
FermatFactor=148,173,174
FermatFactor=149,175,176

results.txt:

F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483648 (70-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.24 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.24 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.24 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.24 mfaktc_barrett96_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 16777216 to 33554432 (79-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.24 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.24 mfaktc_barrett108_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.24 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in k range: 67108864 to 134217728 (101-bit factors) [mmff 0.24 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.24 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.24 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.24 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.24 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.24 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.24 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.24 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.24 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.24 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.24 mfaktc_barrett140_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.24 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.24 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in k range: 3334G to 3335G (143-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483648 (142-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.24 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.24 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.24 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in k range: 33554432 to 67108864 (174-bit factors) [mmff 0.24 mfaktc_barrett183_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.24 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in k range: 67108864 to 134217728 (176-bit factors) [mmff 0.24 mfaktc_barrett183_F128_159gs][/CODE] |
[QUOTE=bcp19;312510]This may have nothing to do with it, but I noticed new nVidia drivers were available recently (I have not upgraded mine yet). Could they be part of the cause?[/QUOTE]
306.23 is working fine for me so far. Performance is good. |
[QUOTE=Prime95;312428]I need a bit more information to investigate this.[/QUOTE]
I'm running Windows 7 64-bit with a GTX570 card. I have this in my mmff.ini file:
[CODE]#GPUProgressHeader= class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
#ProgressFormat=%C/4620 | %n | %ts | %e | %rM/s | %s | %W%%

# print everything
GPUProgressHeader=[date time] exponent [TF bits]: percent class #, seq | GHz | time | ETA | #FCs | rate | SieveP. | CPU wait | V5UserID@ComputerID
ProgressFormat=[%d %T] M%M [%l-%u]: %p%% %C/4620,%c/960 | %g | %ts | %e | %n | %rM/s | %s | %W%% | %U@%H[/CODE]
.. and this in my worktodo.ini file:
[CODE]MMFactor=127,4.0e15,4.3e15[/CODE]
The %g variable generates values such as 19071091533368053000000000000000000000.00

It doesn't matter if the %g variable is on its own, i.e. [CODE]ProgressFormat=%g[/CODE] or in a string with other variables such as the default string above.

All versions of mmff (from the first released one to the latest, with CUDA v4 or v5) behave similarly.

mognuts |
[QUOTE=mognuts;312539]
It doesn't matter if the %g variable is on its own i.e. [CODE]ProgressFormat=%g[/CODE] or in a string with other variables such as the default string above.[/QUOTE] Ah, I must admit I'm not familiar with all of mfaktc's output options. You may not like my solution though. Since Primenet does not track these results, there are no GHz-days of credit to be had. Thus, the %g option will now output "n.a.". |
1 Attachment(s)
Here we go again -- v 0.25:
This hopefully fixes many previously reported problems:

1) The problem reported with auto-select GPUSievePrimes is fixed.
2) The "exponentiation failure" bug where a tested k value is too small for the bit level being worked on is fixed. This was done by no longer rounding the minimum k value down to a multiple of the number of classes. As far as I can tell, this rounding down was only necessary for the self-test code. A side effect of this change is that you cannot use v0.24 save files. Finish your current work before upgrading to v0.25.
3) Exponentiation failures from testing k values that are too large should be fixed. I've also decreased the minimum acceptable k values for Fermat testing. Although inefficient, it will allow us to have more known Fermat factors retested.
4) The funny %g outputs are gone.
5) Lots of bloat from mfaktc has been removed.

The good news is none of these changes have been due to fundamental problems in the kernels doing the real work.

My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases. |
[QUOTE=Prime95;312590]Here we go again -- v 0.25:
[/QUOTE]
I tried to compile it under Win 7 64-bit with Visual Studio 2010. I only succeeded after changing the following line in mfaktc.c
[CODE]extern int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);[/CODE]to
[CODE]#ifdef _MSC_VER
extern "C" int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);
#else
extern int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);
#endif[/CODE] |
binaries? flashjh? :smile:
|
One tiny (and unimportant) issue:
Program output:
[CODE]WARNING: Read GPUSieveSize=1 from mmff.ini, using min value (4)[/CODE]
From mmff.ini:
[CODE]# GPUSieveSize defines how big a GPU sieve to use (in Mbits). Bigger sieves are a little
# more efficient, but may produce laggy video response.
#
# [COLOR=Red][B]Minimum: GPUSieveSize=1[/B][/COLOR][/CODE]
Increasing GPUSieveSize to the max has no noticeable effect on the screen lag*. The raw rate increases (on a GTX 470 at 607 MHz):
[CODE]got assignment: k*2^28+1, k range 1000000000000000 to 1100000000000000 (78-bit factors)
Starting trial factoring of k*2^28+1 in k range: 1000T to 1100T (78-bit factors)
k_min = 1000000000000000
k_max = 1100000000000000
Using GPU kernel "mfaktc_barrett89_F0_31gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
21/4620 | 21.65G | 13.875s | 3h40m | 1560.00M/s | 69941[/CODE]
[CODE]got assignment: k*2^28+1, k range 1000000000000000 to 1100000000000000 (78-bit factors)
Starting trial factoring of k*2^28+1 in k range: 1000T to 1100T (78-bit factors)
k_min = 1000000000000000
k_max = 1100000000000000
Using GPU kernel "mfaktc_barrett89_F0_31gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
21/4620 | 21.65G | 13.009s | 3h27m | 1663.85M/s | 69941[/CODE]
On a GTX 460 (725 MHz, factory overclocked) the raw rate increases from 1000 M/s to ca. 1045 M/s.

*Debian 6.0 squeeze, GNOME 2 desktop / CUDA SDK 4.1 for Ubuntu 11.04 / libcudart using the libstdc++ from a self-compiled gcc 4.5.x

Compilation with the CUDA SDK 4.0 for Ubuntu 10.10 fails with an internal error in nvopencc. |
[QUOTE=Prime95;312590]Here we go again -- v 0.25:[/QUOTE]
Thank you for all the time you spend on GIMPS/primenet and now also this program. [QUOTE=Prime95;312590]My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases.[/QUOTE] Sorry I keep giving you more work :) |
1 Attachment(s)
[QUOTE=LaurV;312638]binaries? flashjh? :smile:[/QUOTE]
Windows 32-bit & 64-bit executables: |
[QUOTE=Prime95;312590]Here we go again -- v 0.25:
[...] My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases.[/QUOTE]
Are you going to release a Linux 64-bit binary of v 0.25?

Luigi |
Haven't found any issues with version 0.25; all my test cases work now :) Very nice.
It can now find 31 of the known Fermat factors (remember PrintMode=1 in mmff.ini to avoid spam):
[CODE]worktodo.txt:

FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,66,67
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=61,67,68
FermatFactor=74,100,101
FermatFactor=77,98,99
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=93,103,104
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=124,146,147
FermatFactor=127,129,130
FermatFactor=135,880e8,881e8
FermatFactor=145,167,168
FermatFactor=148,173,174
FermatFactor=149,160,161
FermatFactor=149,175,176
FermatFactor=154,166,167
FermatFactor=157,167,168

results.txt:

F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483647 (70-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F52 has a factor: 74201307460556292097 [TF:66:67:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 4096 to 8191 (67-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 16777216 to 33554431 (79-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.25 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs]
F58 has a factor: 219055085875300925441 [TF:67:68:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^61+1 in k range: 64 to 127 (68-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in k range: 67108864 to 134217727 (101-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F75 has a factor: 520961043404985083798310879233 [TF:98:99:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^77+1 in k range: 2097152 to 4194303 (99-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.25 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.25 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F91 has a factor: 14072902366596202965053244178433 [TF:103:104:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^93+1 in k range: 1024 to 2047 (104-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.25 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in k range: 3334G to 3335G (143-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483647 (142-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F122 has a factor: 111331351706159727817280425663664652445286401 [TF:146:147:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^124+1 in k range: 4194304 to 8388607 (147-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F125 has a factor: 850705917302346158658436518579420528641 [TF:129:130:mmff 0.25 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^127+1 in k range: 4 to 7 (130-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F142 has a factor: 363618066009591119386121910507749518730588867002369 [TF:167:168:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^145+1 in k range: 4194304 to 8388607 (168-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in k range: 33554432 to 67108863 (174-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs]
F147 has a factor: 2230074519853062314153571827264836150598041600001 [TF:160:161:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^149+1 in k range: 2048 to 4095 (161-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in k range: 67108864 to 134217727 (176-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs]
F150 has a factor: 124204803210043452689216278205372864748572142206977 [TF:166:167:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^154+1 in k range: 4096 to 8191 (167-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F150 has a factor: 287733134849521512021350451441018219494761719398401 [TF:167:168:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^157+1 in k range: 1024 to 2047 (168-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs][/CODE] |
1 Attachment(s)
Linux 64-bit build
|
[code]
no factor for F46 from 2^96 to 2^97 (k range: 300000000000000 to 325000000000000) [mmff 0.20 mfaktc_barrett108_F30_61gs] no factor for k*2^48+1 in k range: 325T to 350T (97-bit factors) [mmff 0.24 mfaktc_barrett108_F32_63gs] no factor for k*2^48+1 in k range: 350T to 375T (97-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs] [/code] |
The mmff-GFN patch
1 Attachment(s)
Because I can only dream of reaching George's level of generosity, the least I can do is spread the wealth.
Here's the patch that will turn mmff-0.25 into one of the five binaries for the Generalized Fermat factor search (bases 3,5,6,10,12 instead of 2). Read the included README; I will not repeat it here. It goes without saying that (if mmff is beta) this patch is alpha. Run the included tests! Pick the k range and the base and good luck! The reservations may be quite tricky (by email) but they are roughly summarized [URL="http://www1.uni-hamburg.de/RRZ/W.Keller/GFNsrch.txt"]here[/URL]; you may want to go above 10e12 (I've covered, or am in the process of covering, the gap below 10e12). The next stop will be xGFN (a[SUP]2[SUP]m[/SUP][/SUP] + b[SUP]2[SUP]m[/SUP][/SUP]). This search will be naturally slower, but could be fun, too! -S [B]EDIT: Caveat Emptor![/B] The mmff-gfn-0.25 binaries are [B]not [/B]capable of normal FermatFactoring. Keep the normal binary separately, and keep the five GFN binaries separately - all in different folders. |
1 Attachment(s)
P.S. The patch is a cleaner solution and its use is preferred (e.g. it can be applied to mmff-0.24, or maybe to mmff-0.26 later, with minimal hassle), but for the benefit of those who would prefer the source (and for Windows builders), here's the patched source as well.
[B]EDIT: Caveat Emptor![/B] The mmff-gfn-0.25 binaries are [B]not [/B]capable of normal FermatFactoring. Keep the normal binary separately, and keep the five GFN binaries separately - all in different folders. The base-specific initializations are embedded into the kernels, among other reasons for speed. There are no switches. A future (xGFN) binary may be capable of any and all functions; if I do not think of a more elegant solution, there will be a sort of "fat binary" packing a dozen specialized kernels per current kernel. |
[QUOTE=Batalov;312724](bases 3,5,6,10,12 instead of 2).[/QUOTE]
Any reason these bases were chosen, other than to check the same bases as Proth.exe? |
I thought that too. It is easy to extend technically, but how would we tell new factors from old? And who would volunteer to keep the factors and limits? (FactorDB has a limit on size; no pasaran).
It is probably because of the [URL="http://www.ams.org/journals/mcom/1998-67-221/S0025-5718-98-00891-6/S0025-5718-98-00891-6.pdf"]Bjorn/Riesel[/URL] legacy (and earlier Riesel references [3,4] therein). |
BR98 also includes bases 7, 8, and 11. If the idea is to only eliminate potential primes then these should be excluded, but so should 3 and 5. Perhaps the idea was to include the smallest bases, 3 & 5, plus bases that can include primes up to 12. Just a guess. :piggie:
|
Anyhow, W.Keller is keen on dealing with reservations on the 42[COLOR=red]*[/COLOR] (a,b) pairs and the five a's altogether. So, while I would like to reserve 25<=m<=150, k<=1e13, I can only do that with xGFNs included. (That means reimplementing "pfgw -gxo" in mmff-xgfn.)
So, I plan to make mmff-xgfn. It would make sense: with five exponentiations (2,3,5,7,11), we will learn all GFN (including bases 7 and 11!) and xGFN factors in one sweep (some modular linear combinations need to be done; that's all). GF'(3) and GF'(5) can be prime (after the /2)! [I]F'[/I][SUB]6[/SUB](3) = 1716841910146256242328924544641 is prime, hey, larger than F(4)! I also silently hope that every ~2000th found Proth prime at PrimeGrid could be a ticket, too. 2 found, ~1998 to go. ;-) _______ [COLOR=red]*[/COLOR][SIZE=1]gotta love the number![/SIZE] |
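Serge's quoted value is easy to double-check in any bignum-capable language. Here is a minimal Python sketch (a plain Miller-Rabin test with a few fixed bases, so it only certifies "probable prime"; this is an external check, not how mmff or pfgw test):

```python
# Verify F'_6(3) = (3^(2^6) + 1) / 2 and check it is a (probable) prime.
# Plain Python integers handle the bignum arithmetic.

def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17)):
    """Miller-Rabin test with a few fixed bases (probabilistic)."""
    if n < 2:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        if a % n == 0:
            continue
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

f6_3 = (3 ** (2 ** 6) + 1) // 2
print(f6_3)                     # 1716841910146256242328924544641
print(f6_3 > 2 ** 16 + 1)       # larger than F(4) = 65537: True
print(is_probable_prime(f6_3))  # True
```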
hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt is this a bug? an example: no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs] |
[QUOTE=lalera;313180]hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt is this a bug? an example: no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs][/QUOTE] looks like a bug to me. Can you post the worktodo.txt file that generated that output? |
[QUOTE=Prime95;313187]looks like a bug to me. Can you post the worktodo.txt file that generated that output?[/QUOTE]
FermatFactor=47,560e12,1000e12 |
[QUOTE=Prime95;313187]looks like a bug to me. Can you post the worktodo.txt file that generated that output?[/QUOTE]
i did now the following test: [CODE]worktodo: FermatFactor=36,25709e6,25710e6 FermatFactor=33,5460e9,5470e9 FermatFactor=39,69,70 FermatFactor=45,11131e10,11132e10 FermatFactor=45,212e9,213e9 FermatFactor=50,2139e9,2140e9 FermatFactor=54,78,79 FermatFactor=54,81900e9,81911e9 FermatFactor=74,100,101 FermatFactor=79,5e9,6e9 FermatFactor=87,1595e9,1596e9 FermatFactor=88,20018e9,20019e9 FermatFactor=90,119e9,120e9 FermatFactor=92,198e9,199e9 FermatFactor=97,482e9,483e9 FermatFactor=101,3334e9,3335e9 FermatFactor=111,141,142 FermatFactor=120,3e9,4e9 FermatFactor=135,880e8,881e8 FermatFactor=148,173,174 FermatFactor=149,175,176 results: F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.25 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^36+1 in (71-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs] F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.25 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^33+1 in (76-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs] F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.25 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^39+1 in (70-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs] F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.25 mfaktc_barrett96_F32_63gs] found 1 factor for k*2^45+1 in (92-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs] F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.25 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^45+1 in (83-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs] F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.25 mfaktc_barrett96_F32_63gs] found 1 factor for k*2^50+1 in (91-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs] F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.25 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^54+1 in (79-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs] F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.25 mfaktc_barrett108_F32_63gs] found 1 factor for 
k*2^54+1 in (101-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs] F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.25 mfaktc_barrett108_F64_95gs] found 1 factor for k*2^74+1 in (101-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs] F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.25 mfaktc_barrett120_F64_95gs] found 1 factor for k*2^79+1 in (112-bit factors) [mmff 0.25 mfaktc_barrett120_F64_95gs] F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.25 mfaktc_barrett128_F64_95gs] found 1 factor for k*2^87+1 in (128-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs] F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.25 mfaktc_barrett140_F64_95gs] found 1 factor for k*2^88+1 in (133-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs] F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.25 mfaktc_barrett128_F64_95gs] found 1 factor for k*2^90+1 in (127-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs] F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.25 mfaktc_barrett140_F64_95gs] found 1 factor for k*2^92+1 in (130-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs] F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.25 mfaktc_barrett140_F96_127gs] found 1 factor for k*2^97+1 in (136-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs] F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.25 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^101+1 in (143-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs] F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.25 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^111+1 in (142-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs] F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.25 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^120+1 in (152-bit 
factors) [mmff 0.25 mfaktc_barrett152_F96_127gs] F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.25 mfaktc_barrett172_F128_159gs] found 1 factor for k*2^135+1 in (172-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs] F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.25 mfaktc_barrett183_F128_159gs] found 1 factor for k*2^148+1 in (174-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs] F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.25 mfaktc_barrett183_F128_159gs] found 1 factor for k*2^149+1 in (176-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs][/CODE] |
1 Attachment(s)
Minor update -- v 0.26:
What's new: 1) MM31 is now supported. 2) Lalera's "missing k range" output bug is probably fixed. |
1 Attachment(s)
Linux 64-bit executable:
|
hi,
mmff v0.26 for linux x64 works fine i did the following test: [CODE]worktodo: FermatFactor=36,25709e6,25710e6 FermatFactor=33,5460e9,5470e9 FermatFactor=39,69,70 FermatFactor=45,11131e10,11132e10 FermatFactor=45,212e9,213e9 FermatFactor=50,2139e9,2140e9 FermatFactor=54,78,79 FermatFactor=54,81900e9,81911e9 FermatFactor=74,100,101 FermatFactor=79,5e9,6e9 FermatFactor=87,1595e9,1596e9 FermatFactor=88,20018e9,20019e9 FermatFactor=90,119e9,120e9 FermatFactor=92,198e9,199e9 FermatFactor=97,482e9,483e9 FermatFactor=101,3334e9,3335e9 FermatFactor=111,141,142 FermatFactor=120,3e9,4e9 FermatFactor=135,880e8,881e8 FermatFactor=148,173,174 FermatFactor=149,175,176 results: F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483647 (70-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.26 mfaktc_barrett96_F32_63gs] found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.26 mfaktc_barrett96_F32_63gs] F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.26 mfaktc_barrett96_F32_63gs] found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.26 mfaktc_barrett96_F32_63gs] F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for 
k*2^54+1 in k range: 16777216 to 33554431 (79-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.26 mfaktc_barrett108_F32_63gs] found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs] F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.26 mfaktc_barrett108_F64_95gs] found 1 factor for k*2^74+1 in k range: 67108864 to 134217727 (101-bit factors) [mmff 0.26 mfaktc_barrett108_F64_95gs] F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.26 mfaktc_barrett120_F64_95gs] found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.26 mfaktc_barrett120_F64_95gs] F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.26 mfaktc_barrett128_F64_95gs] found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.26 mfaktc_barrett128_F64_95gs] F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.26 mfaktc_barrett140_F64_95gs] found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.26 mfaktc_barrett140_F64_95gs] F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.26 mfaktc_barrett128_F64_95gs] found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.26 mfaktc_barrett128_F64_95gs] F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.26 mfaktc_barrett140_F64_95gs] found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.26 mfaktc_barrett140_F64_95gs] F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.26 mfaktc_barrett140_F96_127gs] found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.26 mfaktc_barrett140_F96_127gs] F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.26 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^101+1 in k range: 3334G 
to 3335G (143-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs] F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.26 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483647 (142-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs] F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.26 mfaktc_barrett152_F96_127gs] found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs] F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.26 mfaktc_barrett172_F128_159gs] found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.26 mfaktc_barrett172_F128_159gs] F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.26 mfaktc_barrett183_F128_159gs] found 1 factor for k*2^148+1 in k range: 33554432 to 67108863 (174-bit factors) [mmff 0.26 mfaktc_barrett183_F128_159gs] F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.26 mfaktc_barrett183_F128_159gs] found 1 factor for k*2^149+1 in k range: 67108864 to 134217727 (176-bit factors) [mmff 0.26 mfaktc_barrett183_F128_159gs][/CODE] |
[QUOTE=lalera;313180]hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt is this a bug? an example: no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs][/QUOTE] I had the same thing in mmff 0.23. See [url="http://mersenneforum.org/showpost.php?p=313086&postcount=38"]this post[/url]. Sorry, but I'm not going to re-run my eight-day test. :-) |
Looking for a small FermatFactor range... ideally 8-12 hours long on a 560
edit: chose it [code] FermatFactor=60,200e12,210e12 FermatFactor=60,210e12,220e12 FermatFactor=60,220e12,230e12 FermatFactor=60,230e12,240e12 FermatFactor=60,240e12,250e12 FermatFactor=60,250e12,260e12 FermatFactor=60,260e12,270e12 FermatFactor=60,270e12,280e12 FermatFactor=60,280e12,290e12 FermatFactor=60,290e12,300e12 [/code] -> FermatFactor=60,200e12,300e12 |
1 Attachment(s)
v.26 Windows 32-bit & 64-bit executables:
|
Upgrading Cuda 4.2 SDK
After upgrading Cuda 4.2 SDK 64-bit to the latest version my mmff0.26 runs more than twice as fast.
|
[QUOTE=aketilander;314027]After upgrading Cuda 4.2 SDK 64-bit to the latest version my mmff0.26 runs more than twice as fast.[/QUOTE]
Did you upgrade the video driver as well? That's what would impact performance rather than the SDK, which is basically a bunch of demo programs. Whoops! The SDK might have some needed DLLs. |
[QUOTE=tServo;314038]Did you upgrade the video driver as well? That's what would impact performance rather than the SDK, which is basically a bunch of demo programs.
Whoops! The SDK might have some needed DLLs.[/QUOTE] Sorry, maybe I was too quick to spread the happy news. mmff0.26 runs for a while and then I get the error message: "ERROR: cudaGetLastError() returned 30: unknown error". When I restart the program everything runs just fine, BUT the speed is about half of what it was from the beginning. This is not a new error; I had it before the upgrade as well. I have a second computer with the same video card, and the program runs at the higher speed without problems. So, does anybody know what is wrong? |
A minor issue, but with gcc 4.6.3, I have to move $(LDFLAGS) to the end in the final link command in the Makefile:
[CODE]../mmff.exe : $(COBJS) $(CUOBJS) $(LD) $^ -o $@ $(LDFLAGS) [/CODE] |
[QUOTE=aketilander;314082]mmff0.26 runs for a while and then I get the error-message: "ERROR: cudaGetLastError() returned 30: unknown error" When I restart the program everything runs just fine, BUT the speed is about half of what it was from the beginning.[/QUOTE]
I took down the core clock speed of the video card a little bit and the problem seems to have disappeared. The program seems to be stable now. If the problem reappears I will of course report it here. |
1 Attachment(s)
The latest mmff-gfn-0.26.zip is attached. (Contains some latest minor changes and the 0.26 backbone.)
Jerry (flashjh), could you please build the Win32+Win64 binaries for each BASE (edit tf_gfn.h and delete all objects between each build) and email them to Xyzzy for organized storage? Thank you in advance. |
[QUOTE=Batalov;314366]The latest mmff-gfn-0.26.zip is attached. (Contains some latest minor changes and the 0.26 backbone.)
Jerry (flashjh), could you please build the Win32+Win64 binaries for each BASE (edit tf_gfn.h and delete all objects between each build) and email them to Xyzzy for organized storage? Thank you in advance.[/QUOTE] Thank you Serge. Is a version for xGF, with every search bundled in, on course? Luigi |
[QUOTE=ET_;314379]Thank you Serge.
Is a version for xGF, with every search bundled in, on course? Luigi[/QUOTE] No, I haven't started on xGFN yet; I also have some doubts - we can run out of registers. Over the weekend, I will possibly try to build a toy one - using only 2[SUP]2[SUP]m[/SUP][/SUP] and 3[SUP]2[SUP]m[/SUP][/SUP] (which after linear combinations makes for F, GF(3), GF(6), GF(12), and xGF(2,3), xGF(2,9), xGF(3,4), xGF(3,8)). Ughhh, it is going to be ugly. And slow, too! |
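The "linear combinations" idea above can be made concrete: once the two residues 2[SUP]2[SUP]m[/SUP][/SUP] and 3[SUP]2[SUP]m[/SUP][/SUP] mod q are in hand, every GF/xGF test in the list reduces to a single product or sum mod q. A minimal Python sketch (the modulus q and exponent m below are arbitrary choices for illustration, not values from any real run):

```python
# With a = 2^(2^m) mod q and b = 3^(2^m) mod q, the derived bases and the
# GF/xGF divisibility tests come from products and sums mod q.

m = 20
q = 2 ** 61 - 1          # arbitrary odd modulus for the demonstration

a = pow(2, 2 ** m, q)    # 2^(2^m) mod q
b = pow(3, 2 ** m, q)    # 3^(2^m) mod q

# Derived bases are free:
assert pow(4,  2 ** m, q) == a * a % q      # 4  = 2^2
assert pow(6,  2 ** m, q) == a * b % q      # 6  = 2*3
assert pow(9,  2 ** m, q) == b * b % q      # 9  = 3^2
assert pow(12, 2 ** m, q) == a * a * b % q  # 12 = 4*3

# Divisibility tests are then single comparisons:
divides_F_m     = (a == q - 1)              # q | F(m)       iff a = -1
divides_GF_3    = (b == q - 1)              # q | GF(m,3)    iff b = -1
divides_xGF_2_3 = ((a + b) % q == 0)        # q | xGF(m,2,3) iff a + b = 0
divides_xGF_3_4 = ((b + a * a) % q == 0)    # q | xGF(m,3,4) iff b + a^2 = 0
```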
Win32/64 binaries uploaded and email sent.
Jerry |
Upper limit of mmff 0.26 TF MM127
I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is, k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.
Just out of curiosity it would be interesting to know if this is due to limitations of my video card or if it is a limitation of the program as such? |
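For reference, the quoted ceiling of k=1,152,921e12 at 188 bits follows directly from the factor form q = 2*k*M127 + 1 (assuming, as is usual for double-Mersenne trial factoring, that mmff tests factors of this form). A quick Python sketch of the bound:

```python
# Factors of MM127 = 2^M127 - 1 (where M127 = 2^127 - 1) have the form
# q = 2*k*M127 + 1.  The largest k whose q still fits below a given bit
# level is simple integer arithmetic; 188 bits lands at exactly 2^60.

M127 = 2 ** 127 - 1

def k_max(bits):
    """Largest k such that 2*k*M127 + 1 < 2^bits."""
    return ((1 << bits) - 2) // (2 * M127)

print(k_max(188))              # 1152921504606846976 = 2^60, i.e. ~1,152,921e12
print(k_max(188) // 10 ** 12)  # 1152921
```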
[QUOTE=aketilander;315304]I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is, k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.
Just out of curiosity it would be interesting to know if this is due to limitations of my video card or if it is a limitation of the program as such?[/QUOTE] Sounds like a program bug where I've miscalculated the upper limit of what the kernel can handle. At least the automatic QA caught the problem. |
[QUOTE=aketilander;315304]I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is, k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.
Just out of curiosity it would be interesting to know if this is due to limitations of my video card or if it is a limitation of the program as such?[/QUOTE] I tested with ranges of 1e10 and it seems the higher the k the higher risk of the error happening but sometimes you don't get the error. The lowest I have got it at was k=280,000e12. Here is 3 instances of the error with the -v 3 option: [URL="http://www.hoegge.dk/mersenne/MM127error.txt"]MM127error.txt[/URL] |
I tried running a small range for the first time with mmff. So far I have only been troubleshooting because I think the fan on my GTX 460 might break soon and because I can't really afford a higher power bill than I have currently (I have complained a lot about Danish power prices in other threads, so I'll spare the details here).
But to my surprise my small range finished 5x quicker than I expected, so it turns out I apparently never understood the "raw rate" output of mmff. My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s? I noticed each output line says "candidates 16.04G" and took roughly 43-46s, so that's where the "raw rate" of 347M/s - 368M/s comes from, but what is that measuring exactly? Is "candidates" the number of probable primes to test divisibility with? [CODE]Starting trial factoring of k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors) k_min = 1125899906842624 k_max = 1200000000000000 Using GPU kernel "mfaktc_barrett108_F32_63gs" class | candidates | time | ETA | raw rate | SievePrimes | CPU wait 3/4620 | 16.04G | 43.589s | 11h36m | 367.96M/s | 210485 . . 4613/4620 | 16.04G | 46.216s | 0m00s | 347.04M/s | 210485 no factor for k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs] tf(): total time spent: 12h 4m 0.423s[/CODE] |
[QUOTE=ATH;315507]
My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s ? So I noticed each output line says "candidates 16.04G" and took rougly 43-46s so thats where the "raw rate" of 347M/s - 368M/s comes from, but what is that measuring exactly? Candidates is number of probable primes to test divisibility with? [CODE]Starting trial factoring of k*2^46+1 in k range: 1125899906842624 to 12000000000 00000 (97-bit factors) k_min = 1125899906842624 k_max = 1200000000000000 Using GPU kernel "mfaktc_barrett108_F32_63gs" class | candidates | time | ETA | raw rate | SievePrimes | CPU wait 3/4620 | 16.04G | 43.589s | 11h36m | 367.96M/s | 210485 . . 4613/4620 | 16.04G | 46.216s | 0m00s | 347.04M/s | 210485 no factor for k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs] tf(): total time spent: 12h 4m 0.423s[/CODE][/QUOTE] The sieving eliminates k-candidates whose factor q is divisible by small primes. The rest are then trial-divided into the various Fermat numbers; judging by your k-count ETA, it seems that only a 5th of the potential q's escaped the small-prime-sieve whole. The raw rate, then, is how fast the card is trial dividing the remaining q. (If you used the [URL="http://www.mersenneforum.org/showpost.php?p=304626&postcount=1809"]compile time option[/URL] to [URL="http://www.mersenneforum.org/showpost.php?p=307238&postcount=1845"]disable the sieve[/URL], then the card would trial-divide every candidate, and the run would probably be about as long as you originally estimated.) |
[QUOTE=ATH;315328]I tested with ranges of 1e10 and it seems the higher the k the higher risk of the error happening but sometimes you don't get the error. The lowest I have got it at was k=280,000e12.
Here is 3 instances of the error with the -v 3 option: [URL="http://www.hoegge.dk/mersenne/MM127error.txt"]MM127error.txt[/URL][/QUOTE] That was an easy fix. Note that in your output mmff is testing 187-bit factors with the barrett185 kernel. I fixed the typo so that mmff uses the 188-bit kernel, fixed a typo in the never-before-tested barrett188 kernel, and it's good to go. I'll release v. 0.27 after looking at Batalov's work. In the meantime, do not look for factors of MM127 that are more than 185 bits. |
[QUOTE=ATH;315507]
My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s ?[/QUOTE] Raw rate refers to number of k's that passed the "classes test". Note that you only test a small number of the 4620 classes. Even classes and clases where factors are divisible by 3,5,7,11 are eliminated before doing any GPU sieving. Note that in mfaktc the column is called avg. rate and refers to the number of k's after the classes test AND after the CPU sieving. |
[QUOTE=Prime95;315528]That was an easy fix.[/QUOTE]
Excellent! Thank you for taking your time doing it! |
Factor
Hello everybody,
i have this result with MMFF: F39 has a factor: 304649306542939328584089601 [TF:87:88:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^44+1 in k range: 10000000000000 to 17592186044415 (88-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] That is 17 317 308 137 475*2^44+1 but Fermat.exe finds no factor. Any idea? |
[QUOTE=f11ksx;315628]Hello everybody,
i have this result with MMFF: F39 has a factor: 304649306542939328584089601 [TF:87:88:mmff 0.26 mfaktc_barrett89_F32_63gs] found 1 factor for k*2^44+1 in k range: 10000000000000 to 17592186044415 (88-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs] That is 17 317 308 137 475*2^44+1 but Fermat.exe finds no factor. Any idea?[/QUOTE] It is a composite factor. 304649306542939328584089601 = (3*2^41+1)*(21*2^41+1). 3*2^41+1 divides F38. 21*2^41+1 divides F39. Hmmm... How come the composite factor divides F39? |
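A few lines of Python confirm axn's factorization and show what is going on: each prime factor reaches -1 at a different step of the repeated-squaring chain, and their product never reaches -1 at all (this is only an external check, not mmff's actual code path):

```python
# Verify the factorization and find, for each factor p, the index m with
# 2^(2^m) = -1 (mod p), i.e. the Fermat number F(m) that p divides.

q  = 304649306542939328584089601   # the reported "factor of F39"
p1 = 3 * 2 ** 41 + 1               # the known factor of F38
p2 = 21 * 2 ** 41 + 1

def fermat_index(p, limit=60):
    """Return the smallest m < limit with 2^(2^m) = -1 (mod p), else None."""
    x = 2 % p
    for m in range(limit):
        if x == p - 1:
            return m
        x = x * x % p
    return None

print(q == p1 * p2)      # True
print(fermat_index(p1))  # 38
print(fermat_index(p2))  # 39
print(fermat_index(q))   # None: the product divides no single F(m)
```

Once p1's residue hits -1 at m = 38 it is 1 from m = 39 on, so modulo the product the chain can never reach -1; a test that also exits on residue 1 is what lets such a composite slip through.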
Nah, it's ok; I've seen this with mmff-gfn.
The exit criterion from factoring (repeated squaring) is either a residue of -1 or 1 (the latter probably in case a -1 went unnoticed). And once the residue is 1, it obviously stays 1 forever. A composite factor (a product of two prime factors that divide two different Fm values) can get through like that. Because of the implementation details, pfgw -gxo will get equally confused (try it!), but it will produce a less misleading answer. Actually, pfgw is not fooled by this number in -go mode, only in -gxo: [CODE]> pfgw -f -gxo -q"17317308137475*2^44+1" PFGW Version 3.4.6.64BIT.20110307.x86_Dev [GWNUM 26.5] A GF Factor was found, but the base of 12 may not be correct. 17317308137475*2^44+1 is a Factor of xGF(12,3,2)!!!! (0.000000 seconds) A GF Factor was found, but the base of 12 may not be correct. 17317308137475*2^44+1 is a Factor of xGF(12,4,3)!!!! (0.000000 seconds) A GF Factor was found, but the base of 12 may not be correct. 17317308137475*2^44+1 is a Factor of xGF(12,8,3)!!!! (0.000000 seconds) A GF Factor was found, but the base of 12 may not be correct. 17317308137475*2^44+1 is a Factor of xGF(12,9,2)!!!! (0.000000 seconds) A GF Factor was found, but the base of 12 may not be correct. 17317308137475*2^44+1 is a Factor of xGF(12,9,8)!!!! (0.000000 seconds) GFN testing completed [/CODE] |
[QUOTE=axn;315630]It is a composite factor. 304649306542939328584089601 = (3*2^41+1)*(21*2^41+1)
3*2^41+1 divides F38. 21*2^41+1 divides F39. Hmmm... How come the composite factor divides F39?[/QUOTE] May I suggest inserting a test into mmff 0.27 to check whether a reported factor is composite? Luigi |
[QUOTE=Batalov;315632]Nah, it's ok; I've seen this with mmff-gfn.
The exit criterion from factoring (repeated squaring) is either a residue of -1 or 1 (the latter probably in case a -1 went unnoticed). [/QUOTE] Gotcha. |
[QUOTE=ET_;315648]May I suggest the insertion of a test to check if the factor is composite on mmff 0.27?
Luigi[/QUOTE] Checking for compositeness only (without factoring) is probably fairly cheap to add even using dumnums. Factoring (when the binary representation of k appears very sparse) is another possibility: (s*2[SUP]m[/SUP]+1)*(t*2[SUP]m[/SUP]+1) = ([COLOR=blue]s*t*2[SUP]m[/SUP] + (s+t)[/COLOR])*2[SUP]m[/SUP]+1 = k*2[SUP]m[/SUP]+1 would be pretty easy to spot (with s and t small, s+t << 2[SUP]m[/SUP], though not necessarily both odd) Maybe we could add this to mmff 0.28? It is easy to do externally for now. P.S. Cheap demonstration for this particular k: > dc 2o 17317308137475p [COLOR=blue]111111[COLOR=black]000000000000000000000000000000000000[/COLOR]11[/COLOR] |
No Checkpoint/Restart ??
I've had to restart my mmff-0.26 run on MMFactor=127 but it doesn't appear to look for a .ckp file even though I have checkpoint turned on in the .ini file.
Is anyone else having this problem? |
On Linux
[QUOTE=RichD;315694]I've had to restart my mmff-0.26 run on MMFactor=127 but it doesn't appear to look for a .ckp file even though I have checkpoint turned on in the .ini file.[/QUOTE]
Luckily I spotted this as soon as it started and quickly renamed the .ckp file before quitting. Looking back through this thread I see I can use the -nocheck switch. It seems to be continuing OK. I first thought I may have lost 5-6 days... |
Checksum or residue?
The TFs covered by mmff are so large that it will never be possible to do an LL test to check for compositeness. Therefore I would guess that sooner or later someone will want to do double checks, at least where the double Mersennes are concerned. Would it be possible to add a checksum or a residue to mmff? Maybe a summarized residue of all the TFs done in a region?
I don't think that mmff has a self-test at startup, so sometimes, when you don't find factors, you start to distrust your system. Did I OC too much? Are there hardware errors? In that case it would be nice to be able to DC a couple of regions just to make sure that your system works OK. So, would it be possible and meaningful to add some kind of checksum or residue to mmff? |
I have a question: will the mmff client detect the number of GPUs available, or should one mmff instance be started per installed GPU card?
Thank you in advance, Carlos |
[QUOTE=pinhodecarlos;316867]Should the number of mmff clients be started as many times as the number of GPU cards installed?[/QUOTE] You can run mmff on multiple GPUs: start one instance per card and select the device with -d 0, -d 1, etc. |
[QUOTE=aketilander;316853]Would it be possible to add a checksum or a residue to mmff? [/QUOTE]
I don't see how to do this. Even if the second run used the same GPUSievePrimes value, the two runs would test a slightly different set of factors. This is because there are race conditions in the GPU sieve (it is faster to use non-atomic bit operations). |
Is there any way to get statistics on the error rate of the trial division? E.g., what approximate fraction of factors is missed due to race conditions, or due to consumer-grade cards skipping a beat, etc.
|
[QUOTE=akruppa;317379]Is there any way to get statistics on the error rate of the trial division? E.g., what approximate fraction of factors is missed due to race conditions, or due to consumer-grade cards skipping a beat, etc.[/QUOTE]
I have no statistics on the error rate. The program does check two of the trial divisions in each class. So, a couple of thousand are double-checked in each bit-level. Fortunately, there have been few reports of validation failures. However, the trial factors that are double-checked always come from thread id zero (of 256). If the CUDA scheduler always assigns this thread_id to specific CUDA cores, then a large number of CUDA cores are not getting any double-checking. As to race conditions, there shouldn't be any problems. The program intentionally has race conditions, but these only cause a few extra composite factors to be tested. |
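The "benign race" described above can be illustrated outside CUDA. Below is a plain-Python sketch (hypothetical bitmap layout, not mmff's code) of the classic lost update: two sieve threads each clear a bit in the same word with a non-atomic read-modify-write, one clear is lost, and the corresponding composite candidate simply gets trial-factored unnecessarily.

```python
# Lost update from a non-atomic read-modify-write on a shared sieve
# word (illustration only, hypothetical layout).  Both "threads" load
# the word before either stores, so the second store clobbers the
# first clear.
word = 0b1111            # four candidates still marked in the sieve
a = word                 # thread A loads the word
b = word                 # thread B loads it too, before A stores
word = a & ~0b0010       # A stores: clears bit 1
word = b & ~0b0100       # B stores: clears bit 2, losing A's update
print(bin(word))         # -> 0b1011: bit 1 survived, so that composite
                         # candidate is (harmlessly) trial-factored anyway
```

An atomicAnd would make the two clears commute, but as George notes, tolerating the race is faster and only costs a few wasted trial divisions.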
I am wondering about the performance of different video cards. If I compare
[URL]http://www.videocardbenchmark.net/high_end_gpus.html[/URL] with James' [URL]http://www.mersenne.ca/mfaktc.php[/URL], there seem to be large differences. Let's take the GTX 680 as an example: according to the first site it's at the top of the list, but according to James' it's not so good. I can see that James' site lists something called "Compute 3.0". I am not sure if this refers to the speed of the PCI-e bus or to something else. What I am really wondering is whether some video cards have unused potential that might be tapped in the future, when mmff is better adapted to "Compute 3.0" or maybe something else? |
[QUOTE=aketilander;318528]I am wondering about the performance of different videocards. If I compare:
[URL]http://www.videocardbenchmark.net/high_end_gpus.html[/URL] with James' [URL]http://www.mersenne.ca/mfaktc.php[/URL] there seems to be large differences.[/QUOTE] For graphics performance (i.e. Games and stuff), 680 is great. For compute stuff (i.e. CUDA => mmff), not so much. I mean, it is still good, but not great. |
Basically the 680 is more gaming-oriented than the 580. Nvidia has just announced the next version of its Tesla/server GPUs, which is computation-oriented. Probably the next generation of home cards (7xx) will be based on the same technology. I would guess that the 8xx will be mainly gaming again, etc.
|
Hi All,
I finally got mmff going on my GPU. The error I had was that cudart64_42_9.dll was needed. From post #185 of this thread, I found mmff v0.23 built with CUDA 5.0, which seems to work and solved my problem; the Windows 64 download from doublemersenne.org did not, I think because it is built with CUDA 4.2. I guess my system requires CUDA 5.0 and won't work with CUDA 4.2. Regards, Matt |
[QUOTE=MattcAnderson;318727]The error I had was cudart64_42_9.dll needed.[/QUOTE] The CUDA 4.2 version requires the 42 dll files [URL="http://sourceforge.net/projects/cudalucas/files/CUDA%20Libs/CUDA-4.2-Libs-Windows.7z/download"]here[/URL]. Just put them in the mmff directory. |
mmff doesn't compile...
1 Attachment(s)
Ubuntu 11.10 - Linux64, CUDA drivers 5.0, CUDA runtime version 4.1
[code]
luigi@luigi-ubuntu:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
[/code]
As the mmff 0.26 binary is compiled against CUDA runtime version 4.2, I couldn't run it out of the box, so I tried recompiling it, but... I had problems trying to compile mmff :sad:

It seems that some classical C math functions don't link correctly, even after moving the -lm switch to the beginning of the line in the Makefile. In any case, I am attaching the compile log, hoping that someone of good will takes a look at it...

Thank you anyway.

Luigi |
The GNU linker by default processes object files and libraries in the order as they are specified on the command line, remembers which unresolved references there are from each file it processes, and tries to resolve those references with symbols in object files and libraries that [I]follow[/I]. This means putting -lm at the start of the command line has no effect: there are no unsatisfied references to math library functions at that point, and the unresolved references to libm from later files never get resolved because no -lm follows them.
Put -lm at the end, and in general put an object/library file that contains functions another file Y needs [I]after[/I] file Y. If you have circular dependencies, you can use some command line switch (forgot which) to make GNU ld re-scan the objects on the command line until all symbols are resolved, but you rarely need that. |
[QUOTE=akruppa;320470]This means putting -lm at the start of the command line has no effect: there are no unsatisfied references to math library functions at that point, and the unresolved references to libm from later files never get resolved because no -lm follows them.[/QUOTE] Thank you Alex.

Update: I tried a make clean followed by a make all (I forgot the "all" thingie... :redface: )

Result:
[code]
nvcc fatal : Unsupported gpu architecture 'compute_30'
[/code]
So I just removed the portion that said
[code]
--generate-code arch=compute_30,code=sm_30
[/code]
as I successfully did for mfaktc.

Result:
[code]
#error -- unsupported GNU version! gcc 4.6 and up are not supported!
[/code]
Question: does runtime version 5.0 correct this error, or do I have to try some fancy link and download an older version of GCC?

Luigi |
[QUOTE=ET_;320474]Question:
Does runtime version 5.0 correct this error, or do I have to try some fancy link and download an older version of GCC? Luigi[/QUOTE] Good question. The answer is no.

Find this file and edit __GNUC_MINOR__ > 6 into __GNUC_MINOR__ > 9: [CODE]/usr/local/cuda/include/host_config.h: #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9) [/CODE] |
1 Attachment(s)
[QUOTE=Batalov;320475]Good question. The answer is no.
Find this file and edit __GNUC_MINOR__ > 6 into __GNUC_MINOR__ > 9 [CODE]/usr/local/cuda/include/host_config.h: #if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9) [/CODE][/QUOTE] It went farther: it compiled the ptxtras (there is a __f variable that is set but unused), but it still complains about the math functions. Log attached.

There is a request to restart the system, maybe a pending update; I will tell you if anything changes after the reboot.

Thank you Serge.

Luigi |
[code]
gcc -fPIC -L/usr/local/cuda/lib64/ [B]-lcudart -lm[/B] timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -o ../mmff.exe [/code] |
[QUOTE=akruppa;320480][code]
gcc -fPIC -L/usr/local/cuda/lib64/ [B]-lcudart -lm[/B] timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -o ../mmff.exe [/code][/QUOTE] That's the line as it comes from George's Makefile. Should I modify it?

Luigi |
Newer versions of gcc no longer support putting the libraries at the beginning. Try
[CODE]gcc -fPIC -L/usr/local/cuda/lib64/ -o ../mmff.exe timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -lcudart -lm[/CODE] |
I just checked what I did on our work machine to make it compile:
[CODE]
kruppaal@quiche:~/mmff$ diff src/Makefile src.my/Makefile
14c14
< NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 --generate-code arch=compute_30,code=sm_30
---
> NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # --generate-code arch=compute_30,code=sm_30
16c16
< NVCCFLAGS += --compiler-options=-Wall
---
> NVCCFLAGS += --compiler-options="-Wall -B/usr/lib/gcc/x86_64-linux-gnu/4.5.3/"
20c20,21
< LDFLAGS = -fPIC $(CUDA_LIB) -lcudart -lm
---
> LDFLAGS = -fPIC $(CUDA_LIB)
> MMFFLIB = -lcudart -lm
35c36
< 	$(LD) $(LDFLAGS) $^ -o $@
---
> 	$(LD) $(LDFLAGS) $^ $(MMFFLIB) -o $@
[/CODE]
The -B/usr/lib/gcc/x86_64-linux-gnu/4.5.3/ is an override because 4.6 is the default gcc, but 4.5.3 is installed as well; with this option, nvcc uses the 4.5.3 version and stops complaining. |
Thank you Greg, your line worked great! :bow:
Alex, I just found an old Makefile from the GPU-ECM project:
[code]
# CUDA does not support gcc >= 4.6
# this can be useful if your default gcc is >= 4.6
# if not just comment the following two lines.
#CC_BIN:=/tmp/gcc45
#USE_THIS_CC:=--compiler-bindir $(CC_BIN)
[/code]
I totally forgot about it, but you were right: you (and I) had a lower version of GCC at that time. Thank you for your last message, it gave me the clue to remember where I had seen that warning.

It's time to reserve some exponents... my mmff is already running its self-test.

Thanks again! :hello:

Luigi |
CUDA is not [I]supported[/I] with GCC >= 4.6, but that doesn't mean it doesn't work; it means that if you send them a bug report, they will not take it. I use gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux), and one odd thing I had to add was -lstdc++ (where something is expected to be defined); the linker spat out this recommendation without me even asking. Weird, but the binaries work.
(Of course, something might not work, if they rely on some optimizations in some old way. More likely, they just don't want to be bothered.) |
1 Attachment(s)
Minor update -- v 0.27:
What's new:
1) Bug in testing 187-bit factors of MM127 fixed.
2) -lm makefile bug fixed (I hope).
3) With Batalov's help, the next set of 32 n values in k*2^n+1 Fermat factor testing is available.

As always, previous savefiles will not work with 0.27 unless the -nocheck argument is used.

My dual-boot box is running Windows at the moment, so I don't have a Linux executable yet. |
Under linux, I get some errors in or around
[CODE]gpusieve.cu(1991): error: expected a declaration[/CODE]
that's around the innocuous-looking
[CODE]else { if (gpusieve_initialized) return; }[/CODE]
I'll diff against my branch when I get home. (It's ugly to figure out in vim over ssh.) |
[QUOTE=Batalov;321484]Under linux, I get some errors [/QUOTE]
Try again. I haven't compiled this code in well over a month. A minor tweak had a typo. |
Looks good.
Test passed:
[CODE]got assignment: k*2^41+1, k range 2864929972000000 to 2864929973000000 (93-bit factors)
Starting trial factoring of k*2^41+1 in k range: 2864929972M to 2864929973M (93-bit factors)
k_min = 2864929972000000
k_max = 2864929973000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
    class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
2471/4620 | 0.01M | 0.275s | 2m03s | 0.03M/s | 210485
F39 has a factor: 6300047635658008393597059073
[/CODE] |
| All times are UTC. The time now is 00:40. |
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.