mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Operazione Doppi Mersennes (https://www.mersenneforum.org/forumdisplay.php?f=99)
-   -   Trial division with CUDA (mmff) -- used, but runs like new! (https://www.mersenneforum.org/showthread.php?t=17162)

flashjh 2012-09-22 04:27

1 Attachment(s)
Windows 32-bit & 64-bit executables:

Tested. Ctrl+C works now. (Thanks George! BTW, which lines did you change? I tried to find the change and couldn't.)

If anyone needs a different CUDA version, let me know.

[CODE]F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.24 mfaktc_barrett89_F32_63gs][/CODE]

Prime95 2012-09-22 05:04

[QUOTE=flashjh;312401]Ctrl+C works now (Thanks George! BTW-what lines did you change, I tried to find it and couldn't?)[/QUOTE]

The very end of mfaktc.c -- I commented out the calls to destroy streams and sieve_free.

mognuts 2012-09-22 12:06

[QUOTE=Prime95;312396]Version 0.24

Another minor upgrade.

1) Fixed bug in calculating which Fermat number is divisible by a found factor.
2) If GPUSievePrimes is not set in mmff.ini, then mmff chooses a default value based on each entry in worktodo.txt. It may not choose the optimal GPUSievePrimes value, but it should be in the ballpark. Report to me any instances where it chooses a wildly non-optimal setting.
3) The -st and -st2 mfaktc command line arguments (self-test) are now ignored.
4) Some uninitialized mfaktc CPU sieving pointers are no longer freed at exit. Maybe this will solve the crash some were reporting at exit.

Because the mmff version number is written to save files, finish your current range before upgrading, or try the lightly-tested -nocheck command line argument to force using an old save file.

Sources:[/QUOTE]
One thing I've noticed is that the %g variable is returning huge values (at least it does for me on my computer).
e.g. 19071091533368053000000000000000000000.00

Prime95 2012-09-22 13:57

[QUOTE=mognuts;312423]One thing I've noticed is that the %g variable is returning huge values (at least it does for me on my computer).
e.g. 19071091533368053000000000000000000000.00[/QUOTE]

I need a bit more information to investigate this.

ATH 2012-09-22 15:02

I tried to run the known Fermat factors again, but I got this error on 2 of them:

[CODE]got assignment: k*2^45+1, k range 111310000000000 to 111320000000000 (92-bit factors)
Starting trial factoring of k*2^45+1 in k range: 111310G to 111320G (92-bit factors)
k_min = 111309999999660
k_max = 111320000000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 29


got assignment: k*2^45+1, k range 111310000000000 to 111320000000000 (92-bit factors)
Starting trial factoring of k*2^45+1 in k range: 111310G to 111320G (92-bit factors)
k_min = 111309999999660
k_max = 111320000000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 41


got assignment: k*2^54+1, k range 81900000000000 to 81911000000000 (101-bit factors)
Starting trial factoring of k*2^54+1 in k range: 81900G to 81911G (101-bit factors)
k_min = 81899999998740
k_max = 81911000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 17


got assignment: k*2^54+1, k range 81900000000000 to 81911000000000 (101-bit factors)
Starting trial factoring of k*2^54+1 in k range: 81900G to 81911G (101-bit factors)
k_min = 81899999998740
k_max = 81911000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
ERROR: GPU sieve problems. Factor divisible by 13
[/CODE]

I wonder if it's my card since it's not the same prime in the error every time?

Prime95 2012-09-22 16:39

[QUOTE=ATH;312430]I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]

I wouldn't conclude that. Is this the Windows or Linux build? I reran finding the known Fermat factors last night before uploading the source.

Did you set GPUSievePrimes in mmff.ini or try the new auto-select feature?

ATH 2012-09-22 19:26

It's a windows 64bit build.

I think I found the problem: it happens when there are 2 or more assignments in worktodo.txt for the same n but using different GPU kernels. For example
FermatFactor=63,88,89
FermatFactor=63,89,90
mmff uses "mfaktc_barrett89_F32_63gs" first, then "mfaktc_barrett96_F32_63gs" for the 2nd line.

It's not all kernel transitions, but most of them. Here is a list I started with each transition, whether or not the problem occurs, and an example of the 2 worktodo.txt lines that trigger it.

There are 62 more transitions to test, which I can do if it's needed.

EDIT: This seems to be an issue with auto-selecting GPUSievePrimes as it disappears when I set it.

[CODE]mfaktc_barrett89_F0_31gs to mfaktc_barrett96_F0_31gs ERROR: GPU sieve problems
FermatFactor=31,200000000e9,200000001e9
FermatFactor=31,300000000e9,300000001e9

mfaktc_barrett96_F0_31gs to mfaktc_barrett89_F0_31gs ERROR: GPU sieve problems
FermatFactor=31,300000000e9,300000001e9
FermatFactor=31,200000000e9,200000001e9



mfaktc_barrett89_F32_63gs to mfaktc_barrett96_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,89,90

mfaktc_barrett89_F32_63gs to mfaktc_barrett108_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,96,97

mfaktc_barrett89_F32_63gs to mfaktc_barrett120_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,40000e9,40001e9

mfaktc_barrett89_F32_63gs to mfaktc_barrett128_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,88,89
FermatFactor=63,200000000e9,200000001e9



mfaktc_barrett96_F32_63gs to mfaktc_barrett89_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,89,90
FermatFactor=63,88,89

mfaktc_barrett96_F32_63gs to mfaktc_barrett108_F32_63gs no error
FermatFactor=63,89,90
FermatFactor=63,96,97

mfaktc_barrett96_F32_63gs to mfaktc_barrett120_F32_63gs no error
FermatFactor=63,89,90
FermatFactor=63,40000e9,40001e9

mfaktc_barrett96_F32_63gs to mfaktc_barrett128_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,89,90
FermatFactor=63,200000000e9,200000001e9



mfaktc_barrett108_F32_63gs to mfaktc_barrett89_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,96,97
FermatFactor=63,88,89

mfaktc_barrett108_F32_63gs to mfaktc_barrett96_F32_63gs no error
FermatFactor=63,96,97
FermatFactor=63,95,96

mfaktc_barrett108_F32_63gs to mfaktc_barrett120_F32_63gs no error
FermatFactor=63,96,97
FermatFactor=63,40000e9,40001e9

mfaktc_barrett108_F32_63gs to mfaktc_barrett128_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,96,97
FermatFactor=63,200000000e9,200000001e9



mfaktc_barrett120_F32_63gs to mfaktc_barrett89_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,40000e9,40001e9
FermatFactor=63,88,89

mfaktc_barrett120_F32_63gs to mfaktc_barrett96_F32_63gs no error
FermatFactor=63,40000e9,40001e9
FermatFactor=63,89,90

mfaktc_barrett120_F32_63gs to mfaktc_barrett108_F32_63gs no error
FermatFactor=63,40000e9,40001e9
FermatFactor=63,96,97

mfaktc_barrett120_F32_63gs to mfaktc_barrett128_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,40000e9,40001e9
FermatFactor=63,200000000e9,200000001e9



mfaktc_barrett128_F32_63gs to mfaktc_barrett89_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,88,89

mfaktc_barrett128_F32_63gs to mfaktc_barrett96_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,89,90

mfaktc_barrett128_F32_63gs to mfaktc_barrett108_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,96,97

mfaktc_barrett128_F32_63gs to mfaktc_barrett120_F32_63gs ERROR: GPU sieve problems
FermatFactor=63,200000000e9,200000001e9
FermatFactor=63,40000e9,40001e9[/CODE]

ATH 2012-09-22 19:38

While testing I have also run into the error:
ERROR: Exponentiation falure
(yes, the typo in 'failure' is in the program's output)

I get it with GPUSievePrimes off (auto-selecting) and this worktodo.txt:
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

But I know I got this error before 0.24 and auto-select feature but it seems very elusive and hard to track down and reproduce.

Prime95 2012-09-22 20:03

[QUOTE=ATH;312430]I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]

Clarification. The GPU sieve does not give reproducible results. For performance reasons, bits are cleared from the sieve without using atomic operations. Thus, there are race conditions where two threads try to clear different bits in the same byte.

This causes us to test a few more trial factors than necessary, but that cost is more than offset by the savings from not using atomic operations.
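To illustrate the race George describes, here is a minimal single-threaded sketch (Python standing in for the CUDA kernel; the bad interleaving is written out by hand) of how two non-atomic read-modify-writes on the same sieve byte can lose one bit clear:

```python
# One byte of the GPU sieve bitmap: 8 factor candidates, all still "live".
sieve = 0xFF

# Two GPU threads want to clear different bits in this byte. Without an
# atomic AND, each does a plain read-modify-write. One bad interleaving:
a = sieve                  # thread A reads 0xFF
b = sieve                  # thread B reads 0xFF (before A writes back)
a &= ~(1 << 2) & 0xFF      # A clears bit 2 in its private copy -> 0xFB
b &= ~(1 << 5) & 0xFF      # B clears bit 5 in its private copy -> 0xDF
sieve = a                  # A writes back
sieve = b                  # B writes back last -- A's update is lost

assert sieve == 0xDF             # bit 5 is cleared...
assert sieve & (1 << 2) != 0     # ...but bit 2 survived the race
```

A surviving bit only means that candidate still gets trial-factored even though the sieve should have eliminated it, so a factor can never be missed this way.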

Prime95 2012-09-22 21:54

[QUOTE=ATH;312451]
I think I found the problem: it happens when there are 2 or more assignments in worktodo.txt for the same n but using different GPU kernels.[/QUOTE]

I have a fix for this. If you can reproduce the exponentiation failure with the -v 3 command line argument that might be helpful. I could not reproduce the trouble.

If I get timely feedback on the %g problem, I'd like to get that fixed in 0.25 too.

rcv 2012-09-22 22:05

[QUOTE=ATH;312454]While testing I have also run in to the error:
ERROR: Exponentiation falure
(yes there is a typo in 'failure')

I get it with GPUSievePrimes off (auto-selecting) and this worktodo.txt:
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

But I know I got this error before 0.24 and auto-select feature but it seems very elusive and hard to track down and reproduce.[/QUOTE]
I reported a similar problem to George a week or two ago. There wasn't enough information to determine whether it was flaky hardware, the mmff software, or an NVIDIA runtime bug. I can reproduce the above error, which seems to rule out flaky hardware.

I ran each of the above five assignments 11 times, and I got 3 failures of the "FermatFactor=96,128,129" assignment. [I immediately restarted mmff after each failure.]
[CODE]got assignment: k*2^96+1 bit_min=128 bit_max=129
Starting trial factoring k*2^96+1 from 2^128 to 2^129
k_min = 4294964520
k_max = 8589934592
Using GPU kernel "mfaktc_barrett140_F96_127gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
...
<failure location not recorded>
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.010s | n.a. | 93.39M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...[/CODE]Next, I put the failing assignment ("FermatFactor=96,128,129") in my worktodo.txt file 25 times.
[CODE]...
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.008s | n.a. | 116.74M/s | 349749 | n.a.%
ERROR: Exponentiation falure[/CODE]After two failures, I changed the command line to specify a different GPU.
[CODE] ...
1575/4620 | 0.93M | 0.024s | n.a. | 38.91M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.013s | n.a. | 71.84M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.018s | n.a. | 51.88M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.013s | n.a. | 71.84M/s | 349749 | n.a.%
ERROR: Exponentiation falure
...
1575/4620 | 0.93M | 0.016s | n.a. | 58.37M/s | 349749 | n.a.%
ERROR: Exponentiation falure[/CODE]10 failures out of 46 runs on two different GPUs. Perhaps this is sufficiently reproducible to find the problem.

One more failure with -v3:
[CODE]1573/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
Verifying (2^(2^96)) % 340581321636451875144725492967785103361 = 202753569648208169353731391108513369608
1575/4620 | 0.93M | 0.009s | n.a. | 103.77M/s | 349749 | n.a.%
Verifying (2^(2^96)) % 340282272560196908974548533520923361281 = 213505026821406843026269288964103298839037
ERROR: Exponentiation falure[/CODE]
Note that the expected result is about 3 digits longer than the modulus.
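That observation is easy to double-check on the CPU; here is a quick Python sanity check using the two "Verifying" lines above (moduli and residues copied from the log):

```python
# Moduli and reported residues, copied from the -v 3 log.
f_ok,  r_ok  = 340581321636451875144725492967785103361, 202753569648208169353731391108513369608
f_bad, r_bad = 340282272560196908974548533520923361281, 213505026821406843026269288964103298839037

assert r_ok < f_ok       # a residue mod f must be smaller than f
assert r_bad >= f_bad    # the failing line violates that: an impossible result

# The true residue is cheap to recompute on the CPU: pow() needs only
# 96 modular squarings here because the exponent is the power of two 2^96.
assert pow(2, 2**96, f_bad) < f_bad
```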

Prime95 2012-09-22 22:56

[QUOTE=rcv;312470]Note that the expected result is about 3 digits longer than the modulus.[/QUOTE]

and the factor is less than 2^128...

ATH 2012-09-23 01:14

[QUOTE=Prime95;312468]If you can reproduce the exponentiation failure with the -v 3 command line argument that might be helpful. I could not reproduce the trouble.[/QUOTE]

First one with GPUSievePrimes off (auto-select): [URL="http://www.hoegge.dk/mersenne/falure1.txt"]falure1.txt[/URL]
FermatFactor=96,128,129
FermatFactor=97,128,129
FermatFactor=98,128,129
FermatFactor=99,128,129
FermatFactor=100,128,129

Second one with GPUSievePrimes=650000 (optimal), same worktodo.txt: [URL="http://www.hoegge.dk/mersenne/falure2.txt"]falure2.txt[/URL]

Third one with GPUSievePrimes=100000 (too low, optimal ~ 950k): [URL="http://www.hoegge.dk/mersenne/falure3.txt"]falure3.txt[/URL]
FermatFactor=140,171,172
FermatFactor=151,182,183
FermatFactor=153,184,185
FermatFactor=156,187,188

ET_ 2012-09-23 10:08

[QUOTE=ATH;312430]I tried to run the known fermat factors again, but I got this error on 2 of them:

[...]

I wonder if it's my card since it's not the same prime in the error every time?[/QUOTE]

I don't know if this is related... I had the same error trying to run mmff (v2.0) on a cc1.3 card. Did you modify your CUDA drivers/settings?

Luigi

bcp19 2012-09-23 12:52

This may have nothing to do with it, but I noticed new nVidia drivers were available recently (I have not upgraded mine yet). Could they be part of the cause?

ATH 2012-09-23 13:43

No, this is the new GPUSievePrimes auto-select feature causing it: if I set it manually the bug disappears, and I didn't have it in version 0.23 on the same known Fermat factors.

ATH 2012-09-23 14:11

Here is the worktodo.txt to test the 21 known Fermat factors that are within mmff's "search space", along with how the results.txt should look. I recommend setting PrintMode=1 in mmff.ini when you run this to avoid all the spam, and until version 0.25 is out you need to set GPUSievePrimes to something like 200000 to avoid the auto-select feature. This worktodo.txt takes 1min 20sec on a GTX 460.

[CODE]worktodo.txt:
FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=74,100,101
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=135,880e8,881e8
FermatFactor=148,173,174
FermatFactor=149,175,176

results.txt
F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483648 (70-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.24 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.24 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.24 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.24 mfaktc_barrett96_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.24 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 16777216 to 33554432 (79-bit factors) [mmff 0.24 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.24 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.24 mfaktc_barrett108_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.24 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in k range: 67108864 to 134217728 (101-bit factors) [mmff 0.24 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.24 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.24 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.24 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.24 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.24 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.24 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.24 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.24 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.24 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.24 mfaktc_barrett140_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.24 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.24 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in k range: 3334G to 3335G (143-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483648 (142-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.24 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.24 mfaktc_barrett152_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.24 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.24 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.24 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in k range: 33554432 to 67108864 (174-bit factors) [mmff 0.24 mfaktc_barrett183_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.24 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in k range: 67108864 to 134217728 (176-bit factors) [mmff 0.24 mfaktc_barrett183_F128_159gs][/CODE]

kladner 2012-09-23 15:40

[QUOTE=bcp19;312510]This may have nothing to do with it, but I noticed new nVidia drivers were available recently (I have not upgraded mine yet). Could they be part of the cause?[/QUOTE]

306.23 is working fine for me so far. Performance is good.

mognuts 2012-09-23 17:09

[QUOTE=Prime95;312428]I need a bit more information to investigate this.[/QUOTE]
I'm running Windows 7 64 bit with a GTX570 card.

I have this in my mmff.ini file:

[CODE]
#GPUProgressHeader= class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
#ProgressFormat=%C/4620 | %n | %ts | %e | %rM/s | %s | %W%%

# print everything
GPUProgressHeader=[date time] exponent [TF bits]: percent class #, seq | GHz | time | ETA | #FCs | rate | SieveP. | CPU wait | V5UserID@ComputerID
ProgressFormat=[%d %T] M%M [%l-%u]: %p%% %C/4620,%c/960 | %g | %ts | %e | %n | %rM/s | %s | %W%% | %U@%H
[/CODE]
...and this in my worktodo.ini file:
[CODE]MMFactor=127,4.0e15,4.3e15[/CODE]
The %g variable generates values such as 19071091533368053000000000000000000000.00

It doesn't matter if the %g variable is on its own i.e.
[CODE]ProgressFormat=%g[/CODE]
or in a string with other variables such as the default string above.

All versions of mmff (from the first released one to the latest, with CUDA v4 or v5) behave similarly.

mognuts

Prime95 2012-09-23 19:33

[QUOTE=mognuts;312539]
It doesn't matter if the %g variable is on its own i.e.
[CODE]ProgressFormat=%g[/CODE]
or in a string with other variables such as the default string above.[/QUOTE]

Ah, I must admit I'm not familiar with all of mfaktc's output options. You may not like my solution though. Since Primenet does not track these results, there are no GHz-days of credit to be had. Thus, the %g option will now output "n.a.".

Prime95 2012-09-24 02:03

1 Attachment(s)
Here we go again -- v 0.25:

This hopefully fixes many previously reported problems:

1) The problem reported with auto-select GPUSievePrimes is fixed.
2) The "exponentiation failure" bug where a tested k value is too small for the bit-level being worked on is fixed. This was done by no longer rounding the minimum k value down to a multiple of the number of classes. As far as I can tell, this rounding down was only necessary for the self-test code. A side effect of this change is that you cannot use v0.24 save files. Finish your current work before upgrading to v0.25.
3) Exponentiation failures from testing k values that are too large should be fixed. I've also decreased the minimum acceptable k values for Fermat testing. Although inefficient, it will allow us to have more known Fermat factors retested.
4) The funny %g outputs are gone.
5) Lots of bloat from mfaktc has been removed.
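For reference, the old rounding removed in item 2 can be sketched in a few lines of Python (assuming 4620 classes, the count shown in mmff's progress output); it reproduces the k_min values from ATH's earlier error logs:

```python
NUM_CLASSES = 4620  # mmff's progress lines count classes out of 4620

def old_k_min(k_start):
    """Sketch of the removed behavior: round the requested k range
    start down to a multiple of the number of classes."""
    return k_start - k_start % NUM_CLASSES

# k range starts vs. reported k_min values from ATH's logs in this thread:
assert old_k_min(111310000000000) == 111309999999660
assert old_k_min(81900000000000) == 81899999998740
```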

The good news is none of these changes have been due to fundamental problems in the kernels doing the real work.

My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases.

MrRepunit 2012-09-24 11:59

[QUOTE=Prime95;312590]Here we go again -- v 0.25:
[/QUOTE]

I tried to compile it under Win 7 64-bit with Visual Studio 2010. I only succeeded after changing the following line in mfaktc.c
[CODE]
extern int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);
[/CODE]to
[CODE]
#ifdef _MSC_VER
extern "C" int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);
#else
extern int tf_class_barrett92_gs(unsigned long long int k_min, unsigned long long int k_max, mystuff_t *mystuff);
#endif
[/CODE]

LaurV 2012-09-24 12:16

binaries? flashjh? :smile:

Ralf Recker 2012-09-24 14:03

One tiny (and unimportant) issue:

Program output:

[CODE]WARNING: Read GPUSieveSize=1 from mmff.ini, using min value (4)[/CODE]From mmff.ini:

[CODE]# GPUSieveSize defines how big a GPU sieve to use (in Mbits). Bigger sieves are a little
# more efficient, but may produce laggy video response.
#
# [COLOR=Red][B]Minimum: GPUSieveSize=1[/B][/COLOR][/CODE]Increasing GPUSieveSize to the max has no noticeable effect on the screen lag*. The raw rate increases (on a GTX 470 at 607 MHz):

[CODE]got assignment: k*2^28+1, k range 1000000000000000 to 1100000000000000 (78-bit factors)
Starting trial factoring of k*2^28+1 in k range: 1000T to 1100T (78-bit factors)
k_min = 1000000000000000
k_max = 1100000000000000
Using GPU kernel "mfaktc_barrett89_F0_31gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
21/4620 | 21.65G | 13.875s | 3h40m | 1560.00M/s | 69941
[/CODE][CODE]got assignment: k*2^28+1, k range 1000000000000000 to 1100000000000000 (78-bit factors)
Starting trial factoring of k*2^28+1 in k range: 1000T to 1100T (78-bit factors)
k_min = 1000000000000000
k_max = 1100000000000000
Using GPU kernel "mfaktc_barrett89_F0_31gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
21/4620 | 21.65G | 13.009s | 3h27m | 1663.85M/s | 69941
[/CODE]On a GTX 460 (725 MHz factory overclocked) the raw rate increases from 1000 M/s to ca. 1045 M/s.

*Debian 6.0 squeeze, GNOME 2 desktop / CUDA SDK 4.1 for Ubuntu 11.04 / libcudart using the libstdc++ from a self-compiled gcc 4.5.x
Compilation with the CUDA SDK 4.0 for Ubuntu 10.10 fails with an internal error in nvopencc.

ATH 2012-09-24 14:04

[QUOTE=Prime95;312590]Here we go again -- v 0.25:[/QUOTE]

Thank you for all the time you spend on GIMPS/primenet and now also this program.

[QUOTE=Prime95;312590]My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases.[/QUOTE]

Sorry I keep giving you more work :)

flashjh 2012-09-24 18:39

1 Attachment(s)
[QUOTE=LaurV;312638]binaries? flashjh? :smile:[/QUOTE]

Windows 32-bit & 64-bit executables:

ET_ 2012-09-24 18:59

[QUOTE=Prime95;312590]Here we go again -- v 0.25:

[...]

My recommendation is to not upgrade until ATH, flashjh, and others have had time to try this version for a little bit. They have been quite effective in verifying the quality of recent releases.[/QUOTE]

Are you going to release a Linux 64-bit binary of v 0.25?

Luigi

ATH 2012-09-24 22:45

Haven't found any issues with version 0.25; all my test cases work now :) Very nice.

It can now find 31 of the known Fermat factors (remember PrintMode=1 in mmff.ini to avoid spam):

[CODE]worktodo.txt
FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,66,67
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=61,67,68
FermatFactor=74,100,101
FermatFactor=77,98,99
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=93,103,104
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=124,146,147
FermatFactor=127,129,130
FermatFactor=135,880e8,881e8
FermatFactor=145,167,168
FermatFactor=148,173,174
FermatFactor=149,160,161
FermatFactor=149,175,176
FermatFactor=154,166,167
FermatFactor=157,167,168

results.txt

F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483647 (70-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F52 has a factor: 74201307460556292097 [TF:66:67:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 4096 to 8191 (67-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 16777216 to 33554431 (79-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.25 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs]
F58 has a factor: 219055085875300925441 [TF:67:68:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^61+1 in k range: 64 to 127 (68-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in k range: 67108864 to 134217727 (101-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F75 has a factor: 520961043404985083798310879233 [TF:98:99:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^77+1 in k range: 2097152 to 4194303 (99-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.25 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.25 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F91 has a factor: 14072902366596202965053244178433 [TF:103:104:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^93+1 in k range: 1024 to 2047 (104-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.25 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in k range: 3334G to 3335G (143-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483647 (142-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F122 has a factor: 111331351706159727817280425663664652445286401 [TF:146:147:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^124+1 in k range: 4194304 to 8388607 (147-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F125 has a factor: 850705917302346158658436518579420528641 [TF:129:130:mmff 0.25 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^127+1 in k range: 4 to 7 (130-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F142 has a factor: 363618066009591119386121910507749518730588867002369 [TF:167:168:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^145+1 in k range: 4194304 to 8388607 (168-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in k range: 33554432 to 67108863 (174-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs]
F147 has a factor: 2230074519853062314153571827264836150598041600001 [TF:160:161:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^149+1 in k range: 2048 to 4095 (161-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in k range: 67108864 to 134217727 (176-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs]
F150 has a factor: 124204803210043452689216278205372864748572142206977 [TF:166:167:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^154+1 in k range: 4096 to 8191 (167-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F150 has a factor: 287733134849521512021350451441018219494761719398401 [TF:167:168:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^157+1 in k range: 1024 to 2047 (168-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
[/CODE]

Prime95 2012-09-25 02:06

1 Attachment(s)
Linux 64-bit build

firejuggler 2012-09-25 03:50

[code]

no factor for F46 from 2^96 to 2^97 (k range: 300000000000000 to 325000000000000) [mmff 0.20 mfaktc_barrett108_F30_61gs]
no factor for k*2^48+1 in k range: 325T to 350T (97-bit factors) [mmff 0.24 mfaktc_barrett108_F32_63gs]
no factor for k*2^48+1 in k range: 350T to 375T (97-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs]
[/code]

Batalov 2012-09-25 11:52

The mmff-GFN patch
 
1 Attachment(s)
Because I can only dream of reaching George's level of generosity, the least I can do is spread the wealth.

Here's the patch that will turn mmff-0.25 into one of the five binaries for the Generalized Fermat factor search (bases 3,5,6,10,12 instead of 2). Read the included README; I will not repeat it here. It goes without saying that (if mmff is beta) this patch is alpha. Run the included tests!

Pick the k range and the base and good luck!
The reservations may be quite tricky (by email), but they are roughly summarized [URL="http://www1.uni-hamburg.de/RRZ/W.Keller/GFNsrch.txt"]here[/URL]; you may want to go above 10e12 (I've covered, or am in the process of covering, the gap below 10e12).

The next stop will be xGFN (a[SUP]2[SUP]m[/SUP][/SUP] + b[SUP]2[SUP]m[/SUP][/SUP]). This search will be naturally slower, but could be fun, too!

-S

[B]EDIT: Caveat Emptor![/B] The mmff-gfn-0.25 binaries are [B]not [/B]capable of normal FermatFactoring. Keep the normal binary separately, and keep five GFN binaries separately - all in different folders.

Batalov 2012-09-25 12:24

1 Attachment(s)
P.S. The patch is a cleaner solution and its use is preferred (e.g. it can be applied to mmff-0.24, or maybe to mmff-0.26 later, with minimal hassle), but for the benefit of those who would prefer the source (and for Windows builders), here's the patched source, as well.

[B]EDIT: Caveat Emptor![/B] The mmff-gfn-0.25 binaries are [B]not [/B]capable of normal FermatFactoring. Keep the normal binary separately, and keep five GFN binaries separately - all in different folders.

The base-specific initializations are embedded into the kernels, among other reasons for speed. There are no switches. A future (xGFN) binary may be capable of any and all functions; if I don't think of a more elegant solution, there will be a sort of "fat binary" packing a dozen specialized kernels per each current kernel.

frmky 2012-09-26 07:08

[QUOTE=Batalov;312724](bases 3,5,6,10,12 instead of 2).[/QUOTE]
Any reason these bases were chosen, other than to check the same bases as Proth.exe?

Batalov 2012-09-26 07:20

I thought that too. It is easy to extend technically, but how would we tell new factors from old? And who would volunteer to keep the factors and limits? (FactorDB has a limit on size; no pasarán.)
It is probably because of the [URL="http://www.ams.org/journals/mcom/1998-67-221/S0025-5718-98-00891-6/S0025-5718-98-00891-6.pdf"]Bjorn/Riesel[/URL] legacy (and earlier Riesel references [3,4] therein).

frmky 2012-09-26 07:49

BR98 also includes bases 7, 8, and 11. If the idea is to only eliminate potential primes then these should be excluded, but so should 3 and 5. Perhaps the idea was to include the smallest bases, 3 & 5, plus bases that can include primes up to 12. Just a guess. :piggie:

Batalov 2012-09-26 08:00

Anyhow, W. Keller is keen on dealing with reservations for the 42[COLOR=red]*[/COLOR] (a,b) pairs and the five a's altogether. So, even though I would like to reserve 25<=m<=150, k<=1e13, I can only do that with xGFNs included (i.e. reimplement "pfgw -gxo" in mmff-xgfn).

So, I plan to make mmff-xgfn. It would make sense: with five exponentiations (2,3,5,7,11), we will learn all GFN (including bases 7 and 11!) and xGFN factors in one sweep (only some modular linear combinations need to be done).

GF'(3) and GF'(5) can be prime (after the /2)!
[I]F'[/I][SUB]6[/SUB](3) = 1716841910146256242328924544641 is prime, hey, larger than F(4)!

I also silently hope that every ~2000th found Proth prime at PrimeGrid could be a ticket, too. 2 found, ~1998 to go. ;-)
_______
[COLOR=red]*[/COLOR][SIZE=1]gotta love the number![/SIZE]
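Batalov's "one sweep" idea above can be sketched in a few lines of Python (an illustrative CPU sketch only, nothing like mmff's GPU kernels; the function name `gf_xgf_hits` is made up here): keep one residue a[SUP]2[SUP]m[/SUP][/SUP] mod q per base, then every GF and xGF divisibility is just a comparison or a sum of residues.

```python
def gf_xgf_hits(q, m, bases=(2, 3, 5, 7, 11)):
    """From r_a = a^(2^m) mod q for a few bases, read off which
    GF(m,a) = a^(2^m)+1 and xGF(m,a,b) = a^(2^m)+b^(2^m) values q divides."""
    r = {a: pow(a, 2**m, q) for a in bases}
    hits = []
    for a in bases:
        if r[a] == q - 1:                          # q | a^(2^m) + 1
            hits.append(("GF", a))
        for b in bases:
            if a < b and (r[a] + r[b]) % q == 0:   # q | a^(2^m) + b^(2^m)
                hits.append(("xGF", a, b))
    return hits

# tiny example: 13 = 2^2 + 3^2 divides xGF(1,2,3), and 13 | 5^2 + 1 = 26
print(gf_xgf_hits(13, 1))
# a real one from later in this thread: 21*2^41+1 divides F39, i.e. GF(39, base 2)
print(("GF", 2) in gf_xgf_hits(21 * 2**41 + 1, 39))
```

The point of the linear-combination trick is visible here: five modular exponentiations serve all 5 GF bases and all 10 xGF pairs at once.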

lalera 2012-09-29 17:36

hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt
is this a bug?
an example:
no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]

Prime95 2012-09-29 18:45

[QUOTE=lalera;313180]hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt
is this a bug?
an example:
no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs][/QUOTE]

looks like a bug to me. Can you post the worktodo.txt file that generated that output?

lalera 2012-09-29 18:49

[QUOTE=Prime95;313187]looks like a bug to me. Can you post the worktodo.txt file that generated that output?[/QUOTE]
FermatFactor=47,560e12,1000e12

lalera 2012-09-29 18:58

[QUOTE=Prime95;313187]looks like a bug to me. Can you post the worktodo.txt file that generated that output?[/QUOTE]
i did now the following test:
[CODE]worktodo:

FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=74,100,101
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=135,880e8,881e8
FermatFactor=148,173,174
FermatFactor=149,175,176

results:

F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in (71-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in (76-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in (70-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in (92-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in (83-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.25 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in (91-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.25 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in (79-bit factors) [mmff 0.25 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.25 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in (101-bit factors) [mmff 0.25 mfaktc_barrett108_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.25 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in (101-bit factors) [mmff 0.25 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.25 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in (112-bit factors) [mmff 0.25 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in (128-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in (133-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.25 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in (127-bit factors) [mmff 0.25 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.25 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in (130-bit factors) [mmff 0.25 mfaktc_barrett140_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.25 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in (136-bit factors) [mmff 0.25 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in (143-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in (142-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.25 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in (152-bit factors) [mmff 0.25 mfaktc_barrett152_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.25 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in (172-bit factors) [mmff 0.25 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in (174-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.25 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in (176-bit factors) [mmff 0.25 mfaktc_barrett183_F128_159gs][/CODE]

Prime95 2012-09-29 20:02

1 Attachment(s)
Minor update -- v 0.26:

What's new:

1) MM31 is now supported.
2) Lalera's "missing k range" output bug is probably fixed.

Prime95 2012-09-29 20:02

1 Attachment(s)
Linux 64-bit executable:

lalera 2012-09-29 20:23

hi,
mmff v0.26 for linux x64 works fine
i did the following test:
[CODE]worktodo:

FermatFactor=36,25709e6,25710e6
FermatFactor=33,5460e9,5470e9
FermatFactor=39,69,70
FermatFactor=45,11131e10,11132e10
FermatFactor=45,212e9,213e9
FermatFactor=50,2139e9,2140e9
FermatFactor=54,78,79
FermatFactor=54,81900e9,81911e9
FermatFactor=74,100,101
FermatFactor=79,5e9,6e9
FermatFactor=87,1595e9,1596e9
FermatFactor=88,20018e9,20019e9
FermatFactor=90,119e9,120e9
FermatFactor=92,198e9,199e9
FermatFactor=97,482e9,483e9
FermatFactor=101,3334e9,3335e9
FermatFactor=111,141,142
FermatFactor=120,3e9,4e9
FermatFactor=135,880e8,881e8
FermatFactor=148,173,174
FermatFactor=149,175,176

results:

F28 has a factor: 1766730974551267606529 [TF:70:71:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^36+1 in k range: 25709M to 25710M (71-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]
F31 has a factor: 46931635677864055013377 [TF:75:76:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^33+1 in k range: 5460G to 5470G (76-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]
F37 has a factor: 701179711390136401921 [TF:69:70:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^39+1 in k range: 1073741824 to 2147483647 (70-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]
F42 has a factor: 3916660235220715932328394753 [TF:91:92:mmff 0.26 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^45+1 in k range: 111310G to 111320G (92-bit factors) [mmff 0.26 mfaktc_barrett96_F32_63gs]
F43 has a factor: 7482850493766970889994241 [TF:82:83:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^45+1 in k range: 212G to 213G (83-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]
F48 has a factor: 2408911986953445595315961857 [TF:90:91:mmff 0.26 mfaktc_barrett96_F32_63gs]
found 1 factor for k*2^50+1 in k range: 2139G to 2140G (91-bit factors) [mmff 0.26 mfaktc_barrett96_F32_63gs]
F52 has a factor: 389591181597081096683521 [TF:78:79:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^54+1 in k range: 16777216 to 33554431 (79-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]
F52 has a factor: 1475547810493913550438096961537 [TF:100:101:mmff 0.26 mfaktc_barrett108_F32_63gs]
found 1 factor for k*2^54+1 in k range: 81900G to 81911G (101-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs]
F72 has a factor: 1443765874709062348345951911937 [TF:100:101:mmff 0.26 mfaktc_barrett108_F64_95gs]
found 1 factor for k*2^74+1 in k range: 67108864 to 134217727 (101-bit factors) [mmff 0.26 mfaktc_barrett108_F64_95gs]
F77 has a factor: 3590715923977960355577974656860161 [TF:111:112:mmff 0.26 mfaktc_barrett120_F64_95gs]
found 1 factor for k*2^79+1 in k range: 5G to 6G (112-bit factors) [mmff 0.26 mfaktc_barrett120_F64_95gs]
F83 has a factor: 246947940268608417020015902258307792897 [TF:127:128:mmff 0.26 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^87+1 in k range: 1595G to 1596G (128-bit factors) [mmff 0.26 mfaktc_barrett128_F64_95gs]
F86 has a factor: 6195449970597928748332522715641578258433 [TF:132:133:mmff 0.26 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^88+1 in k range: 20018G to 20019G (133-bit factors) [mmff 0.26 mfaktc_barrett140_F64_95gs]
F88 has a factor: 148481934042154969241780501829489000449 [TF:126:127:mmff 0.26 mfaktc_barrett128_F64_95gs]
found 1 factor for k*2^90+1 in k range: 119G to 120G (127-bit factors) [mmff 0.26 mfaktc_barrett128_F64_95gs]
F90 has a factor: 985016348367230226078056532654006730753 [TF:129:130:mmff 0.26 mfaktc_barrett140_F64_95gs]
found 1 factor for k*2^92+1 in k range: 198G to 199G (130-bit factors) [mmff 0.26 mfaktc_barrett140_F64_95gs]
F94 has a factor: 76459067246115642538831634131564386844673 [TF:135:136:mmff 0.26 mfaktc_barrett140_F96_127gs]
found 1 factor for k*2^97+1 in k range: 482G to 483G (136-bit factors) [mmff 0.26 mfaktc_barrett140_F96_127gs]
F96 has a factor: 8453027931784477309850388309101819121893377 [TF:142:143:mmff 0.26 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^101+1 in k range: 3334G to 3335G (143-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs]
F107 has a factor: 3346902437331832346018436558958369334886401 [TF:141:142:mmff 0.26 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^111+1 in k range: 1073741824 to 2147483647 (142-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs]
F116 has a factor: 4563438810603420826872624280490561141381005313 [TF:151:152:mmff 0.26 mfaktc_barrett152_F96_127gs]
found 1 factor for k*2^120+1 in k range: 3G to 4G (152-bit factors) [mmff 0.26 mfaktc_barrett152_F96_127gs]
F133 has a factor: 3836232386548105510567872577199319351015739156856833 [TF:171:172:mmff 0.26 mfaktc_barrett172_F128_159gs]
found 1 factor for k*2^135+1 in k range: 88000M to 88100M (172-bit factors) [mmff 0.26 mfaktc_barrett172_F128_159gs]
F146 has a factor: 13235038053749721162769301995307025251972223086886913 [TF:173:174:mmff 0.26 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^148+1 in k range: 33554432 to 67108863 (174-bit factors) [mmff 0.26 mfaktc_barrett183_F128_159gs]
F147 has a factor: 88894220732640180500173831441107513117330143465963521 [TF:175:176:mmff 0.26 mfaktc_barrett183_F128_159gs]
found 1 factor for k*2^149+1 in k range: 67108864 to 134217727 (176-bit factors) [mmff 0.26 mfaktc_barrett183_F128_159gs][/CODE]

RichD 2012-09-29 22:24

[QUOTE=lalera;313180]hi,
i have mmff v 0.25 for linux x64 and the k - range is missing in the results.txt
is this a bug?
an example:
no factor for k*2^47+1 in (96-bit factors) [mmff 0.25 mfaktc_barrett96_F32_63gs][/QUOTE]

I had the same thing in mmff 0.23. See [url="http://mersenneforum.org/showpost.php?p=313086&postcount=38"]this post[/url].

Sorry, but I'm not going to re-run my eight-day test. :-)

firejuggler 2012-09-30 03:04

Looking for a small FermatFactor range... ideally 8-12 hours long on a 560

edit: chose it
[code]
FermatFactor=60,200e12,210e12
FermatFactor=60,210e12,220e12
FermatFactor=60,220e12,230e12
FermatFactor=60,230e12,240e12
FermatFactor=60,240e12,250e12
FermatFactor=60,250e12,260e12
FermatFactor=60,260e12,270e12
FermatFactor=60,270e12,280e12
FermatFactor=60,280e12,290e12
FermatFactor=60,290e12,300e12
[/code]
-> FermatFactor=60,200e12,300e12

flashjh 2012-09-30 03:44

1 Attachment(s)
v.26 Windows 32-bit & 64-bit executables:

aketilander 2012-10-08 21:31

Upgrading Cuda 4.2 SDK
 
After upgrading the CUDA 4.2 SDK 64-bit to the latest version, my mmff 0.26 runs more than twice as fast.

tServo 2012-10-09 00:27

[QUOTE=aketilander;314027]After upgrading the CUDA 4.2 SDK 64-bit to the latest version, my mmff 0.26 runs more than twice as fast.[/QUOTE]

Did you upgrade the video driver as well? That's what would impact performance rather than the SDK, which is basically a bunch of demo programs.

Whoops! The SDK might have some needed DLLs.

aketilander 2012-10-09 19:04

[QUOTE=tServo;314038]Did you upgrade the video driver as well? That's what would impact performance rather than the SDK, which is basically a bunch of demo programs.

Whoops! The SDK might have some needed DLLs.[/QUOTE]

Sorry, maybe I was too quick to spread the happy news. mmff 0.26 runs for a while and then I get the error message: "ERROR: cudaGetLastError() returned 30: unknown error". When I restart the program everything runs just fine, BUT the speed is about half of what it was from the beginning.

This is not a new error; I had it before the upgrade as well.

I have a second computer with the same video card. The program runs at the higher speed without problems.

So, does anybody know what is wrong?

frmky 2012-10-10 19:09

A minor issue, but with gcc 4.6.3, I have to move $(LDFLAGS) to the end in the final link command in the Makefile:

[CODE]../mmff.exe : $(COBJS) $(CUOBJS)
$(LD) $^ -o $@ $(LDFLAGS) [/CODE]

aketilander 2012-10-11 19:34

[QUOTE=aketilander;314082]mmff 0.26 runs for a while and then I get the error message: "ERROR: cudaGetLastError() returned 30: unknown error". When I restart the program everything runs just fine, BUT the speed is about half of what it was from the beginning.[/QUOTE]

I took down the core clock speed of the video card a little bit, and the problem seems to have disappeared. The program seems to be stable now. If the problem reappears I will of course report it here.

Batalov 2012-10-12 05:50

1 Attachment(s)
The latest mmff-gfn-0.26.zip is attached. (Contains some latest minor changes and the 0.26 backbone.)

Jerry (flashjh), could you please build the Win32+Win64 binaries for each BASE (edit tf_gfn.h and delete all objects between each build) and email them to Xyzzy for organized storage? Thank you in advance.

ET_ 2012-10-12 09:58

[QUOTE=Batalov;314366]The latest mmff-gfn-0.26.zip is attached. (Contains some latest minor changes and the 0.26 backbone.)

Jerry (flashjh), could you please build the Win32+Win64 binaries for each BASE (edit tf_gfn.h and delete all objects between each build) and email them to Xyzzy for organized storage? Thank you in advance.[/QUOTE]

Thank you Serge.

Is a version for xGF, with every search bundled into it, in the works?

Luigi

Batalov 2012-10-12 10:17

[QUOTE=ET_;314379]Thank you Serge.

Is a version for xGF, with every search bundled into it, in the works?

Luigi[/QUOTE]
No, I haven't started on xGFN yet; I also have some doubts - we can run out of registers.
Over the weekend I will possibly try to build a toy one, using only 2[SUP]2[SUP]m[/SUP][/SUP] and 3[SUP]2[SUP]m[/SUP][/SUP] (which after linear combinations covers F, GF(3), GF(6), GF(12), and xGF(2,3), xGF(2,9), xGF(3,4), xGF(3,8)). Ughhh, it is going to be ugly. And slow, too!

flashjh 2012-10-12 13:36

Win32/64 binaries uploaded and email sent.

Jerry

aketilander 2012-10-20 13:14

Upper limit of mmff 0.26 TF MM127
 
I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.

Just out of curiosity, it would be interesting to know whether this is due to a limitation of my video card or of the program as such.
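A quick back-of-the-envelope check (a Python sketch, not mmff code) shows where the 1,152,921e12 figure comes from: candidate factors of MM127 have the form q = 2·k·M127 + 1, so the 188-bit kernel caps k near 2^60.

```python
M127 = 2**127 - 1
k_limit = 1152921 * 10**12        # the k limit aketilander quotes for barrett188
q = 2 * k_limit * M127 + 1        # candidate-factor form for MM127 = 2^M127 - 1
print(q.bit_length())             # 188 -- right at the kernel's ceiling
print(2**60)                      # 1152921504606846976, i.e. ~1,152,921e12
```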

Prime95 2012-10-20 14:37

[QUOTE=aketilander;315304]I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.

Just out of curiosity, it would be interesting to know whether this is due to a limitation of my video card or of the program as such.[/QUOTE]

Sounds like a program bug where I've miscalculated the upper limit of what the kernel can handle. At least the automatic QA caught the problem.

ATH 2012-10-20 20:15

[QUOTE=aketilander;315304]I have done some tests to see where the upper limit of mmff 0.26 is. If I look into the source code there is a kernel barrett188 which is supposed to cover TFs up to 188 bits (that is k=1,152,921e12 for MM127), but when I try to run the program with k's that large I get the error message "Exponentiation failure". Using smaller k's, the upper limit seems to be around 420,000e12.

Just out of curiosity, it would be interesting to know whether this is due to a limitation of my video card or of the program as such.[/QUOTE]

I tested with ranges of 1e10, and it seems the higher the k, the higher the risk of the error happening - but sometimes you don't get the error. The lowest I have got it at was k=280,000e12.

Here are 3 instances of the error with the -v 3 option:
[URL="http://www.hoegge.dk/mersenne/MM127error.txt"]MM127error.txt[/URL]

ATH 2012-10-22 14:57

I tried running a small range for the first time with mmff. So far I have only been troubleshooting, because I think the fan on my GTX 460 might break soon and because I can't really afford a higher power bill than I currently have (I have complained a lot about Danish power prices in other threads, so I'll spare the details here).

But to my surprise my small range finished 5x quicker than I expected, so it turns out I apparently never understood the "raw rate" output of mmff.

My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s ?

So I noticed each output line says "candidates 16.04G" and took roughly 43-46s, so that's where the "raw rate" of 347M/s - 368M/s comes from, but what is that measuring exactly? Is "candidates" the number of probable primes to test divisibility with?


[CODE]Starting trial factoring of k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors)
k_min = 1125899906842624
k_max = 1200000000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
3/4620 | 16.04G | 43.589s | 11h36m | 367.96M/s | 210485
.
.
4613/4620 | 16.04G | 46.216s | 0m00s | 347.04M/s | 210485
no factor for k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs]
tf(): total time spent: 12h 4m 0.423s[/CODE]

Dubslow 2012-10-22 18:41

[QUOTE=ATH;315507]
My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s ?

So I noticed each output line says "candidates 16.04G" and took roughly 43-46s, so that's where the "raw rate" of 347M/s - 368M/s comes from, but what is that measuring exactly? Is "candidates" the number of probable primes to test divisibility with?


[CODE]Starting trial factoring of k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors)
k_min = 1125899906842624
k_max = 1200000000000000
Using GPU kernel "mfaktc_barrett108_F32_63gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
3/4620 | 16.04G | 43.589s | 11h36m | 367.96M/s | 210485
.
.
4613/4620 | 16.04G | 46.216s | 0m00s | 347.04M/s | 210485
no factor for k*2^46+1 in k range: 1125899906842624 to 1200000000000000 (97-bit factors) [mmff 0.26 mfaktc_barrett108_F32_63gs]
tf(): total time spent: 12h 4m 0.423s[/CODE][/QUOTE]
The sieving eliminates k-candidates whose candidate factor q is divisible by small primes. The rest are then trial-divided into the various Fermat numbers; judging by your k-count ETA, it seems that only about a fifth of the potential q's escaped the small-prime sieve whole. The raw rate, then, is how fast the card is trial dividing the remaining q. (If you used the [URL="http://www.mersenneforum.org/showpost.php?p=304626&postcount=1809"]compile time option[/URL] to [URL="http://www.mersenneforum.org/showpost.php?p=307238&postcount=1845"]disable the sieve[/URL], then the card would trial-divide every candidate, and the run would probably be about as long as you originally estimated.)
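The "about a fifth survive" estimate can be reproduced with a toy CPU sieve (a Python 3.8+ sketch, nothing like mmff's GPU sieve: the 10^6 k-window, the 10^5 sieve bound, and the simplified class handling are all assumptions made here; mmff's real GPUSievePrimes bound is much higher, which pushes the surviving fraction further down toward 1/5):

```python
def primes_upto(n):
    s = bytearray([1]) * (n + 1)
    s[0] = s[1] = 0
    for i in range(2, int(n**0.5) + 1):
        if s[i]:
            s[i*i::i] = bytearray(len(s[i*i::i]))
    return [i for i in range(2, n + 1) if s[i]]

n, K, B = 46, 10**6, 10**5      # q = k*2^46+1 as in ATH's run; toy window and bound
class_ok = bytearray(k & 1 for k in range(K))   # even k are dropped up front
alive = bytearray([1]) * K
for p in primes_upto(B):
    if p == 2:
        continue
    k0 = (-pow(pow(2, n, p), -1, p)) % p        # k0*2^n + 1 == 0 (mod p)
    target = class_ok if p <= 11 else alive     # 3,5,7,11 belong to the classes test
    for k in range(k0, K, p):
        target[k] = 0
passed = [k for k in range(K) if class_ok[k]]   # k's that survive the classes test
frac = sum(alive[k] for k in passed) / len(passed)
print(round(frac, 3))    # roughly 0.23-0.24 with this shallow bound
```

Sieving deeper (mmff's 210485 primes reach into the millions) trims this toward the ~1/5 that matches ATH's observed 5x speedup over his raw-rate estimate.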

Prime95 2012-10-22 19:17

[QUOTE=ATH;315328]I tested with ranges of 1e10, and it seems the higher the k, the higher the risk of the error happening - but sometimes you don't get the error. The lowest I have got it at was k=280,000e12.

Here is 3 instances of the error with the -v 3 option:
[URL="http://www.hoegge.dk/mersenne/MM127error.txt"]MM127error.txt[/URL][/QUOTE]


That was an easy fix. Note that in your output mmff is testing 187-bit factors with the barrett185 kernel. I fixed the typo so that mmff uses the 188-bit kernel, fixed a typo in the never-before-tested barrett188 kernel, and it's good to go. I'll release v. 0.27 after looking at Batalov's work. In the meantime, do not look for factors of MM127 that are more than 185 bits.

Prime95 2012-10-22 19:24

[QUOTE=ATH;315507]
My "raw rate" was around 368 M/s (and later dropped to 347 M/s), so I figured this range would take: (1200e12-2^50)/368e6 = 201e3 sec = 56h. But it finished after 12h4min! So my k-rate was (1200e12-2^50)/43440sec = 1706 M/s ?[/QUOTE]

Raw rate refers to the number of k's that passed the "classes test". Note that you only test a small number of the 4620 classes. Even classes, and classes where factors are divisible by 3, 5, 7, or 11, are eliminated before doing any GPU sieving.

Note that in mfaktc the column is called avg. rate and refers to the number of k's after the classes test AND after the CPU sieving.
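The classes test pins down ATH's missing factor of ~5. A short sketch (assuming q = k·2^46+1 as in ATH's run; mmff's internal class numbering may differ, and "even k reappear as odd k at a higher exponent" is my gloss on why they can be skipped):

```python
n = 46                              # ATH's run: candidate factors q = k*2^46 + 1
survivors = 0
for c in range(4620):               # 4620 = 4 * 3 * 5 * 7 * 11
    if c % 2 == 0:
        continue                    # even k are covered by the (k/2)*2^(n+1) search
    if all((c * pow(2, n, p) + 1) % p for p in (3, 5, 7, 11)):
        survivors += 1              # q is not divisible by 3, 5, 7 or 11
print(survivors, 4620 / survivors)  # 960 classes survive: a 4.8125x reduction
```

That reduction is consistent with ATH's numbers: 368 M/s raw rate × 4.8125 ≈ 1770 M/s, close to the ~1706 M/s effective k-rate he measured.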

aketilander 2012-10-22 19:27

[QUOTE=Prime95;315528]That was an easy fix.[/QUOTE]

Excellent! Thank you for taking your time doing it!

f11ksx 2012-10-23 04:54

Factor
 
Hello everybody,

i have this result with MMFF:
F39 has a factor: 304649306542939328584089601
[TF:87:88:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^44+1 in k range: 10000000000000 to 17592186044415 (88-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]


That is 17 317 308 137 475*2^44+1

but Fermat.exe find no factor.
Any idea ?

axn 2012-10-23 05:07

[QUOTE=f11ksx;315628]Hello everybody,

i have this result with MMFF:
F39 has a factor: 304649306542939328584089601
[TF:87:88:mmff 0.26 mfaktc_barrett89_F32_63gs]
found 1 factor for k*2^44+1 in k range: 10000000000000 to 17592186044415 (88-bit factors) [mmff 0.26 mfaktc_barrett89_F32_63gs]


That is 17 317 308 137 475*2^44+1

but Fermat.exe find no factor.
Any idea ?[/QUOTE]

It is a composite factor. 304649306542939328584089601 = (3*2^41+1)*(21*2^41+1)

3*2^41+1 divides F38.
21*2^41+1 divides F39.

Hmmm... How come the composite factor divides F39?
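The decomposition above is easy to verify with Python's built-in modular exponentiation (a quick check, using the Fermat-number memberships stated in this post: p divides F_m exactly when 2^(2^m) ≡ -1 mod p):

```python
p1 = 3 * 2**41 + 1                      # divides F38
p2 = 21 * 2**41 + 1                     # divides F39
N = 304649306542939328584089601         # mmff's reported "factor" of F39
print(p1 * p2 == N)                     # True: the reported factor is composite
print(pow(2, 2**38, p1) == p1 - 1)      # True: p1 | 2^(2^38) + 1 = F38
print(pow(2, 2**39, p2) == p2 - 1)      # True: p2 | 2^(2^39) + 1 = F39
```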

Batalov 2012-10-23 05:17

Nah, it's OK; I've seen this with mmff-gfn.
The exit criterion from factoring (repeated squaring) is a residue of either -1 or 1 (the latter probably to catch cases where the -1 went unnoticed). Once the residue is 1, it obviously stays 1 under further squaring forever. A composite factor (a product of two prime factors that divide two different Fm values) can get through like that. Because of the implementation details, pfgw -gxo will get equally confused (try it!), but it will produce a less misleading answer.

Actually, pfgw is not fooled by this number in -go mode, only in -gxo:

[CODE]> pfgw -f -gxo -q"17317308137475*2^44+1"
PFGW Version 3.4.6.64BIT.20110307.x86_Dev [GWNUM 26.5]

A GF Factor was found, but the base of 12 may not be correct.
17317308137475*2^44+1 is a Factor of xGF(12,3,2)!!!! (0.000000 seconds)
A GF Factor was found, but the base of 12 may not be correct.
17317308137475*2^44+1 is a Factor of xGF(12,4,3)!!!! (0.000000 seconds)
A GF Factor was found, but the base of 12 may not be correct.
17317308137475*2^44+1 is a Factor of xGF(12,8,3)!!!! (0.000000 seconds)
A GF Factor was found, but the base of 12 may not be correct.
17317308137475*2^44+1 is a Factor of xGF(12,9,2)!!!! (0.000000 seconds)
A GF Factor was found, but the base of 12 may not be correct.
17317308137475*2^44+1 is a Factor of xGF(12,9,8)!!!! (0.000000 seconds)
GFN testing completed
[/CODE]
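Batalov's explanation can be seen numerically (a Python sketch of the residue chain; the "missed -1" interpretation of mmff's exit test is his reading, not confirmed source behaviour): mod the composite N the squaring chain never hits -1, but it jumps straight to 1 at m = 40, so an exit test that treats "residue 1 at step m+1" as "missed a -1 at step m" back-attributes a factor to F39.

```python
N = (3 * 2**41 + 1) * (21 * 2**41 + 1)   # the composite "factor" from f11ksx's report
r39 = pow(2, 2**39, N)                   # residue after 39 squarings of 2 mod N
print(r39 == N - 1)                      # False: N does not actually divide F39
print(r39 == 1)                          # False: the chain has not collapsed yet
print(pow(2, 2**40, N) == 1)             # True: residue jumps straight to 1 at m=40
```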

ET_ 2012-10-23 07:45

[QUOTE=axn;315630]It is a composite factor. 304649306542939328584089601 = (3*2^41+1)*(21*2^41+1)

3*2^41+1 divides F38.
21*2^41+1 divides F39.

Hmmm... How come the composite factor divides F39?[/QUOTE]

May I suggest the insertion of a test to check if the factor is composite on mmff 0.27?

Luigi

axn 2012-10-23 07:45

[QUOTE=Batalov;315632]Nah, it's ok; I've seen this with mmff-gfn.
The exit criterion from factoring (repeated squaring) is either -1 modulus or 1 (probably in case that -1 went unnoticed). [/QUOTE]
Gotcha.

Batalov 2012-10-23 08:04

[QUOTE=ET_;315648]May I suggest the insertion of a test to check if the factor is composite on mmff 0.27?

Luigi[/QUOTE]
Checking for compositeness only (without factoring) is probably fairly cheap to add even using dumnums. Factoring (when the binary representation of k appears very sparse) is another possibility:
(s*2[SUP]m[/SUP]+1)*(t*2[SUP]m[/SUP]+1) = ([COLOR=blue]s*t*2[SUP]m[/SUP] + (s+t)[/COLOR])*2[SUP]m[/SUP]+1 = k*2[SUP]m[/SUP]+1
would be pretty easy to spot (with s and t small, s+t << 2[SUP]m[/SUP], though not necessarily both odd)

Maybe we could add this to mmff 0.28? It is easy to do externally for now.

P.S. Cheap demonstration for this particular k:
> dc
2o 17317308137475p
[COLOR=blue]111111[COLOR=black]000000000000000000000000000000000000[/COLOR]11[/COLOR]
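Batalov's identity makes composite candidates of this shape cheap to detect externally. A rough sketch (an editorial illustration in Python, not part of mmff; the bounds max_s and max_j are arbitrary, and only odd s is tried here, so per Batalov's remark that s and t need not both be odd this is not exhaustive):

```python
# Try to split a reported factor N as (s*2^j+1)*(t*2^j+1), using
#   (s*2^j+1)*(t*2^j+1) = (s*t*2^j + (s+t))*2^j + 1
# by trial-dividing N by small Fermat-style numbers s*2^j+1.
def split_fermat_style(N, max_s=1000, max_j=128):
    """Return a pair of cofactors (s*2^j+1, N // (s*2^j+1)), or None."""
    for j in range(max_j, 1, -1):
        for s in range(1, max_s, 2):       # odd s only (see note above)
            d = s * (1 << j) + 1
            if d * d > N:                  # past sqrt(N): no cofactor left
                break
            if N % d == 0:
                return d, N // d
    return None

# The composite reported for F39 splits into its two prime parts:
print(split_fermat_style(304649306542939328584089601))
# -> (6597069766657, 46179488366593), i.e. (3*2^41+1, 21*2^41+1)
```

For a genuine (prime) single factor the function returns None, so a wrapper like this could flag suspicious results before they are reported.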

RichD 2012-10-23 18:23

No Checkpoint/Restart ??
 
I've had to restart my mmff-0.26 run on MMFactor=127 but it doesn't appear to look for a .ckp file even though I have checkpoint turned on in the .ini file.

Is anyone else having this problem?

RichD 2012-10-24 01:34

On Linux
 
[QUOTE=RichD;315694]I've had to restart my mmff-0.26 run on MMFactor=127 but it doesn't appear to look for a .ckp file even though I have checkpoint turned on in the .ini file.[/QUOTE]

Luckily I spotted this as soon as it started and quickly renamed the .ckp file before quitting. Looking back through this thread I see I can use the -nocheck switch. It seems to be continuing OK. I first thought I may have lost 5-6 days...

aketilander 2012-11-03 09:01

Checksum or residue?
 
The TFs covered by mmff are so large that it will never be possible to do an LL test to check for compositeness. Therefore I would guess that someone will sooner or later want to do double checks, at least where the double Mersennes are concerned. Would it be possible to add a checksum or a residue to mmff? Maybe a summarized residue of all the TFs done in a region?

I don't think that mmff has a self-test at the beginning, so sometimes, when you don't find factors, you start to distrust your system. Did I OC too much? Are there hardware errors? In that case it would be nice to be able to DC a couple of regions just to make sure that your system works OK.

So, would it be possible and meaningful to add some kind of checksum or residue to mmff?

pinhodecarlos 2012-11-03 14:05

I have a question: will the mmff client detect the number of GPUs available, or should one mmff client be started for each GPU card installed?

Thank you in advance,

Carlos

flashjh 2012-11-03 15:01

[QUOTE=pinhodecarlos;316867]I have a question: will the mmff client detect the number of GPUs available, or should one mmff client be started for each GPU card installed?

Thank you in advance,

Carlos[/QUOTE]

You can run mmff on multiple GPUs: start one instance per card and select the device with -d 0, -d 1, etc.

Prime95 2012-11-03 16:21

[QUOTE=aketilander;316853]Would it be possible to add a checksum or a residue to mmff? [/QUOTE]

I don't see how to do this. Even if the second run used the same GPUSievePrimes value, the two runs would test a slightly different set of factors. This is because there are race conditions in the GPU sieve (it is faster to use non-atomic bit operations).

akruppa 2012-11-07 09:13

Is there any way to get statistics on the error rate of the trial division? E.g., what approximate fraction of factors are missed due to race conditions, or due to consumer-grade cards skipping a beat, etc.?

Prime95 2012-11-07 16:09

[QUOTE=akruppa;317379]Is there any way to get statistics on the error rate of the trial division? E.g., what approximate fraction of factors are missed due to race conditions, or due to consumer-grade cards skipping a beat, etc.?[/QUOTE]

I have no statistics on the error rate. The program does check two of the trial divisions in each class. So, a couple of thousand are double-checked in each bit-level. Fortunately, there have been few reports of validation failures.

However, the trial factors that are double-checked always come from thread id zero (of 256). If the CUDA scheduler always assigns this thread_id to specific CUDA cores, then a large number of CUDA cores are not getting any double-checking.

As to race conditions, there shouldn't be any problems. The program intentionally has race conditions, but these only cause a few extra composite factors to be tested.

aketilander 2012-11-16 08:03

I am wondering about the performance of different videocards. If I compare:

[URL]http://www.videocardbenchmark.net/high_end_gpus.html[/URL]

with James'

[URL]http://www.mersenne.ca/mfaktc.php[/URL]

there seem to be large differences. Let's take the GTX 680 as an example. According to the first site it's at the top of the list, but according to James' it's not so good. I can see that James' site lists something called "Compute 3.0". I am not sure if this refers to the speed of the PCI-e bus or if it is something else. What I am really wondering is whether some videocards have unused potential which may be better used in the future, when mmff is better adapted to "Compute 3.0" or maybe something else.

axn 2012-11-16 08:11

[QUOTE=aketilander;318528]I am wondering about the performance of different videocards. If I compare:

[URL]http://www.videocardbenchmark.net/high_end_gpus.html[/URL]

with James'

[URL]http://www.mersenne.ca/mfaktc.php[/URL]

there seem to be large differences.[/QUOTE]

For graphics performance (i.e. Games and stuff), 680 is great. For compute stuff (i.e. CUDA => mmff), not so much. I mean, it is still good, but not great.

henryzz 2012-11-16 11:46

Basically the 680 is more gaming-oriented than the 580. Nvidia have just announced the next version of their Tesla/server GPUs, which is computation-oriented. Probably the next generation of home cards (7xx) will be based on the same technology. I would guess that 8xx will be mainly gaming again, etc.

MattcAnderson 2012-11-17 16:24

Hi All,
I finally got mmff going on my GPU. The error I had was that cudart64_42_9.dll was needed. From post #185 of this thread, I found mmff v 23 with CUDA 5.0, which seems to work. This solved my problem, but the Windows 64 download from doublemersenne.org did not; I think this is because it has CUDA 4.2. I guess my system requires CUDA 5.0 and won't work with CUDA 4.2.

Regards,
Matt

flashjh 2012-11-17 19:33

[QUOTE=MattcAnderson;318727]Hi All,
I finally got mmff going on my GPU. The error I had was that cudart64_42_9.dll was needed. From post #185 of this thread, I found mmff v 23 with CUDA 5.0, which seems to work. This solved my problem, but the Windows 64 download from doublemersenne.org did not; I think this is because it has CUDA 4.2. I guess my system requires CUDA 5.0 and won't work with CUDA 4.2.

Regards,
Matt[/QUOTE]

CUDA 4.2 version requires the 42 dll files [URL="http://sourceforge.net/projects/cudalucas/files/CUDA%20Libs/CUDA-4.2-Libs-Windows.7z/download"]here[/URL]. Just put them in the mmff directory.

ET_ 2012-12-04 17:54

mmff doesn't compile...
 
1 Attachment(s)
Ubuntu 11.10 - Linux64, CUDA drivers 5.0, CUDA runtime version 4.1

[code]
luigi@luigi-ubuntu:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.6.1/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --disable-werror --with-arch-32=i686 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)
[/code]


As the mmff 0.26 binary is compiled with CUDA runtime version 4.2, I couldn't run it out of the box, and tried recompiling it, but...

I had problems trying to compile mmff :sad: It seems that some classical C math functions don't link correctly, even after moving the -lm switch to the beginning of the line in the Makefile.

In any case, I am attaching the compile log, hoping that someone of good will will cast an eye over it...

Thank you anyway.

Luigi

akruppa 2012-12-04 18:07

The GNU linker by default processes object files and libraries in the order they are specified on the command line, remembers which unresolved references remain from each file it processes, and tries to resolve those references with symbols in object files and libraries that [I]follow[/I]. This means putting -lm at the start of the command line has no effect: there are no unsatisfied references to math library functions at that point, and the unresolved references to libm from later files never get resolved because no -lm follows them.

Put -lm at the end, and in general put an object/library file which contains functions that another file Y needs [I]after[/I] file Y. If you have circular dependencies, you can use a command line switch (I forget which) to make GNU ld re-scan the objects on the command line until all symbols are resolved, but you rarely need that.

ET_ 2012-12-04 18:26

[QUOTE=akruppa;320470]The GNU linker by default processes object files and libraries in the order they are specified on the command line, remembers which unresolved references remain from each file it processes, and tries to resolve those references with symbols in object files and libraries that [I]follow[/I]. This means putting -lm at the start of the command line has no effect: there are no unsatisfied references to math library functions at that point, and the unresolved references to libm from later files never get resolved because no -lm follows them.

Put -lm at the end, and in general put an object/library file which contains functions that another file Y needs [I]after[/I] file Y. If you have circular dependencies, you can use a command line switch (I forget which) to make GNU ld re-scan the objects on the command line until all symbols are resolved, but you rarely need that.[/QUOTE]

Thank you Alex.

Update: I tried a make clean followed by a make all (I forgot the "all" thingie... :redface: )

result:
[code]
nvcc fatal : Unsupported gpu architecture 'compute_30'
[/code]

So I just deleted the portion that said

[code]
--generate-code arch=compute_30,code=sm_30
[/code]

as I successfully did for mfaktc.

result:
[code]
#error -- unsupported GNU version! gcc 4.6 and up are not supported!
[/code]

Question:

Does runtime version 5.0 correct this error, or do I have to try some fancy link and download an older version of GCC?

Luigi

Batalov 2012-12-04 18:30

[QUOTE=ET_;320474]Question:

Does runtime version 5.0 correct this error, or do I have to try some fancy link and download an older version of GCC?

Luigi[/QUOTE]
Good question. The answer is no.

Find this file and edit __GNUC_MINOR__ > 6 into __GNUC_MINOR__ > 9
[CODE]/usr/local/cuda/include/host_config.h:
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)
[/CODE]

ET_ 2012-12-04 18:44

1 Attachment(s)
[QUOTE=Batalov;320475]Good question. The answer is no.

Find this file and edit __GNUC_MINOR__ > 6 into __GNUC_MINOR__ > 9
[CODE]/usr/local/cuda/include/host_config.h:
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 9)
[/CODE][/QUOTE]

It went further and compiled the ptx extras (there is a __f variable that is set but unused), but it still complains about the math functions.

Log appended.

There is a request to restart the system, maybe a pending update; I will tell you if anything changes after the reboot.

Thank you Serge.

Luigi

akruppa 2012-12-04 18:50

[code]
gcc -fPIC -L/usr/local/cuda/lib64/ [B]-lcudart -lm[/B] timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -o ../mmff.exe
[/code]

ET_ 2012-12-04 19:22

[QUOTE=akruppa;320480][code]
gcc -fPIC -L/usr/local/cuda/lib64/ [B]-lcudart -lm[/B] timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -o ../mmff.exe
[/code][/QUOTE]

That's the line that comes from George's Makefile.

Should I modify it?

Luigi

frmky 2012-12-04 19:38

Newer versions of gcc no longer support putting the libraries at the beginning. Try

[CODE]gcc -fPIC -L/usr/local/cuda/lib64/ -o ../mmff.exe timer.o parse.o read_config.o mfaktc.o checkpoint.o signal_handler.o output.o tf_barrett96_gs.o gpusieve.o -lcudart -lm[/CODE]

akruppa 2012-12-04 19:41

I just checked what I did on our work machine to make it compile:
[CODE]
kruppaal@quiche:~/mmff$ diff src/Makefile src.my/Makefile
14c14
< NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 --generate-code arch=compute_30,code=sm_30
---
> NVCCFLAGS += --generate-code arch=compute_20,code=sm_20 # --generate-code arch=compute_30,code=sm_30
16c16
< NVCCFLAGS += --compiler-options=-Wall
---
> NVCCFLAGS += --compiler-options="-Wall -B/usr/lib/gcc/x86_64-linux-gnu/4.5.3/"
20c20,21
< LDFLAGS = -fPIC $(CUDA_LIB) -lcudart -lm
---
> LDFLAGS = -fPIC $(CUDA_LIB)
> MMFFLIB = -lcudart -lm
35c36
< $(LD) $(LDFLAGS) $^ -o $@
---
> $(LD) $(LDFLAGS) $^ $(MMFFLIB) -o $@
[/CODE]

The -B/usr/lib/gcc/x86_64-linux-gnu/4.5.3/ is an override because 4.6 is the default gcc, but 4.5.3 is installed as well; with this option, nvcc uses the 4.5.3 version and stops complaining.

ET_ 2012-12-04 19:58

Thank you Greg, your line worked great! :bow:

Alex, now I found an old Makefile from the project GPU-ECM:

[code]
# CUDA does not support gcc >= 4.6
# this can be useful if your default gcc is >= 4.6
# if not just comment the following two lines.
#CC_BIN:=/tmp/gcc45
#USE_THIS_CC:=--compiler-bindir $(CC_BIN)
[/code]

I totally forgot about it, but you were right: you (and I) had a lower version of GCC at that time. Thank you for your last message; you gave me the clue I needed to remember where I saw that message.

It's time to reserve some exponents... My mmff is already doing its selftest.

Thanks again! :hello:

Luigi

Batalov 2012-12-04 22:52

CUDA is not [I]supported[/I] with GCC >= 4.6, but that doesn't mean it doesn't work. It means that if you write to them with a bug report, they will not take it. I use gcc version 4.7.1 20120723 [gcc-4_7-branch revision 189773] (SUSE Linux), and one odd thing I had to add was -lstdc++ (where something is expected to be defined); the linker spat out this recommendation without me even asking. Weird, but the binaries work.

(Of course, something might not work, if they rely on some optimizations in some old way. More likely, they just don't want to be bothered.)

Prime95 2012-12-13 00:25

1 Attachment(s)
Minor update -- v 0.27:

What's new:

1) Bug in testing 187-bit factors of MM127 fixed.
2) -lm makefile bug fixed (I hope).
3) With Batalov's help, the next set of 32 n values in k*2^n+1 Fermat factor testing is available.

As always, previous save files will not work with 0.27 unless the -nocheck argument is used.

My dual boot box is running Windows right now, so I don't have a Linux executable at the moment.

Batalov 2012-12-13 00:54

Under linux, I get some errors in or around
[CODE]gpusieve.cu(1991): error: expected a declaration

that's around the innocuous-looking
[B]else[/B] {
if (gpusieve_initialized) return;
}
[/CODE]
I'll diff to my branch when I get home. (it's ugly to figure out in vim over ssh.)

Prime95 2012-12-13 02:03

[QUOTE=Batalov;321484]Under linux, I get some errors [/QUOTE]

Try again. I haven't compiled this code in well over a month; a minor tweak had a typo.

Batalov 2012-12-13 02:31

Looks good.
Test passed:
[CODE]got assignment: k*2^41+1, k range 2864929972000000 to 2864929973000000 (93-bit factors)
Starting trial factoring of k*2^41+1 in k range: 2864929972M to 2864929973M (93-bit factors)
k_min = 2864929972000000
k_max = 2864929973000000
Using GPU kernel "mfaktc_barrett96_F32_63gs"
class | candidates | time | ETA | raw rate | SievePrimes | CPU wait
2471/4620 | 0.01M | 0.275s | 2m03s | 0.03M/s | 210485
F39 has a factor: 6300047635658008393597059073
[/CODE]


All times are UTC. The time now is 00:40.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.