Half a million GHz-days?
1 Attachment(s)
I was just wondering if getting more than half a million GHz-days credit for one result makes sense or if something is completely broken here.
See attached screenshot: |
I have been using GMP-ECM to do stage 2 for P−1 testing, with stage 1 done by Prime95/mprime. In a nutshell, GMP-ECM is much more efficient for stage 2 for very small exponents; on the other hand it is impractical for larger exponents and entirely unusable for the 80M range where we do most of the P−1 testing these days.
In the past few days I found additional factors for M3617 and M4957 recently. The factors are 40 digits and 48 digits, respectively. I used B2 = 10[SUP]17[/SUP], which is orders of magnitude higher than the B2 limits achievable with Prime95/mprime. Incidentally this requires large amounts of memory to do efficiently, I used 200 to 250 GB for these. I manually reported the results and PrimeNet assigned a ridiculously huge amount of credit. This is similar to what happens when people use GPU-based programs for trial factoring: PrimeNet assigns credit as if Prime95/mprime had been used, instead of the actual more efficient program, leading to unrealistic values. This leads to the [URL="https://www.mersenne.org/report_top_500/"]overall top producer[/URL] rankings to be unfairly skewed in favor of users with a high percentage of TF relative to the other work types, but we tend to shrug and live with it. I consider GHz-days to be pretty meaningless anyway, so a lot of the crunching I do these days is uncredited anyway (e.g., all unsuccessful P−1 testing of exponents with already-known factors, and PRP testing). |
[QUOTE=GP2;452602]I consider GHz-days to be pretty meaningless anyway ...[/QUOTE]This I agree with. I long ago abandoned my accounts on primenet and now just do "anonymous" work.
If someone gets 500000 GHzDays, or whatever GHzDays, then that is just fine with me. Give them more if it encourages them more. It's all just numbers in a database anyway. |
[QUOTE=GP2;452602]This is similar to what happens when people use GPU-based programs for trial factoring: PrimeNet assigns credit as if Prime95/mprime had been used, instead of the actual more efficient program, leading to unrealistic values. [/QUOTE]
This is not at all similar to the CPU TF/GPU TF situation. GPU TF uses the same algorithm as CPU TF; in that sense, GPU TF is no more efficient than CPU TF. However, if we think of GPUs as faster CPUs, then the speedup we get from GPUs doing TF (over CPUs) is much higher than the speedup from GPUs doing LL. There is nothing unrealistic about the GHz-Days values that GPUs get for TF. |
[QUOTE=axn;452613]...
There is nothing unrealistic about the GHz-Days values that GPUs get for TF.[/QUOTE]Oh yes there is. If one goes for GHz-days, electricity will be spent only on TF on GPUs; no LL testing or P-1 factoring will be done. But there will come a time when GPUs will have to be used for LL testing. (Just try to prove the 13th Mersenne prime by TF alone.) Jacob |
[QUOTE=S485122;452621]There will come a time where GPU's will have to be used on LL testing. (Just try to prove the 13th Mersenne prime by TF alone.)[/QUOTE]
At the end of the day, GPUs are currently best at TF'ing because of their *many* simple processors without a lot of precision, while CPUs are best at LL'ing because they have a much wider range of instructions and registers with DP. And while TF'ing is easily parallelized, the LL test must run serially. I don't think anyone really cares that much about the credits. When quantum computers hit retail, their owners will laugh in our general direction.... |
One might want to cap GHz-days credit at "GHz-days needed to eliminate this candidate by LL". The latter is certainly not a half a million GHz-days.
It is certainly fine to factor Mersenne numbers, but this is more like also giving credit to people who do other Cunningham-list-type research. For GIMPS (as in "Great Internet Mersenne [B]Prime [/B]Search"), factoring an already-factored candidate is arguably meaningless. Remember the times when credit for a useless LL test was subtracted after a reasonably small factor was found? There was logic behind that, in the paradigm that GHz-days were an attempt to objectively measure throughput. If GHz-days are no longer in that paradigm, and the new paradigm is that GHz-days are "wages" for work done, then there are corollaries: you could change these wages to steer users toward more wanted work (higher wages) or "pay" less for equal but less wanted work. (PrimeGrid clearly follows this line of thought and gives arbitrary markups to different classes of tasks; sometimes these markups are added, sometimes removed.) The two paradigms are not well compatible. |
[QUOTE=Batalov;452626]One might want to cap GHz-days credit at "GHz-days needed to eliminate this candidate by LL".[/QUOTE]
I wouldn't object to that, or some other form of cap, even a zero cap. It could be applied retroactively. If a factor is found by P−1, I do want to report that it was found by P−1 and the bounds that were used, because that is useful information for posterity, so I format the manual results text accordingly. The wacky enormous credit is just a side effect of that. If the exact same factor was found by ECM instead, the credits would be very much smaller. You are right that there is a certain Cunningham-project "factor philately" aspect to it which is tangential to the goal of finding new Mersenne primes, and credit should not be a motivation. |
The real problem here is that a factor found by ECM was reported as found by P-1. It has nothing to do with the credit system, or with the CPU, GPU, XPU or other hardware used.
It is NORMAL that people with better hardware get better credit; it is their money they spent to buy that hardware. This problem occurred many times in the past, including when CPUs were the only "actors" on the stage: very small factors (around 70 bits) were found by TF in minutes but reported as P-1, earning tons of P-1 credit, because they were not smooth (had a big k) and P-1 would have taken years to find them. Or vice versa: a very smooth 85-bit factor was found by P-1 in minutes but reported as a TF finding, earning tons of TF credit, because TF would have taken ages to find an 85-bit factor (on CPU) for a Mersenne number with a small exponent. Nothing new under the sun; we had some guys doing this as a daily job. They got bored and left after a while, or learned to cope.

The real solution would be to allow people to report their findings in the right way. Of course, capping the credit is not a bad idea either.

I have said repeatedly in the past: finding a factor is cool, but let's not forget we are trying to find PRIMES here. The main goal of the project is to find primes. Factoring numbers already known to be composite does not help the project at all; on the contrary, it takes resources away from it, resources which could be used to do useful work. But again, everybody can do whatever work type he/she likes and whatever makes them happy. You cannot force me to do a work type other than the one I really want to do, even if it is futile, but it makes me feel important. My money, my hardware, my electricity, my time, my blah-blah-blah (I have a lot of this!)...

And to avoid further arguments: yes, I am the credit whore. :razz: No matter how altruistic some of you want to appear, half of you would pack up your toys and go home if the credits were removed. Don't tell me... Maybe not you personally, but we would lose half of the contributors for sure. Human nature... |
[QUOTE=LaurV;452664]The real problem here is that a factor found by ECM was reported as found by P-1.[/QUOTE]
No, the recent factors for M3617 and M4957 really were found by P−1. Or are you talking about some other past occurrence? |
[QUOTE=LaurV;452664]... half of you will pack their toys and go back home if the credits were removed.[/QUOTE]Another way to look at it is that some people prefer to join the "anonymous" team and become part of the computing awesomeness that is currently at the top of the list. :showoff:
|
[QUOTE=LaurV;452664]Factoring numbers already known to be composite does not help the project at all, contrarily, it takes resources away from it, resources which could be used to do useful work.[/QUOTE]
Sometimes it does help the project a bit. A factor is always a superior result to a residue. It's hard to verify an LL residue but trivial to verify a factor. There is a tiny chance that a false "double-checked" LL residue could somehow sneak into the database, but with factors we can be 100% certain. |
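The "trivial to verify" claim is literal: one modular exponentiation settles whether a reported factor is genuine. As a sketch, using the 48-digit P−1 factor of M4957 reported elsewhere in this thread:

```python
# If q divides 2^p - 1, then 2^p mod q == 1 -- so verifying a reported
# Mersenne factor is a single pow() call, versus the full independent
# double-check an LL residue needs. The factor below is the 48-digit
# P-1 find for M4957 quoted in this thread.
p = 4957
q = 246983850981415853695555800914746071630106768073
is_factor = pow(2, p, q) == 1
print(is_factor)
```

Any divisor of 2[SUP]p[/SUP]−1 (prime or composite) passes this test, which is why a factor in the database can be rechecked by anyone in microseconds.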
If anyone is curious about the factor of M4957 that was found recently using P−1, here is how it was done. This was a 48-digit factor, the fifth known factor for this exponent.
First I used mprime for stage 1, with B1=100000000000 (10[SUP]11[/SUP]) and B2=B1, in order to create the 760-byte binary file m0004957. I am unable to include this file as an attachment for some reason (the error message is "Invalid File"), but below is the output of od -v -A d -t x1 m0004957 on Linux: [CODE] 0000000 4b 39 7a 31 01 00 00 00 00 00 00 00 00 00 f0 3f 0000016 02 00 00 00 5d 13 00 00 ff ff ff ff 53 31 00 00 0000032 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0 3f 0000048 e8 1f a0 dd 00 00 00 00 00 e8 76 48 17 00 00 00 0000064 00 e8 76 48 17 00 00 00 00 e8 76 48 17 00 00 00 0000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000096 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000112 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0000128 00 00 00 00 00 00 00 00 9b 00 00 00 14 03 a1 ce 0000144 9f 88 52 9a c0 f9 98 02 86 97 54 5e ec 53 30 70 0000160 3c 03 40 10 30 7d f7 3c 06 fc e0 59 0f 5b b9 fe 0000176 56 b8 19 11 5e ce 1f 9e ac 93 28 5e 42 e3 2e 44 0000192 8f c0 48 16 52 9c 94 9b 49 4a 67 43 a5 67 2e 42 0000208 83 82 d2 d0 49 f9 49 b6 ee c0 c4 fb b6 8b 40 24 0000224 23 88 37 48 8a 1e a9 38 4c e0 91 d1 8e 2a 51 da 0000240 93 af f1 70 75 97 0f 7d 64 0a 28 53 1c 6d 46 73 0000256 ef 94 a2 97 9b 83 02 64 96 05 07 58 30 82 99 6d 0000272 6c 6f 93 0d 9b e3 c9 aa 39 6d d7 0a e4 20 0a 12 0000288 bd 8f 3c ee fb 62 06 b3 57 9e ff 15 fb b0 f9 e2 0000304 75 15 3a 97 b3 dc 89 5b 04 52 26 8c fa d7 c1 94 0000320 14 79 04 9d ef 45 80 8c 4b 7e 8c 41 3d 20 98 2d 0000336 80 cc f2 46 6c 20 e8 a8 6a 95 3e 7c 7e bf 6c 7f 0000352 ef a8 af 79 d5 70 20 62 e8 8a 92 7b 44 4f b9 75 0000368 4c 2c bc 81 33 6c 74 47 75 c3 cf f3 e5 ca 0a 4e 0000384 24 f2 27 b1 f0 5c 22 10 cc 46 71 b5 56 80 c2 09 0000400 02 43 78 35 81 d7 57 23 1a 5e 00 38 25 59 34 5b 0000416 f4 5b d3 57 7e 2d c6 de 14 8d 87 a4 e2 b3 c9 68 0000432 e1 a2 89 8d fa 55 34 81 69 3c 9f 82 00 26 0e c4 0000448 71 30 59 05 fa 0d 12 d2 a9 e5 72 23 05 82 b0 8d 0000464 1a 0f 4d d7 ac bd 89 63 09 44 3d 5e b7 18 5a 2f 0000480 
c8 18 e7 a1 51 85 e1 0e 3b 33 d9 1e 87 52 0b 6f 0000496 50 e2 0c 7d c4 da d1 1a 98 a9 80 53 ae b1 8f 18 0000512 c4 bb 23 ba 21 31 b5 52 13 f6 b4 70 1a 9b 33 9f 0000528 3f ce 84 5d 5c 5e 74 7b 47 70 b1 e4 1e 3a b4 e2 0000544 cb c9 c1 b2 46 81 bb c1 ea b4 d6 8e fa 9a f0 3d 0000560 1b 3c e4 ac 92 47 d2 22 8f 60 77 bc 1b af 93 30 0000576 e3 9c 6b b9 69 54 c5 e5 c1 6e d2 c6 d3 5d 97 fb 0000592 b9 0f 76 35 0e 77 52 0b 7a 8f a8 a7 eb 3f 93 39 0000608 1a 3c ce c9 ec 2d 12 2d f5 a1 0d cc 46 c6 8c 06 0000624 79 13 ac 0d cb de 8f 54 73 7b 31 60 d6 f8 53 e0 0000640 fe 58 e0 3d a4 3e b7 87 2c 1e fc 18 3a 77 c9 82 0000656 76 05 91 df 0f 9b 53 71 c2 ce f9 45 b6 e9 49 d1 0000672 f8 0e 79 3b b4 ab 93 43 4f 4a 43 6a 9e 76 61 a4 0000688 c5 c5 11 6b e6 ad c6 88 b0 23 d5 35 14 4f b4 ff 0000704 53 8b 3c cc 1d f8 c4 49 53 dc 59 8a 5e e4 7f 15 0000720 65 47 18 62 88 32 f6 2e 0d e9 4f e0 22 4c 2b 29 0000736 f7 e7 d5 22 17 e0 90 9a a9 fc fa 8d 65 76 a3 86 0000752 b4 d6 ba 71 81 68 09 19 0000760 [/CODE] Then I converted this file to a text format usable as input by GMP-ECM. 
This input file is here: [CODE] METHOD=P-1; B1=100000000000; N=2^4957-1; X=0x1909688171bad6b486a376658dfafca99a90e01722d5e7f7292b4c22e04fe90d2ef6328862184765157fe45e8a59dc5349c4f81dcc3c8b53ffb44f1435d523b088c6ade66b11c5c5a461769e6a434a4f4393abb43b790ef8d149e9b645f9cec271539b0fdf91057682c9773a18fc1e2c87b73ea43de058fee053f8d660317b73548fdecb0dac1379068cc646cc0da1f52d122decc9ce3c1a39933feba7a88f7a0b52770e35760fb9fb975dd3c6d26ec1e5c55469b96b9ce33093af1bbc77608f22d24792ace43c1b3df09afa8ed6b4eac1bb8146b2c1c9cbe2b43a1ee4b170477b745e5c5d84ce3f9f339b1a70b4f61352b53121ba23bbc4188fb1ae5380a9981ad1dac47d0ce2506f0b52871ed9333b0ee18551a1e718c82f5a18b75e3d44096389bdacd74d0f1a8db082052372e5a9d2120dfa05593071c40e2600829f3c69813455fa8d89a2e168c9b3e2a4878d14dec62d7e57d35bf45b34592538005e1a2357d7813578430209c28056b57146cc10225cf0b127f2244e0acae5f3cfc37547746c3381bc2c4c75b94f447b928ae8622070d579afa8ef7f6cbf7e7c3e956aa8e8206c46f2cc802d98203d418c7e4b8c8045ef9d04791494c1d7fa8c2652045b89dcb3973a1575e2f9b0fb15ff9e57b30662fbee3c8fbd120a20e40ad76d39aac9e39b0d936f6c6d998230580705966402839b97a294ef73466d1c53280a647d0f977570f1af93da512a8ed191e04c38a91e8a4837882324408bb6fbc4c0eeb649f949d0d28283422e67a543674a499b949c521648c08f442ee3425e2893ac9e1fce5e1119b856feb95b0f59e0fc063cf77d301040033c703053ec5e5497860298f9c09a52889fcea10314; CHECKSUM=2864078336; PROGRAM=mprime; Y=0x0; X0=0x0; Y0=0x0; WHO=; TIME=; [/CODE] I wrote a small program to do the conversion. It is very straightforward, the m* save file from mprime just stores a gigantic little-endian number. You can see that the mprime save file ends with .. ba 71 81 68 09 19, while the above text file has X=0x1909688171ba... so the byte order is just reversed. Then I ran GMP-ECM very briefly in order to rediscover and divide by the known factors. The command was something like: [CODE] ecm -v -resume input_file -save new_input_file 1e11 1048114913753-1048114913753 [/CODE] This command took one second or less. 
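The conversion program itself isn't shown, but based on the od dump and the byte-reversal observation above, a minimal sketch might look like the following. The header layout (a 32-bit little-endian word count at offset 136, followed immediately by the residue) is inferred from this particular dump, not from mprime's documented save-file format, so treat the offsets as assumptions:

```python
import struct

def mprime_residue_to_hex(path, count_offset=136):
    """Extract the P-1 stage-1 residue from an mprime save file as a hex
    string usable in the X= field of a GMP-ECM resume line.

    Assumes a 32-bit little-endian word count at count_offset followed by
    the residue as one giant little-endian integer -- the layout inferred
    from the od dump above (0x9b = 155 words = 620 bytes for M4957).
    """
    with open(path, "rb") as f:
        data = f.read()
    # Word count just before the residue (an assumption from this dump).
    nwords, = struct.unpack_from("<I", data, count_offset)
    residue = data[count_offset + 4 : count_offset + 4 + 4 * nwords]
    # Interpreting the bytes as little-endian reverses the order, matching
    # the observation that the file ends ...ba 71 81 68 09 19 while the
    # resume line begins X=0x1909688171ba...
    return hex(int.from_bytes(residue, "little"))
```

For the 760-byte m0004957 file, 136 header bytes + 4 count bytes + 620 residue bytes account for the whole file, which is consistent with the dump.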
Using B1=10[SUP]11[/SUP] matches the limit that was used for stage 1 with mprime, and is sufficient to rediscover the first three known factors, and then the B2 range was precisely chosen to also rediscover the fourth known factor, which has k = 173 × 484539959 × 13870099489 × 1048114913753 (see [url]http://www.mersenne.ca/exponent/4957[/url]) The new input file generated by the above command is here: [CODE] METHOD=P-1; B1=100000000000; N=(2^4957-1)/44142031558201193970865967211651546452732077743337797706913; X=0x2fdb8d24dc96a961a7a16f73c801331f947d571ba9698797beaa04b4dc02b3fa6fa73263d45661eb2d8cffeff74103ab0fdfac66d1c1ede26770b2f2a858cab21dde07782f3098c2ff0a582cf444c25eb014b511496819780cabad44424bbe42ccdf48e5bb224fde7a28313a72ead569b6fb6948800bbf302b991684e6bb56aee3f4b8ff8c82c166970f9613eaaa9ce74c7069828401a56852a4bbd3443626f1449e9fa6c44928b1d22b90d069451ee8415f9ab6e64ba3dca0d0fb3b9f7605fb9c30df4ba0238f9311e5d8ed33e5d126fe2affc0ea9904c0b6fb4dfcc0ebf59726ceac7072ef65ceda59d73fad5645620b4329e941d1ff315b709a0790485b12d3d533d9d7c38747056be948fc40a540b921790dd23b734f3d21e7bb01ef339be0ec3c2edfe1a1cc391176e90c2d4fb5cfa923f866e276bd70872f5a44bc5ae064c9578957c941312e402bb8cf17e5c7593edb118f1ece0daea87c404e13944d4a71218a20d81e4c98d771fe4ead878ad63119b168de7d49d053a2f2540187be7f6786854bedd44248ffeac15d099824cd00e821f564f8857558b2121a24f57de4752d5a318c051f7c22cd4042a26d0bba10bbad41c0a5137d68dcc0dd2930cb0ab15de4debe1175bbfb9deac9e2ef3ead708d866507e49c29194ccced16271ca7b879acb0252da5bd56a0fd172434f4db3c5d3b113a77b986df29e173b532add83cac6b3402b26bb4c93a469bd0bf1e5015feb45c76e7bccf6b8cba437d04ebebd3e3d13f15acbaf84e57db5c00bd39a26220635d847e85380f5bdec55ca3cea60d2b3829605b3079037206be7584e7f743d6f; CHECKSUM=1461664578; PROGRAM=GMP-ECM 7.0.4; Y=0x0; X0=0x0; Y0=0x0; WHO=ec2-user@ip-172-31-45-65; TIME=Fri Feb 3 02:58:03 2017; [/CODE] Finally, I used this new input file for the real GMP-ECM stage 2. 
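As a sanity check on why that single-point B2 range works: every prime in the quoted k except the largest is already below B1, so stage 1 accounts for them and stage 2 only needs to hit the one remaining prime. A quick sketch:

```python
# Prime factorization of k for the fourth known factor of M4957, as quoted
# above from mersenne.ca. All primes except the largest are below
# B1 = 10^11, so P-1 stage 1 covers them and stage 2 only has to reach
# 1048114913753 -- hence the range "1048114913753-1048114913753" suffices.
B1 = 10**11
k_primes = [173, 484539959, 13870099489, 1048114913753]
needed_in_stage2 = [q for q in k_primes if q > B1]
print(needed_in_stage2)  # [1048114913753]
```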
The command used was: [CODE] ecm -maxmem 270000 -v -resume new_input_file 1e11 1e17 [/CODE] The output of GMP-ECM is shown below. As you can see, it used a lot of memory (about 250 GB): [CODE] GMP-ECM 7.0.4 [configured with GMP 6.1.1, --enable-asm-redc] [ECM] Running on ip-###-###-###-### Resuming P-1 residue saved by ec2-user@ip-###-###-###-### with GMP-ECM 7.0.4 on Fri Feb 3 02:58:03 2017 Input number is (2^4957-1)/44142031558201193970865967211651546452732077743337797706913 (1434 digits) Using special division for factor of 2^4957-1 Using lmax = 134217728 with NTT which takes about 242688MB of memory Using B1=100000000000-100000000000, B2=109863493910757900, polynomial x^1 P = 780825045, l = 134217728, s_1 = 63866880, k = s_2 = 4, m_1 = 59 Probability of finding a factor of n digits: 20 25 30 35 40 45 50 55 60 65 0.88 0.64 0.39 0.21 0.098 0.041 0.016 0.0055 0.0018 0.00056 Step 1 took 0ms Computing F from factored S_1 took 7663232ms Computing h took 1188768ms Computing DCT-I of h took 1709352ms Multi-point evaluation 1 of 4: Computing g_i took 4212248ms Computing g*h took 3598828ms Computing gcd of coefficients and N took 1376900ms Multi-point evaluation 2 of 4: Computing g_i took 4276688ms Computing g*h took 3600552ms Computing gcd of coefficients and N took 1368000ms Multi-point evaluation 3 of 4: Computing g_i took 4250568ms Computing g*h took 3576836ms Computing gcd of coefficients and N took 1372256ms Step 2 took 38208428ms ********** Factor found in step 2: 246983850981415853695555800914746071630106768073 Found prime factor of 48 digits: 246983850981415853695555800914746071630106768073 Composite cofactor ((2^4957-1)/44142031558201193970865967211651546452732077743337797706913)/246983850981415853695555800914746071630106768073 has 1387 digits Peak memory usage: 241640MB [/CODE] I am running P−1 on these small exponents, even though I'm becoming increasingly convinced that just running more ECM curves makes more sense. 
There are some factors that P−1 can't find, and using a large B2 means using a lot of memory. But I'll probably try to finish what I started. |
[QUOTE=LaurV;452664]The real problem here is that a factor found by ECM was reported as found by P-1.
This problem occurred many times in the past ... found by TF ... reported as P-1 ... The real solution would be to allow people to report their findings in the right way.[/QUOTE]This issue has been largely resolved for a couple of years now. Previously, the manual results forms would look for the reported factors and just guess at the method used to find them. Now the whole results text is taken into account, so if a factor is reported as TF / P-1 / ECM it will be recorded as such. Only if a factor is submitted without any context (e.g., just a "M123 has a factor: 12345" line) does it fall back to guessing. |
[QUOTE=GP2;452847]The output of GMP-ECM is shown below. As you can see, it used a lot of memory (about 250 GB):[/QUOTE]
This could also have been done on a "normal" computer with 8 or 16 GB of RAM; you would just have to run stage 2 in many chunks instead of one big run. But it is a lot faster this way, using all that RAM. |
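For illustration, chunking stage 2 this way uses GMP-ECM's B2min-B2max range syntax (the same syntax as the rediscovery run earlier in the thread). A dry-run planner might look like the following; the 4-chunk split and the -maxmem value are illustrative, and actual memory use per chunk depends on GMP-ECM's internal parameter choices:

```python
# Generate the ecm invocations for a chunked stage 2, each resumed run
# covering one slice of [B1, B2]. The flags mirror the runs shown earlier
# in the thread; the chunk count is arbitrary.
B1, B2, chunks = 10**11, 10**17, 4
step = (B2 - B1) // chunks
cmds = [
    f"ecm -maxmem 8000 -v -resume new_input_file {B1} "
    f"{B1 + i * step}-{min(B1 + (i + 1) * step, B2)}"
    for i in range(chunks)
]
for c in cmds:
    print(c)
```

Each resumed run repeats the stage-2 setup cost for its slice, which is part of why one big high-memory run is so much faster.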
[QUOTE=ATH;452855]This could also have been done on a "normal" computer with 8 or 16 GB RAM, you would just have to run stage2 in many stages instead of in 1 big run. But it is a lot faster this way using all that RAM.[/QUOTE]
If you used only 8 GB and kept the same B2 limit, I think the run time might increase by two or three orders of magnitude. |
1 Attachment(s)
How is this possible?
See Attachment |
It is a manual result claiming to have done ECM with B1=15K and B2=2^64 which it did not. It might be a bug with a negative value entered as B2 or something that was not a number.
I guess George or Aaron can clear that result. |
[QUOTE=ATH;493206]It is a manual result claiming to have done ECM with B1=15K and B2=2^64 which it did not.
It might be a bug with a negative value entered as B2 or something that was not a number[/QUOTE]Strangely, it's not a misparsed value reported as 2[sup]64[/sup], the submitted result line included that B2 verbatim. It also contains a username and an assignment ID, so I'm not sure why it ended up anonymous. Presumably the assignment ID and B2 were forged (whether deliberately or accidentally I can't comment). |
CPU credit reduced to 0.001 GHz days. I did not investigate how this came to pass.
|
[QUOTE=LaurV;452664]
... And to avoid further arguments, yes, I am the credit whore. :razz: No matter how altruistic some of you want to appear, half of you will pack their toys and go back home if the credits were removed. Don't tell me... Maybe not you personally, but we will lose half of the contributors for sure. Human nature...[/QUOTE] If they were not being altruistic even while getting GHz-days as a reward, they would all be off mining Monero or something, or just leaving their computers off to save electricity. I don't think it's any loss to have people factoring numbers that are already proven composite. It's far more important that the project stays alive and healthy with interested participants than that it actually finds primes; we will never run out of potential primes to test. If there were 10 times the GHz-days from LL tests, things would not be very different from now, but if there were 10 times the active, interested participants talking about Mersenne numbers and sharing them with friends and coworkers, it would make a big difference, even if they didn't bring a single GHz-day to the project. |
Would it also be possible to fix the graph?
> [url]https://www.mersenne.org/primenet/graphs.php[/url] |
[QUOTE=ramgeis;493866]Would it also be possible to fix the graph?
> [url]https://www.mersenne.org/primenet/graphs.php[/url][/QUOTE]Yeah, I agree. Let's get rid of the JS. :tu: |
[QUOTE=ramgeis;493866]Would it also be possible to fix the graph?
> [url]https://www.mersenne.org/primenet/graphs.php[/url][/QUOTE] Done |