[QUOTE=Prime95;504250]Fixed in next 29.5 build. The bug affected type-5 PRP tests with base != 2 and with known factors.[/QUOTE]
Perfect, thank you very much. When do you think the binary will be available? |
[QUOTE=MrRepunit;504299]Perfect, thank you very much.
When do you think the binary is available?[/QUOTE]
But your base-10 repunit project uses type 1 residues, so fixing the type-5 bug won't affect your work.

The real reason to wait is that there is no Gerbicz error checking for repunits other than Mersenne and Wagstaff, so it's better to wait a while until the memory corruption issues with the 29.5 beta are definitely ironed out. Although if you restrict to using only one or two cores, there doesn't seem to be a problem.

By the way, for everyone else reading this, the (eventual) discovery of a new record base-10 repunit prime could certainly grab the public's imagination, and it's easier to explain than Mersenne primes. As in: "11 is prime, but 111 = 3 × 37, 11111 = 41 × 271, etc. You have to go up to 19 ones to get the next prime. The largest known (probable) prime of this form has 270,343 ones..." The search for the next one has reached nearly 4 million ones without success.

Maybe the Primenet infrastructure could be leveraged to also host Wagstaff or repunit-10 sister projects? Just a thought. |
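For anyone wanting to check those small cases, a short stdlib-only Python scan reproduces the repunit prime indices quoted above. The function name and the fixed Miller-Rabin base set are my own illustrative choices, not anything from GIMPS code:

```python
# Scan for repunit (probable) primes R_n = (10^n - 1) // 9 for n < 400.
# Miller-Rabin with fixed bases: deterministic for small candidates,
# "probable prime" for the larger ones.

def is_probable_prime(n):
    if n < 2:
        return False
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in small:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2^s with d odd
        d //= 2
        s += 1
    for a in small:            # one Miller-Rabin round per base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

print([n for n in range(1, 400) if is_probable_prime((10**n - 1) // 9)])
# [2, 19, 23, 317] -- matching "you have to go up to 19 ones" above
```

The next index after 317 is 1,031; beyond that only probable primes are known.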
[QUOTE=GP2;504317]...discovery of a new record base-10 repunit prime could certainly grab the public's imagination[/QUOTE]I think you overestimate the general public's interest in numbers. :smile:
|
[QUOTE=James Heinrich;504318]I think you overestimate the general public's interest in numbers. :smile:[/QUOTE]
And yet we still issue press releases when a new Mersenne prime is found... We're the thinking person's clickbait. |
[QUOTE=GP2;504319]We're the thinking person's clickbait.[/QUOTE]
New slogan for GIMPS? :unsure: |
[QUOTE=GP2;504317]But your base-10 repunit project uses type 1 residues, so fixing the type-5 bug won't affect your work.[/QUOTE]
I know, I was just hoping to be able to use the JSON output, which holds more information.
[QUOTE=GP2;504317]The real reason to wait is because there is no Gerbicz error checking for repunits other than Mersenne and Wagstaff, so it's better to wait a while until the memory corruption issues with 29.5 beta are definitely ironed out. Although if you restrict to using only one or two cores, there doesn't seem to be a problem.[/QUOTE]
I did not follow the full thread; I should do that at some point. I already asked once whether the Gerbicz test was going to be implemented, see [URL]https://www.mersenneforum.org/showthread.php?p=468966#post468966[/URL]. So I want to ask user Prime95 again: what is the reason for not implementing it for bases other than +/-2? I think the adaptations for other bases should not be too complicated; I derived it myself for base 10. But there might be some specific implementation details that would make it too complicated. |
[QUOTE=MrRepunit;504401]
What is the reason for not implementing it for bases other than +/-2? I think the adaptations for other bases should not be too complicated[/QUOTE] It is my understanding that the Gerbicz error check only works for base 2 |
[QUOTE=MrRepunit;504401]So I want to ask again user Prime95:
What is the reason for not implementing it for bases other than +/-2? I think the adaptations for other bases should not be too complicated, I derived it myself for base 10. But there might be some specific implementation details that would make it too complicated.[/QUOTE] [QUOTE=Prime95;504436]It is my understanding that the Gerbicz error check only works for base 2[/QUOTE] GEC can be made to work for any base. But the efficient "squaring-only" computation sequence works only for base 2. For other bases, you will have mul mods as well, which will increase the run times significantly (about 50%?). Of course, it still might be worth it if you were planning on double-checking your work (which you can dispense with when using GEC). |
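For the curious, here is a toy sketch of how a Gerbicz-style check protects the base-3 squaring chain of a Mersenne PRP test. This is my own simplified illustration, not Prime95's implementation: real code uses much larger block sizes and verifies the checksum far less often.

```python
# Toy Gerbicz check for u_i = 3^(2^i) mod N, with N = 2^p - 1.
# Every B squarings, fold the checkpoint into a running product d.
# Invariant: d_new == 3 * d_old^(2^B) (mod N), so B extra squarings
# re-verify a whole block. A mismatch means a computation error.

def prp_with_gerbicz(p, B=4):
    N = (1 << p) - 1
    u = 3 % N                 # running residue u_i
    d = u                     # product of checkpoints, starts at u_0 = 3
    i = 0
    while i + B <= p:
        d_prev = d
        for _ in range(B):
            u = u * u % N
        i += B
        d = d * u % N         # fold in checkpoint u_i
        check = d_prev
        for _ in range(B):    # independently recompute d from d_prev
            check = check * check % N
        if check * 3 % N != d:
            raise RuntimeError("Gerbicz check failed: roll back to a save file")
    while i < p:              # leftover iterations (unprotected in this toy)
        u = u * u % N
        i += 1
    return u                  # 3^(2^p) mod N; equals 9 when N is prime

print(prp_with_gerbicz(13))   # M13 = 8191 is prime, so this prints 9
```

The check only needs squarings because every iteration is a squaring; a base where the chain also needs ordinary multiplications loses that "squaring-only" structure, which is the extra cost described above.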
[QUOTE=LaurV;504191]The "confidence" of the check is 50%, so your chances are equal, either way. You have 100% lose 5 days or 50% lose 10 days, hehe. You don't know what happens if you start again, it may repeat some error (the chances are not 100% to be successful, you may lose the 5 days, plus some other in the future, but of course, that was only a joke, because it sounded funny).
I would let it finish. And after, switch to PRP testing (where the error check is more robust), at least for next few exponents, to be sure the hardware is really fixed.[/QUOTE] I addressed the hardware, stress tested it for a couple days, then let the test finish. The result was good, it matched the first test. There is something I can't get my head around. I understand the Gerbicz error check has only a 50% chance of detecting an error, should one occur. But what causes it to report a false error? |
[QUOTE=PhilF;504946]I understand the Gerbicz error check has only a 50% chance of detecting an error, should one occur. But what causes it to report a false error?[/QUOTE]
Actually, the Gerbicz error check is for PRP testing, and it has a very strong likelihood of detecting an error. The Jacobi error check is for LL testing, and has only a 50% chance of detecting an error.

Suppose I look at a coin lying on a table and I see that heads is facing up. I call heads. If you happen to be looking at a different coin for some reason, there's a 50% chance that you will see tails and realize that something's wrong, but also a 50% chance that your coin will also be heads and therefore you won't notice any problem.

If the Jacobi check does report an error, it's certain that the current state of calculations is bad and has to be discarded. There is no false error. However, the program can go back to an earlier save file that passed the Jacobi check, and restart from there. Then you cross your fingers and hope that no 50-50 undetected error happened prior to that save file being saved. |
[QUOTE=GP2;504952]Actually, the Gerbicz error check is for PRP testing, and it has a very strong likelihood of detecting an error.
The Jacobi error check is for LL testing, and has only a 50% chance of detecting an error. Suppose I look at a coin lying on a table and I see that heads is facing up. I call heads. If you happen to be looking at a different coin for some reason, there's a 50% chance that you will see tails and realize that something's wrong, but also a 50% chance that your coin will also be heads and therefore you won't notice any problem. If the Jacobi check does report an error, it's certain that the current state of calculations is bad and has to be discarded. There is no false error. However, the program can go back to an earlier save file that passed the Jacobi check, and restart from there. Then you cross your fingers and hope that no 50-50 undetected error happened prior to that save file being saved.[/QUOTE] Sorry, you're right, I got my terms wrong. I did indeed mean the Jacobi error check during a LL test. In this case, it caught an error and tried to go back to a good save file, but couldn't. So it reported the chance of a good test as "fair". So I am still confused as to how it can catch a definite error, not be able to revert to a backup that fixes it, but produce a good result anyway (the test was a double check). |
[QUOTE=PhilF;504964]In this case, it caught an error and tried to go back to a good save file, but couldn't. So it reported the chance of a good test as "fair". So I am still confused as to how it can catch a definite error, not be able to revert to a backup that fixes it, but produce a good result anyway (the test was a double check).[/QUOTE]
OK, now everybody's confused.

Any chance you could post a copy of the error or informational messages you got? Is it possible that out of multiple save files, it warned that it couldn't use one of them but then silently resumed from another, older one that did pass the Jacobi check? Is the test ongoing, and when would it be expected to complete?

Here's the [URL="https://mersenneforum.org/showthread.php?t=22471&p=465033"]original message that described and proposed the Jacobi check[/URL]. As I read it, every good interim or final residue is always −1, but every time an error occurs there's a coin flip and a 50-50 chance of getting either +1 or −1. Coin flips only happen when there's an error, so a +1 will not change back by itself unless there is a second error (or third, or higher).

So a −1 can indicate:
a) no errors
b) one error and an unlucky coin flip
c) two or more errors (and flips) with various results, but the final flip gave −1.

Whereas a +1 can indicate:
a) exactly one error
b) two or more errors (and flips) with various results, but the final flip gave +1.

So a +1 is absolutely an indication that you have a bad residue and you can't move forward from that point, only try to backtrack to some prior good save file.

So there are various possibilities. Maybe the error messages are misleading and the program really did end up finding an older save file that passed the Jacobi check. Maybe the program has faulty error handling for Jacobi checks and fails to abort even when there are no good save files to fall back to, and instead defaults to the same handling used for older forms of error checking (roundoff errors, sumout errors, etc.), where those kinds of errors merely indicate that a result is suspect and should have higher priority for a quick double-check, rather than a guaranteed bad result. Or maybe you misread or misinterpreted the error messages.

It's probably worth getting to the bottom of this. |
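The ±1 bookkeeping is easy to see numerically. A textbook Jacobi-symbol routine applied to the interim Lucas-Lehmer residues of a small Mersenne prime gives −1 at every error-free step; this is a toy illustration of the idea, not Prime95's code:

```python
def jacobi(a, n):
    """Textbook Jacobi symbol (a/n), for odd positive n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:            # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # quadratic reciprocity swap
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

# Lucas-Lehmer sequence for M7 = 127: jacobi(s_i - 2, M) is -1 at every
# good step. A computation error effectively flips a fair coin; landing
# on +1 exposes the error, landing on -1 hides it (the 50% detection rate).
p, M, s = 7, (1 << 7) - 1, 4
for i in range(1, p - 1):
    s = (s * s - 2) % M
    print(i, jacobi(s - 2, M))       # -1 every time on error-free hardware
```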
[QUOTE=PhilF;504964]Sorry, you're right, I got my terms wrong. I did indeed mean the Jacobi error check during a LL test.
In this case, it caught an error and tried to go back to a good save file, but couldn't. So it reported the chance of a good test as "fair". So I am still confused as to how it can catch a definite error, not be able to revert to a backup that fixes it, but produce a good result anyway (the test was a double check).[/QUOTE] I had this happen to me earlier in December when I was running the first DC test on new hardware. Jacobi error check failed and mprime started from the last save file. Confidence "fair" there too. The test went without errors to the end, but after a triple check the residue I produced was still bad. It is possible that this earlier save file was already "bad" but the error was such, that the Jacobi check didn't catch it (that 50% chance). In the default configuration, it is also possible that all intermediate files are bad, since Jacobi checks are only done every 12 hours, but files are saved every 30 minutes, and only three old files are kept. So, for example, if the error occurred after that last Jacobi check, but before that oldest file was saved, it's all gone. In my case, this was caused by over-optimistic memory overclocking that was stable elsewhere, but yet again, Prime95/mprime stresses the whole system like nothing else. :thumbs-up: Before this, I had maybe 99% confidence that the hardware is working and stable, but now I have 100%. (okay, maybe 99.99% - cosmic rays and no ECC, and everything...) The same machine has now produced four matching double check LL residues with no further errors, working on a further set of four, and after that I'll be switching to first time PRP tests. |
[QUOTE=GP2;504983]OK, now everybody's confused.
Any chance you could post a copy of the error or informational messages you got? Is it possible that out of multiple save files, it warned that it couldn't use one of them but then silently resumed from another, older one that did pass the Jacobi check? Is the test ongoing and when would it be expected to complete? Here's the [URL="https://mersenneforum.org/showthread.php?t=22471&p=465033"]original message that described and proposed the Jacobi check[/URL]. As I read it, every good interim or final residue is always −1, but every time an error occurs there's a coin flip and a 50-50 chance of getting either +1 or −1. Coin flips only happen when there's an error, so a +1 will not change back by itself unless there is a second error (or third, or higher). So a −1 can indicate: a) no errors b) one error and an unlucky coin flip c) two or more errors (and flips) with various results, but the final flip gave −1. Whereas a +1 can indicate: a) exactly one error b) two or more errors (and flips) with various results, but the final flip gave +1. So a +1 is absolutely an indication that you have a bad residue and you can't move forward from that point, only try to backtrack to some prior good save file. So there are various possibilities. Maybe the error messages are misleading and the program really did end up finding an older save file that passed the Jacobi check. Maybe the program has faulty error handling for Jacobi checks and fails to abort even when there are no good save files to fall back to, and instead defaults to the same handling used for older forms of error checking (roundoff errors, sumout errors, etc), where those kind of errors merely indicate that a result is suspect and should have higher priority for a quick double-check, rather than a guaranteed bad result. Or maybe you misread or misinterpreted the error messages. 
It's probably worth getting to the bottom of this.[/QUOTE] Here are the messages that started it:
[CODE]Iteration: 30271839/50930029, ERROR: Jacobi error check failed!
Continuing from last save file.
Error reading intermediate file: p9P30029
Renaming p9P30029 to p9P30029.bad1
Trying backup intermediate file: p9P30029.bu
Error reading intermediate file: p9P30029.bu
Renaming p9P30029.bu to p9P30029.bad2
Trying backup intermediate file: p9P30029.bu2[/CODE]
It might be worth noting that this machine is running from a USB stick, and is set for 2 save files instead of 3. It is not overclocked. After this, I corrected and tested the hardware, stress tested it for 50 hours, then let it complete the test. It kept reporting the chance of a good result as "fair". The test turned out to be good, since it was a double check and the residues matched. |
[QUOTE=PhilF;505028]Here are the messages that started it:
Iteration: 30271839/50930029, ERROR: Jacobi error check failed!
Continuing from last save file.
Error reading intermediate file: p9P30029
Renaming p9P30029 to p9P30029.bad1
Trying backup intermediate file: p9P30029.bu
Error reading intermediate file: p9P30029.bu
Renaming p9P30029.bu to p9P30029.bad2
Trying backup intermediate file: p9P30029.bu2[/QUOTE]
From this, I would assume that the save file [c]p9P30029.bu2[/c] was good, and it silently continued from there without outputting any further messages.
[QUOTE]The test turned out to be good, since it was a double check and the residues matched.[/QUOTE]
For sure, it must have found a good save file. There is no way forward from a failed Jacobi check, only backtracking and retrying. The chances of a good result were only "fair" because there could have been earlier errors, even if the Jacobi check passed. |
So if it could not find a good save file, would the program abort the test and start it over?
|
[QUOTE=PhilF;505037]So if it could not find a good save file, would the program abort the test and start it over?[/QUOTE]
Yes |
Ok, thanks. Now I have a better understanding of how a test can report a Jacobi error yet still produce a good result. I also have a better understanding as to the importance of multiple save files. :)
|
[QUOTE=PhilF;505044]Ok, thanks. Now I have a better understanding of how a test can report a Jacobi error yet still produce a good result. I also have a better understanding as to the importance of multiple save files. :)[/QUOTE]
In undoc.txt:
[CODE]You can control how many save files are kept that have passed the Jacobi
error check. This value is in addition to the value set by the NumBackupFiles
setting. So if NumBackupFiles=3 and JacobiBackupFiles=2 then 5 save files are
kept - the first three may or may not pass a Jacobi test, the last two save
files have passed the Jacobi error check. In prime.txt:
JacobiBackupFiles=N (default is 2)[/CODE]
Also, to limit the damage even if all the usual save files are overrun by errors, consider n ~ 10,000,000 in the following:
[CODE]You can have the program generate save files every n iterations. The files
will have a .XXX extension where XXX equals the current iteration divided by n.
In prime.txt enter:
InterimFiles=n[/CODE]
Also: good general computer backup practices (regular, to different media, verified). I do redundant backups: cheap USB sticks with daily xcopy /s, in addition to automatic backup to a network share or a separate HD. |
You can also decrease the time between Jacobi checks:
[QUOTE]You can control how often Jacobi error checking is run. Default is 12 hours.
If a Jacobi test takes 30 seconds, then the default represents an overhead of
30 / (12 * 60 * 60) or 0.07% overhead. Each Jacobi test has a 50% chance of
discovering if a hardware error has occurred in the last time interval.
In prime.txt: JacobiErrorCheckingInterval=N (default is 12) where N is in hours.[/QUOTE] |
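Quick arithmetic behind the quoted 0.07% figure, for a few candidate intervals (30 seconds per check is the example number from undoc.txt; your hardware will differ):

```python
# Overhead of a 30-second Jacobi check as a fraction of compute time,
# for several JacobiErrorCheckingInterval values.
check_seconds = 30.0
for hours in (1, 6, 12, 24):
    overhead = check_seconds / (hours * 3600)
    print(f"{hours:2d} h interval: {overhead:.4%} overhead")
    # the 12 h line shows 0.0694%, i.e. the quoted ~0.07%
```

Even an hourly check costs under 1%, so shortening the interval mostly trades a tiny amount of throughput for finding errors (and rolling back less work) sooner.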
prime95 P-1 reporting E?
When no factor is found, the report includes the E value that was used.
When a factor is found, apparently not. (I checked the worker window, results.txt, and prime.log) |
[QUOTE=kriesel;505104]When no factor is found, the report includes the E value that was used. When a factor is found, apparently not[/QUOTE]"E" is only relevant in stage 2. If the factor was found in stage1 then E is not applicable and therefore not reported.[code]P-1 found a factor in stage #1, B1=730000.
M89565797 has a factor: 164493217479527458358561 (P-1, B1=730000)
P-1 found a factor in stage #2, B1=730000, B2=14782500, E=12.
M89565907 has a factor: 16352015139068430008287498903 (P-1, B1=730000, B2=14782500, E=12)[/code]E will be reported for all no-factor results because both stage 1 and stage 2 were run. |
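For readers unfamiliar with what "stage #1" in those log lines does: it is Pollard's P-1 method, which finds a prime factor q of N whenever q − 1 is B1-smooth. A toy stdlib-Python sketch with deliberately tiny numbers (N = 1829 = 31 × 59 is my own illustrative example, unrelated to the exponents above):

```python
# Toy Pollard P-1 stage 1. The factor 31 of N = 1829 is found because
# 31 - 1 = 2*3*5 is B1-smooth for B1 = 5; the cofactor 59 survives
# because 59 - 1 = 2*29 contains the prime 29 > B1.
from math import gcd, log

def primes_up_to(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(sieve[p * p::p]))
    return [i for i in range(n + 1) if sieve[i]]

def pminus1_stage1(N, B1, base=3):
    a = base
    for p in primes_up_to(B1):
        a = pow(a, p ** int(log(B1, p)), N)   # each prime power <= B1
    return gcd(a - 1, N)    # q divides this gcd when q-1 is B1-smooth

print(pminus1_stage1(1829, 5))   # 31
```

Roughly speaking, stage 2 then extends the reach to q − 1 = (B1-smooth part) × (one extra prime up to B2), and E is the degree of the Brent-Suyama extension applied on top of stage 2.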
1 Attachment(s)
[QUOTE=James Heinrich;505109]"E" is only relevant in stage 2. If the factor was found in stage1 then E is not applicable and therefore not reported.[code]P-1 found a factor in stage #1, B1=730000.
M89565797 has a factor: 164493217479527458358561 (P-1, B1=730000)
P-1 found a factor in stage #2, B1=730000, B2=14782500, E=12.
M89565907 has a factor: 16352015139068430008287498903 (P-1, B1=730000, B2=14782500, E=12)[/code]E will be reported for all no-factor results because both stage1+2 were run.[/QUOTE] Sorry, I should have specified that I was certain the factor in question without an E value indicated was from completion of stage 2 in prime95 v29.4b8. I've been eagerly watching for its completion, since it's part of a set I'm running to measure run-time scaling of P-1 in prime95 for completion through stage 2. |
You only have 4GB available which is probably insufficient to run the extension for that exponent.
Using the approximate values returned by my P-1 Probability page, I'd expect that you'd need to allocate [url=https://www.mersenne.ca/prob.php?exponent=301000159&b1=2370000&b2=47992500&factorbits=80]at least 8GB[/url] before the extension gets used. I'm not sure about the intricacies involved in how Prime95 selects the number of relative primes (typically 480, sometimes 960 as per your screenshot, sometimes 192) and how it relates to the choice of whether to use the Brent-Suyama extension or not. Perhaps in this case it was better to use 960 and skip the extension. George would be able to better answer those details. |
[QUOTE=James Heinrich;505141]You only have 4GB available which is probably insufficient to run the extension for that exponent.
Using the approximate values returned by my P-1 Probability page, I'd expect that you'd need to allocate [URL="https://www.mersenne.ca/prob.php?exponent=301000159&b1=2370000&b2=47992500&factorbits=80"]at least 8GB[/URL] before the extension gets used. I'm not sure about the intricacies involved in how Prime95 selects the number of relative primes (typically 480, sometimes 960 as per your screenshot, sometimes 192) and how it relates to the choice of whether to use the Brent-Suyama extension or not. Perhaps in this case it was better to use 960 and skip the extension. George would be able to better answer those details.[/QUOTE] Total installed RAM on the system is 8GB. Paging was excessive at 7.2GB or 6GB; it stopped at 4GB, which has seemed adequate on GPUs. It should be in good shape around 100M though, per your calculator. I guess I'm too accustomed to CUDAPm1 indicating E=12, 6, or 2, even on a 4GB GTX 1050 Ti up to 383M+ in stage 2, or 0 for a stage 1 result printed to the console, such as for the following samples. Maybe they're trading off bounds and extension differently.
[CODE]M85320343 Stage 1 found no factor (P-1, B1=735000, B2=17272500, e=0, n=4704K CUDAPm1 v0.20)
M85320353 Stage 1 found no factor (P-1, B1=735000, B2=17272500, e=6, n=4704K CUDAPm1 v0.20)
M85343233 Stage 2 found no factor (P-1, B1=735000, B2=17272500, e=6, n=4704K CUDAPm1 v0.20)
M289999981 Stage 1 found no factor (P-1, B1=2280000, B2=53010000, e=0, n=16384K CUDAPm1 v0.20)
M289999981 Stage 2 found no factor (P-1, B1=2280000, B2=53010000, e=2, n=16384K CUDAPm1 v0.20)
M375000013 Stage 2 found no factor (P-1, B1=3085000, B2=69412500, e=2, n=21168K CUDAPm1 v0.20)
M383000063 Stage 2 found no factor (P-1, B1=2930000, B2=63727500, e=2, n=21504K CUDAPm1 v0.20)
M425000083 Stage 2 found no factor (P-1, B1=2840000, B2=62000000, e=2, n=24192K CUDAPm1 v0.20)[/CODE]
More info on what that GPU could run is shown in one of the attachments at [URL]https://www.mersenneforum.org/showpost.php?p=498673&postcount=9[/URL] |
My above sample results are actually two sequential results I got a few days ago, and used about 38GB to get E=12 on 89M exponents.
|
I have two suggestions for the next version:
1. If the user has a proxy configured, then Prime95 should fall back to a standard connection if the proxy isn't available. Use case: I have a work laptop that I regularly take home. All network connections must go through the proxy when connected to the corporate network. Because the proxy is publicly inaccessible, I have to change the settings in Prime95 every time I take the laptop home. Alternative idea: allow users to configure more proxies to fall back to. 2. I believe this is a known issue, but Prime95 will always try to use more than one core per worker window even when "CPU cores to use" is set to 1 per worker. I have to set [c]CoresPerTest=1[/c] in [c]local.txt[/c] to solve this problem. |
[QUOTE=ixfd64;505557]I have two suggestions for the next version:
1. If the user has a proxy configured, then Prime95 should fall back to a standard connection if the proxy isn't available. Use case: I have a work laptop that I regularly take home. All network connections must go through the proxy when connected to the corporate network. Because the proxy is publicly inaccessible, I have to change the settings in Prime95 every time I take the laptop home. Alternative idea: allow users to configure more proxies to fall back to.[/quote] Done. [quote]2. I believe this is a known issue, but Prime95 will always try to use more than one core per worker window even when "CPU cores to use" is set to 1. I have to set [c]CoresPerTest=1[/c] in [c]local.txt[/c] to solve this problem.[/QUOTE] Please elaborate. Windows/Linux? Brand new install? Your CPU? Steps to reproduce bug? |
Thanks for the response.
The second issue occurs on Windows at least — I haven't tested any other platforms. To reproduce: [LIST=1][*]Set the number of workers = number of physical cores[*]Set the worker number to "All workers"[*]Set the number of CPU cores to 1 per worker[/LIST] When the user tries to save the settings, there will be a message saying Prime95 is using more cores than available. The number of cores per thread will revert to a value > 1 afterwards. |
[QUOTE=ixfd64;505574]
The second issue occurs on Windows at least — I haven't tested any other platforms. To reproduce: [LIST=1][*]Set the number of workers = number of physical cores[*]Set the worker number to "All workers"[*]Set the number of CPU cores to 1 per worker[/LIST] When the user tries to save the settings, there will be a message saying Prime95 is using more cores than available. The number of cores per thread will revert to a value > 1 afterwards.[/QUOTE] I must be doing something wrong. What were the settings prior to entering Test/Worker Windows? |
I don't recall doing anything prior to entering the number of workers. I'll try again the next time I install Prime95 on a new machine.
|
[QUOTE=ATH;495447]From undoc.txt in 29.4b8:
If I use this on an AWS EC2 instance on startup, it still says "CPU identity mismatch" and "ComputerGUID=" in local.txt changes and it creates another CPU on my account. That way you get tons of CPUs on your account, a new one every time you lose your session and it automatically starts a new one. I'm also wondering what is "HardwareGUID=" in prime.txt compared to "ComputerGUID=" in local.txt? There is also a "WindowsGUID=" in prime.txt but it has no value.[/QUOTE] I think I have a fix for you in 29.5 build 8 (not out yet). To sync up, you *may* need to delete FixedHardwareUID from prime.txt, sync up with primenet, then add FixedHardwareUID back to prime.txt. |
PauseWhileRunning oddness
I've configured an instance of Prime95 service v29.4.7 on a Win2016 server with
[CODE]PauseWhileRunning=* during 1-5/8:00-18:00[/CODE] in [I]prime.txt[/I], with the intention of having P95 stop during work hours Mon-Fri, and resume crunching at night and on weekends. If I stop/start the service during "off" hours (i.e., in the middle of a workday), everything is fine: a few seconds of Jacobi testing, then it becomes idle until requested - and it also respects my intended schedule for weekends. When, on the other hand, I happen to stop/start the service during "on" hours (i.e., at night or during weekends - that is, when server reboots typically happen), its schedule is no longer respected: the service keeps on crunching all day long, even in its "off" hours - until stopped. A minor annoyance, indeed, now that I've understood its rationale - though I'm not sure whether this is the intended behaviour. Did anyone else experience something similar? |
[QUOTE=ric;507022]I happen to stop/start the service during "on" hours (=at night, or during weekends - that is, when server reboots typically happen), its schedule is no longer respected: the service keeps on crunching all day long, even in its "off" hours - until stopped.
A minor annoyance, indeed, now that I've understood its rationale - though I'm not sure whether this is its intended behaviour. Did anyone else experience something similar?[/QUOTE] I can fix that. Try again in version 29.6. |
Another suggested change: the "Visit Mersenne Wiki" option under the Help menu should be removed as the [url=https://mersenneforum.org/showthread.php?t=23625]wiki has been shut down[/url].
|
[QUOTE=ixfd64;507581]the "Visit Mersenne Wiki" option under the Help menu should be removed as the [url=https://mersenneforum.org/showthread.php?t=23625]wiki has been shut down[/url].[/QUOTE]Although it is apparently coming back, eventually, at [url=http://www.mersennewiki.com/]mersennewiki[b].com[/b][/url] (once Chris finds some spare time).
|
[QUOTE=James Heinrich;507583]Although it is apparently coming back, eventually, at [url=http://www.mersennewiki.com/]mersennewiki[b].com[/b][/url] (once Chris finds some spare time).[/QUOTE][URL="https://www.rieselprime.de/ziki/Main_Page"]Primewiki[/URL] is currently much further along and dealing with several issues that had come up.
|
[QUOTE=Uncwilly;507600][URL="https://www.rieselprime.de/ziki/Main_Page"]Primewiki[/URL] is currently much further along and dealing with several issues that had come up.[/QUOTE]Ooh, didn't know of that. I have updated relevant links to point there, thanks.
|
Skipping interimresidue output at stop/resume
In Prime95 V29.4b8, set InterimResidues=1 in prime.txt and let run.
Stop and continue. It will skip one iteration's interim residue output. There's no output for iteration 32 below.
[CODE]...
[Feb 10 17:39:31] Iteration: 30 / 82589933 [0.00%], ms/iter: 114.376, ETA: 109d 07:58
[Feb 10 17:39:32] M82589933 interim Wg4 residue E5F364AA0E27B1E5 at iteration 30
[Feb 10 17:39:38] Iteration: 31 / 82589933 [0.00%], ms/iter: 116.014, ETA: 110d 21:32
[Feb 10 17:39:40] M82589933 interim Wg4 residue 9008B9C355C8F005 at iteration 31
[Feb 10 17:39:40] Iteration: 32 / 82589933 [0.00%], ms/iter: 113.969, ETA: 108d 22:37
[Feb 10 17:39:41] Stopping primality test of M82589933 at iteration 32 [0.00%]
[Feb 10 17:39:41] Worker stopped.
[Feb 10 17:58:21] Worker starting
[Feb 10 17:58:21] Setting affinity to run worker on CPU core #2
[Feb 10 17:58:25] Running Jacobi error check. Passed. Time: 73.089 sec.
[Feb 10 17:59:37] Resuming primality test of M82589933 using FFT length 4480K, Pass1=896, Pass2=5K, clm=4
[Feb 10 17:59:43] Iteration: 33 / 82589933 [0.00%].
[Feb 10 17:59:44] M82589933 interim Wg4 residue A211573CE86B5929 at iteration 33
[Feb 10 17:59:50] Iteration: 34 / 82589933 [0.00%], ms/iter: 54.171, ETA: 51d 18:46
[Feb 10 17:59:51] M82589933 interim Wg4 residue 24ED4C62B86E20DA at iteration 34
[Feb 10 17:59:51] Iteration: 35 / 82589933 [0.00%], ms/iter: 112.100, ETA: 107d 03:45
[Feb 10 17:59:52] M82589933 interim Wg4 residue C5657DF459CC3CD5 at iteration 35[/CODE]
I don't know if it also skips for larger InterimResidues values if it happens to get the stop signal at the iteration for which an output should occur. Haven't tried it on any other versions. |
Out of order output
There's a one-iteration stagger in residue output on Prime95 PRP.
[CODE][Feb 10 18:54:24] Worker starting
[Feb 10 18:54:24] Setting affinity to run worker on CPU core #2
[Feb 10 18:54:25] Starting Gerbicz error-checking PRP test of M82589933 using FFT length 4480K, Pass1=896, Pass2=5K, clm=4
[Feb 10 18:54:27] Iteration: 1 / 82589933 [0.00%], ms/iter: 1431.427, ETA: 1368d 07:17
[Feb 10 18:54:27] Iteration: 2 / 82589933 [0.00%], ms/iter: 335.534, ETA: 320d 17:42
[Feb 10 18:54:28] M82589933 interim Wg4 residue 000000000000001B at iteration 1
[Feb 10 18:54:29] Iteration: 3 / 82589933 [0.00%], ms/iter: 346.241, ETA: 330d 23:20
[Feb 10 18:54:30] M82589933 interim Wg4 residue 000000000000088B at iteration 2
[Feb 10 18:54:36] Iteration: 4 / 82589933 [0.00%], ms/iter: 334.144, ETA: 319d 09:48
[Feb 10 18:54:38] M82589933 interim Wg4 residue 0000000000DAF26B at iteration 3
[Feb 10 18:54:38] Iteration: 5 / 82589933 [0.00%], ms/iter: 358.347, ETA: 342d 13:04
[Feb 10 18:54:39] M82589933 interim Wg4 residue 000231C54B5F6A2B at iteration 4
[Feb 10 18:54:40] Iteration: 6 / 82589933 [0.00%], ms/iter: 330.110, ETA: 315d 13:16
[Feb 10 18:54:41] M82589933 interim Wg4 residue D310B7D97DD4E9AB at iteration 5
[Feb 10 18:54:47] Iteration: 7 / 82589933 [0.00%], ms/iter: 342.891, ETA: 327d 18:28
[Feb 10 18:54:48] M82589933 interim Wg4 residue 2AC0B180838228AB at iteration 6
[Feb 10 18:54:49] Iteration: 8 / 82589933 [0.00%], ms/iter: 336.957, ETA: 322d 02:20
[Feb 10 18:54:50] M82589933 interim Wg4 residue 9B5ACA650265A6AB at iteration 7
[Feb 10 18:54:50] Iteration: 9 / 82589933 [0.00%], ms/iter: 335.743, ETA: 320d 22:30
[Feb 10 18:54:52] M82589933 interim Wg4 residue B47759B0D250A2AB at iteration 8
[Feb 10 18:54:58] Iteration: 10 / 82589933 [0.00%], ms/iter: 348.099, ETA: 332d 17:58
[Feb 10 18:54:59] M82589933 interim Wg4 residue DF36E033DAB69AAB at iteration 9[/CODE] |
In version 29.4 build 8, I rebooted a machine that had interim residue waiting to send. After the reboot, the residue was sent and the test then resumed immediately, meaning the default BootDelay of 90 seconds was not honored.
|
No FFT lengths available in the range specified
The latest Prime95 version that works for me is "[URL="http://www.mersenne.org/ftp_root/gimps/p95v294b5.win64.zip"]Prime95 29.5 B5[/URL]". Any version after "29.5 B5", up to "[URL="http://www.mersenne.org/ftp_root/gimps/p95v296b1.win64.zip"]29.6 Build 1[/URL]", brings this message in the log, under all four CPU cores: "No FFT lengths available in the range specified". Is my CPU no longer supported, is this a bug, or something else?
CPU: AMD Athlon X4 880K OS: Windows 10 Enterprise x64, Version 1809 (OS Build 17763.316) |
[QUOTE=primesearcher;508912]Is my CPU no longer supported, is this a bug, or is it something else?
CPU: AMD Athlon X4 880K OS: Windows 10 Enterprise x64, Version 1809 (OS Build 17763.316)[/QUOTE] Bug. In 29.5b5 or 29.6, what do you see in the main window at startup? Something like "Optimizing for ....... architecture"? In Options/CPU, what features and cache sizes are reported? Which torture test are you running, or are you running benchmarks? What settings are you using in the dialog box? |
AMD 880K No FFT lengths
[QUOTE=Prime95;508915]Bug.
In 29.5b5 or 29.6, what do you see in the main window at startup? Something like "Optimizing for ....... architecture"? In Options/CPU, what features and cache sizes are reported? Which torture test are you running, or are you running benchmarks? What settings are you using in the dialog box?[/QUOTE] I see "Optimizing for AMD Bulldozer, L2 cache size: 2x2 MB" ([URL="http://www.img-share.eu/f/images/588/OptimizingeflK9dZ8VvU3Dc1jineA1550542878.png"]attached image[/URL]). I run everything on the default settings: "Just Stress Testing --> Blend (all of the above)" ([URL="http://www.img-share.eu/f/images/588/TestXWyilgSeyyGlLlY7Ntu21550542879.png"]attached image[/URL]). In Options/CPU: [URL="http://www.img-share.eu/f/images/588/Options-CPUjTn1tFbMaYtyP6nD7E6H1550542879.png"]attached image[/URL]. What I found now is that when I enable the "Disable AVX" check box, everything runs smoothly with no errors. [URL="http://www.img-share.eu/f/images/588/Disable-AVXEyL3BWsx3ZWhytlqROAM1550543323.png"]attached image[/URL] |
Fixed in 29.6 build 2
Note that for the first time ever Bulldozer users can torture test using FMA3 and AVX FFTs. However, you may find the SSE2 FFTs are more stressful. |
debian build source p95v294b8.linux64
I downloaded the linux64 executable and it runs ok.
I built from source and am getting error messages:
[CODE][Comm thread Feb 22 21:09] <= Recv header: X-Powered-By: PHP/7.1.10
[Comm thread Feb 22 21:09] <= Recv header: Date: Sat, 23 Feb 2019 03:09:34 GMT
[Comm thread Feb 22 21:09] <= Recv header: Content-Length: 84
[Comm thread Feb 22 21:09] <= Recv header:
[Comm thread Feb 22 21:09] <= Recv data: pnErrorResult=7
[Comm thread Feb 22 21:09] pnErrorDetail=parameter ss: Invalid int value/precision ''
[Comm thread Feb 22 21:09] ==END==
[Comm thread Feb 22 21:09]
[Comm thread Feb 22 21:09] == Info: Connection #0 to host v5.mersenne.org left intact
[Comm thread Feb 22 21:09] RESPONSE:
[Comm thread Feb 22 21:09] pnErrorResult=7
[Comm thread Feb 22 21:09] pnErrorDetail=parameter ss: Invalid int value/precision ''
[Comm thread Feb 22 21:09] ==END==
[Comm thread Feb 22 21:09]
[Comm thread Feb 22 21:09] PrimeNet error 7: Invalid parameter
[Comm thread Feb 22 21:09] parameter ss: Invalid int value/precision ''
[Comm thread Feb 22 21:09] Visit [url]http://mersenneforum.org[/url] for help.
[Comm thread Feb 22 21:09] Will try contacting server again in 70 minutes.[/CODE] |
status estimating duration of work prime95 can not run
The following worker window contents appeared on an i7-7500U system (Win10 Home, 8GB ram) running prime95 V29.4b8 x64.
The software would give completion time estimates in File, Status, but could not run the P-1 assignment it estimated the time for. Maybe some build of prime95 could check the feasibility of a manually inserted assignment and warn the user during status output, before it comes up as the current work and stalls a worker? This system previously completed a 503M P-1 run. The Intel spec sheet indicates this processor includes AVX2. [URL]https://ark.intel.com/content/www/us/en/ark/products/95451/intel-core-i7-7500u-processor-4m-cache-up-to-3-50-ghz.html[/URL] Prime95 indicates SSE, SSE2, SSE4, AVX, AVX2, FMA. [CODE][Mar 16 12:59:16] Worker starting
[Mar 16 12:59:16] Setting affinity to run worker on CPU core #1
[Mar 16 12:59:16] Optimal P-1 factoring of M605000003 using up to 4096MB of memory.
[Mar 16 12:59:16] Assuming no factors below 2^84 and 2 primality tests saved if a factor is found.
[Mar 16 12:59:19] Optimal bounds are B1=4100000, B2=66625000
[Mar 16 12:59:19] Chance of finding a factor is an estimated 3.14%
[Mar 16 12:59:19] Cannot initialize FFT code, errcode=1002
[Mar 16 12:59:19] Worker stopped.
[Mar 20 11:31:42] Worker starting
[Mar 20 11:31:42] Setting affinity to run worker on CPU core #1
[Mar 20 11:31:42] Optimal P-1 factoring of M605000003 using up to 4096MB of memory.
[Mar 20 11:31:42] Assuming no factors below 2^84 and 2 primality tests saved if a factor is found.
[Mar 20 11:31:43] Optimal bounds are B1=4100000, B2=66625000
[Mar 20 11:31:43] Chance of finding a factor is an estimated 3.14%
[Mar 20 11:31:43] Cannot initialize FFT code, errcode=1002
[Mar 20 11:31:43] Worker stopped.[/CODE]The same FFT error code appears on an i7-8750H system (Win10 Home, 16GB ram, 8GB available to prime95) with prime95 V29.4b8 or v29.6b7.
[URL]https://ark.intel.com/content/www/us/en/ark/products/134906/intel-core-i7-8750h-processor-9m-cache-up-to-4-10-ghz.html[/URL] From prime95 V29.6b7 on i7-8750H: [CODE][Mar 20 12:30] Worker starting
[Mar 20 12:30] Setting affinity to run worker on CPU core #1
[Mar 20 12:30] Optimal P-1 factoring of M605000003 using up to 8MB of memory.
[Mar 20 12:30] Assuming no factors below 2^84 and 2 primality tests saved if a factor is found.
[Mar 20 12:30] Optimal bounds are B1=5720000, B2=5720000
[Mar 20 12:30] Chance of finding a factor is an estimated 1.83%
[Mar 20 12:30] Cannot initialize FFT code, errcode=1002
[Mar 20 12:30] Worker stopped.[/CODE]Prime95 indicates SSE, SSE2, SSE4, AVX, AVX2, FMA. |
The 8750H is configured for 8MB memory, not 8GB.
|
1 Attachment(s)
[QUOTE=PhilF;511271]The 8750H is configured for 8MB memory, not 8GB.[/QUOTE]
Nope. Well, upon review it looks like I missed that adjustment in the one case (the i7-8750H v29.6b7 test), but I had set 8192MB in the 29.4b8 trial on the same system, and they both failed the same way. Max system ram would be 32GB on the system involved. Mine came with 16GB installed. I changed the prime95 settings from the puny 8MB default to 8192MB (half of installed RAM) in preparation for running P-1 on midrange exponents (~300-500M). P-1 benefits from a lot of memory. The default 8MB starves it. Even going from 1GB to 2GB, or 4GB to 8GB, on a gpu, I see a benefit. I have another system with the prime95 limit set at 32GB. It helps stage 2 considerably. Here's the i7-8750H V29.6b7 x64 prime95 run twice more, with 8MB and 8192MB allowed for P-1; the bounds change, the probability changes, but the error code does not. It just can't run that exponent/FFT length, and memory allocation doesn't matter. [CODE][Mar 20 17:39] Worker starting
[Mar 20 17:39] Setting affinity to run worker on CPU core #1
[Mar 20 17:39] Optimal P-1 factoring of M605000003 using up to 8MB of memory.
[Mar 20 17:39] Assuming no factors below 2^84 and 2 primality tests saved if a factor is found.
[Mar 20 17:39] Optimal bounds are B1=5720000, B2=5720000
[Mar 20 17:39] Chance of finding a factor is an estimated 1.83%
[Mar 20 17:39] Cannot initialize FFT code, errcode=1002
[Mar 20 17:39] Worker stopped.
[Mar 20 17:40] Worker starting
[Mar 20 17:40] Setting affinity to run worker on CPU core #1
[Mar 20 17:40] Optimal P-1 factoring of M605000003 using up to 8192MB of memory.
[Mar 20 17:40] Assuming no factors below 2^84 and 2 primality tests saved if a factor is found.
[Mar 20 17:40] Optimal bounds are B1=4290000, B2=86872500
[Mar 20 17:40] Chance of finding a factor is an estimated 3.31%
[Mar 20 17:40] Cannot initialize FFT code, errcode=1002
[Mar 20 17:40] Worker stopped.[/CODE] |
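For anyone adjusting this outside the GUI: the P-1 memory allowance corresponds to the Memory setting in local.txt, in MB. A minimal illustration of the half-of-16GB setting described above (your value should suit your own RAM):

[CODE]Memory=8192[/CODE]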
Are the Affinity settings in local.txt honored when doing benchmarks?
|
[QUOTE=PhilF;513795]Are the Affinity settings in local.txt honored when doing benchmarks?[/QUOTE]
I don't remember. Try AffinityVerbosityBench=1 in prime.txt. |
Un-sticky?
V29.4 has been superseded by a few versions, and 29.8b6 has been out a while. At what point does a thread like this one get unstickied?
|
[QUOTE=kriesel;537641]V29.4 has been superseded by a few versions, and 29.8b6 has been out a while. At what point does a thread like this one get unstickied?[/QUOTE]
I don't see a reason we should stick both threads and have unstuck this one. |
[QUOTE=ixfd64;537662]I don't see a reason we should stick both threads and have unstuck this one.[/QUOTE]
Sticky the important stuff, like current versions and mirror locations, but not versions that are years out of date; over time that would make too much clutter at the top of the thread list. |