mersenneforum.org

-   -   LLR Version 3.8.15 released (https://www.mersenneforum.org/showthread.php?t=20217)

rogue 2015-06-04 14:30

[QUOTE=Jean Penné;403474]I will add this in the next release, but I need help to know which special character(s) must be sent to the console.
The magic OutputStr("\033[7m"); works on Linux, but does not work on Windows...

Regards,
Jean[/QUOTE]

Check here: [url]http://stackoverflow.com/questions/9203362/c-color-text-in-terminal-applications-in-windows[/url]. You would have to #ifdef around it.

Jean Penné 2015-06-04 19:48

Thank you for this help!
 
[QUOTE=rogue;403492]Check here: [url]http://stackoverflow.com/questions/9203362/c-color-text-in-terminal-applications-in-windows[/url]. You would have to #ifdef around it.[/QUOTE]

Thank you, Mark! The textcolor() function does not exist on my Windows platform, but I can use SetConsoleTextAttribute().
Regards,
Jean
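
For readers following along, here is a minimal sketch of the #ifdef approach discussed above. It is not LLR's actual code: the function names highlight_on() and highlight_off() are hypothetical, and the Windows branch assumes that black text on a light gray background is an acceptable stand-in for reverse video. Only the "\033[7m" sequence and the use of SetConsoleTextAttribute() come from the posts.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: switch console output to a highlighted style.
   Returns 0 on success, -1 on failure. */
int highlight_on(void)
{
#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    /* Assumption: black on light gray approximates reverse video. */
    return SetConsoleTextAttribute(h, BACKGROUND_RED | BACKGROUND_GREEN |
                                      BACKGROUND_BLUE) ? 0 : -1;
#else
    /* ANSI "reverse video" escape sequence, as quoted in the thread. */
    return fputs("\033[7m", stdout) >= 0 ? 0 : -1;
#endif
}

/* Hypothetical helper: return to normal output. */
int highlight_off(void)
{
#ifdef _WIN32
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    /* Assumption: the console default is light gray text on black. */
    return SetConsoleTextAttribute(h, FOREGROUND_RED | FOREGROUND_GREEN |
                                      FOREGROUND_BLUE) ? 0 : -1;
#else
    /* ANSI "reset all attributes" escape sequence. */
    return fputs("\033[0m", stdout) >= 0 ? 0 : -1;
#endif
}
```

A caller would wrap the message it wants emphasized between highlight_on() and highlight_off(). A more careful Windows version would probably save the current attributes with GetConsoleScreenBufferInfo() and restore them afterwards instead of assuming the default colors.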

Batalov 2015-07-28 00:25

I found a range of test cases where the LLR residue is dependent on the FFT size chosen. (This is on AVX hardware.)
The particular cases are (at least) these:
[CODE]831*2^5772880+1
831*2^5775472+1
831*2^5779912+1
831*2^5780008+1
831*2^5780320+1
831*2^5780560+1
831*2^5780584+1
831*2^5781832+1[/CODE]
I've run them with 384K and 400K FFT lengths, and will now re-run with the next sizes (FFT_Increment=2 and 3, which would be all-complex AVX FFT lengths 480K and 512K), with ErrorCheck=1 in llr.ini.

It is possible that the FFT boundary has to be pushed back more aggressively in the code. Restarts after a "Disregard last error. Result is reproducible and thus not a hardware problem" message lead to mismatched residues, even though one would expect that the "For added safety, redoing iteration using a slower, more reliable method" takes care of it and they should match.

This may need some debugging. Maybe I'd prefer to stick to PRP testing for a while, leaving N-1 for an additional pass.

Jean Penné 2015-08-02 14:31

[QUOTE=stream;402139]Jean, thank you for the update. Unfortunately, I'm afraid that we're moving in the wrong direction. The change which you've suggested and implemented may work for PrimeGrid, but I'd rather call it a workaround than a solution.

I'll post a summary of the problem once again here. I would very much appreciate hearing the opinions of other developers on how this situation is handled in their programs.

The problem that affected PrimeGrid seems to have been in LLR for ages. It lies in the logic of how LLR behaves with the ErrorCheck=1 option (which is why PG does not use ErrorCheck). It was first described in [URL="http://www.primegrid.com/forum_thread.php?id=6269&nowrap=true#85444"]this post[/URL]. The scenario is as follows.[INDENT] 1) LLR has the following piece of code:[/INDENT][code]if (error_count > MAX_ERROR_COUNT) {\
    OutputBoth (ERRMSG9);\
    care = TRUE;\
}\[/code][INDENT]where MAX_ERROR_COUNT=5.[/INDENT][INDENT] 2) When a "repeatable" roundoff error (>0.4) occurs (i.e. this is not a hardware fault, but an unlucky FFT size), the error itself is fixed with a few "careful" iterations, error_count is incremented, and nothing else is done.
[/INDENT][INDENT] 3) It's possible that a very long (SoB) unlucky candidate will generate more than 5 roundoff errors. According to the condition above, the "care" flag is then set and, as far as I understand, ALL operations from then on will be done using "careful" functions. This has deadly effects: according to the original post, "The result was that LLR was going to take 11.3 days instead of 2.7 days.". This is completely unacceptable for PrimeGrid due to return timeouts, possible complaints about credits, etc.
[/INDENT]This leads to a question: how is this situation (multiple "soft" roundoff errors per candidate) handled in other programs? I see two trivial solutions which would be compatible with PG:
[LIST=1][*]No need to handle this case at all. The code above must be completely removed. Why did this magic number of 5 errors appear at all? Since this is not a hardware problem, let's continue to crunch and recover (if necessary) in the usual way.[*]Restart with FFT_Increment. Details of the implementation could be discussed (should it restart on the first error? 2? 3? 5?). This is good because there will be no further errors and restarts, and it is acceptable for PG because FFT_Increment is really not so time-consuming (from the original post: "It took an hour or two longer than with the original FFT size", with 2.7 days total).[/LIST]Any comments are appreciated.

Roman[/QUOTE]

Hi Roman,

I agree completely with your remarks, and have taken them into account in LLR version 3.8.16, which I am about to release.
- The code you reproduced above is removed. The "care" variable is now only used to signal that the last iteration was a careful one...
- The new code for roundoff error recovery is as described in the Readme.txt file:
- Error checking is done on the first and last 50 iterations, before writing an intermediate file (either on a user-requested stop or when a 30-minute interval has expired), and every 128th iteration.
- After an excessive (> 0.40) and reproducible roundoff error, the iteration is redone using a slower and more reliable method.
- If this error was not reproducible, or if the iteration fails again, the test is restarted from the last save file, using the next larger FFT length...
All that was in previous LLR versions, but:
- Continuing the test using the next FFT length is now also forced if too many errors were encountered (MAX_ERROR_COUNT=5).
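
The recovery rules listed above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the LLR source: the names must_check_roundoff(), on_roundoff() and recovery_action are hypothetical, iterations are assumed to be numbered from 1, and the error counter is assumed to count only reproducible errors. The 0.40 threshold, the 50/128 check schedule and MAX_ERROR_COUNT=5 are taken from the post.

```c
#define MAX_ROUNDOFF    0.40  /* threshold quoted in the post */
#define MAX_ERROR_COUNT 5     /* limit quoted in the post */

typedef enum {
    ACTION_CONTINUE,         /* error acceptable, carry on */
    ACTION_REDO_CAREFUL,     /* redo the iteration with the slower method */
    ACTION_RESTART_NEXT_FFT  /* restart from last save file, next larger FFT */
} recovery_action;

/* Nonzero when the roundoff check applies to this iteration:
   first and last 50 iterations, before a save, every 128th iteration. */
int must_check_roundoff(long iter, long total_iters, int about_to_save)
{
    return iter <= 50 || iter > total_iters - 50 ||
           about_to_save || iter % 128 == 0;
}

/* Decide what to do after a checked iteration.
   err           : observed roundoff error
   reproducible  : nonzero if repeating the iteration gave the same error
   careful_retry : nonzero if this iteration was already the careful redo
   error_count   : running count of reproducible errors (assumption) */
recovery_action on_roundoff(double err, int reproducible,
                            int careful_retry, int *error_count)
{
    if (err <= MAX_ROUNDOFF)
        return ACTION_CONTINUE;
    if (!reproducible || careful_retry)
        return ACTION_RESTART_NEXT_FFT;
    if (++*error_count > MAX_ERROR_COUNT)
        return ACTION_RESTART_NEXT_FFT;  /* too many errors: larger FFT */
    return ACTION_REDO_CAREFUL;
}
```

The point of the last rule is that a candidate with many "soft" errors moves to a larger FFT once, instead of running every remaining iteration in careful mode.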
On the other hand, two related features present in V 3.8.15 are maintained:
- To improve reliability, error checking may now be forced if the program is working near the current FFT limit. This feature may be adjusted using the option -oPercentFFTLimit=dd.d, the default value being 0.5; note that setting echk to one is generally not very time consuming: typically 5% more.
Also, this feature may be disabled by setting PercentFFTLimit to 0.0!

- For those who do not like to force error checking, I implemented a new option: -oNextFFTifNearLimit=1 (default is zero). If activated, and if the default FFT length at setup is too near the limit, then FFT_Increment is incremented by one, a message is displayed and the test is immediately restarted; indeed, in this case, echk can no longer be forced...
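
How the two options interact can be sketched as follows. This is a guess at the shape of the logic, not the LLR implementation. In particular, the post does not say how "near the limit" is measured, so the sketch assumes it is the remaining margin between the tested exponent and the largest exponent the default FFT length can handle, expressed as a percentage. The names near_limit_policy(), exponent and fft_max_exponent are hypothetical; PercentFFTLimit and NextFFTifNearLimit are the options named in the post.

```c
typedef enum {
    NEAR_NONE,               /* far enough from the limit: nothing special */
    NEAR_FORCE_ERROR_CHECK,  /* force echk for this test */
    NEAR_BUMP_FFT            /* increment FFT_Increment and restart */
} near_limit_action;

/* exponent             : size of the number under test (hypothetical measure)
   fft_max_exponent     : largest size the default FFT length supports
                          (hypothetical, assumed > 0)
   percent_fft_limit    : value of PercentFFTLimit (default 0.5, 0.0 disables)
   next_fft_if_near_limit : value of NextFFTifNearLimit (default 0) */
near_limit_action near_limit_policy(double exponent, double fft_max_exponent,
                                    double percent_fft_limit,
                                    int next_fft_if_near_limit)
{
    double margin_percent;

    if (percent_fft_limit <= 0.0)
        return NEAR_NONE;  /* feature disabled */

    /* Assumed definition of "near": remaining headroom in percent. */
    margin_percent = 100.0 * (fft_max_exponent - exponent) / fft_max_exponent;
    if (margin_percent >= percent_fft_limit)
        return NEAR_NONE;

    return next_fft_if_near_limit ? NEAR_BUMP_FFT : NEAR_FORCE_ERROR_CHECK;
}
```

Under this reading, a user picks one of two safety nets near the limit: pay the roughly 5% cost of error checking, or pay for a larger FFT from the start.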

I hope this would be satisfactory for the PG community!

Regards,
Jean

ATH 2015-08-02 23:33

[QUOTE=Batalov;406682]I found a range of test cases where the LLR residue is dependent on the FFT size chosen. (This is on AVX hardware.)[/QUOTE]

I tried the first one 831*2^5772880+1 in Prime95 with ErrorCheck=1 and SumInputsErrorCheck=1. It chose the default 384K FFT but right away it had errors:
Iteration: 19053/5772889, POSSIBLE ERROR: ROUND OFF (0.40625) > 0.40
Continuing from last save file.
Iteration: 49878/5772889, POSSIBLE ERROR: ROUND OFF (0.4375) > 0.40
Continuing from last save file.

So I restarted with forced FFT, the next one I could use was 480K, and it went fine with highest round off error at 0.00878:
831*2^5772880+1 is not prime. RES64: 5B74AAB763C326C8. We4: 9BDEBA17,00000000
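
As background for the "ROUND OFF (0.40625) > 0.40" messages: in floating-point FFT multiplication every output value should be an integer, and the roundoff error is the largest distance of any output from its nearest integer. Once that distance approaches 0.5, rounding can pick the wrong integer. The following is a minimal sketch of that general idea, not code from Prime95 or gwnum; the function names are hypothetical, and the cast-based rounding assumes the values fit in a long long.

```c
#include <stddef.h>

/* Distance from x to the nearest integer, in the range [0, 0.5].
   Assumption: |x| is small enough to be converted to long long. */
double roundoff_of(double x)
{
    double nearest = (double)(long long)(x >= 0.0 ? x + 0.5 : x - 0.5);
    double d = x - nearest;
    return d < 0.0 ? -d : d;
}

/* Largest roundoff error over n values (0.0 for an empty array). */
double max_roundoff(const double *v, size_t n)
{
    double worst = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        double d = roundoff_of(v[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A value such as 12.40625 would be reported as a roundoff error of 0.40625 and trip the 0.40 threshold, whereas the 0.00878 maximum seen with the 480K FFT leaves a wide safety margin.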

pepi37 2015-08-03 21:09

[QUOTE]
- For those who do not like to force error checking, I implemented a new option: -oNextFFTifNearLimit=1 (default is zero). If activated, and if the default FFT length at setup is too near the limit, then FFT_Increment is incremented by one, a message is displayed and the test is immediately restarted; indeed, in this case, echk can no longer be forced...
Regards,
Jean[/QUOTE]

Jean, I am a little confused. You wrote this yesterday, but that is already implemented in 3.8.15 from 7.5 this year.
So what new option is in fact implemented? Or is the option the same, but works in a different way?
Thanks for the reply

Jean Penné 2015-08-04 04:51

[QUOTE=pepi37;407192]Jean, I am a little confused. You wrote this yesterday, but that is already implemented in 3.8.15 from 7.5 this year.
So what new option is in fact implemented? Or is the option the same, but works in a different way?
Thanks for the reply[/QUOTE]
You are right! No new option is implemented, but all the roundoff error code has been rewritten, and tested intensively using ridiculously low MaxRoundoff values of 0.1 or less...
Regards,
Jean


All times are UTC.
