#144 |
|
Romulan Interpreter
Jun 2011
Thailand
9657₁₀ Posts |
No, we cannot conclude that. Our impression is that the tool works correctly, and very nicely. The list of results in fact proved that, in a way. The occasional mismatches mean nothing: the original residues may be wrong, or the hardware may fail on rare occasions. That is why we need the random shifting implemented.
Last fiddled with by LaurV on 2017-05-23 at 05:40 |
#145 | |
|
"David"
Jul 2015
Ohio
11·47 Posts |
Quote:
#146 |
|
Serpentine Vermin Jar
Jul 2014
3,313 Posts |
I've been trying to keep up with triple-checking the mismatches returned by AirSquirrels lately, although some may be further back in my queue. Right now my queue of work is only about a week out, so any other assignments I have now should be done fairly soon.
#147 | |
|
"Mihai Preda"
Apr 2015
3×457 Posts |
Quote:
The randomly selected offset is printed when a new exponent is started, e.g.
Code:
LL FFT 4096K (1024*2048*2) of 60000757 (14.31 bits/word) offset 45732555 iteration 1120000
There may be bugs, as usual, with this new feature. The perf impact is about 0.5%. The offset can be "forced" to a given value with the command-line flag -offset <value> (it will take effect on a new exponent).
#148 | |
|
"Mihai Preda"
Apr 2015
1371₁₀ Posts |
Quote:
#149 | |
|
∂2ω=0
Sep 2002
República de California
19·613 Posts |
Quote:
2. The most obvious kind of bug here is the sort that bit George way back in v17 of prime95. IIRC he neglected to cast the shift value to 64-bit before doing some operation on it (maybe reading the initial shift value for the run from the savefile and computing the resulting shift for the current iteration?), and that operation needed an intermediate value computed at double the 32-bit width. If you simply write the current shift value to the checkpoint file, that shouldn't be an issue, since you only deal with modular doublings on each iteration.
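A minimal sketch of the width pitfall described above. This is not George's actual code; the function names are hypothetical, and Python's unbounded integers are masked to imitate a product that was truncated to 32 bits. The exponent and offset are borrowed from the gpuOwL log line quoted in this thread, and the multiplier 1024 is an arbitrary choice for illustration.

```python
MASK32 = 0xFFFFFFFF  # imitates a 32-bit unsigned intermediate

def mulmod_narrow(a, b, p):
    """Multiply mod p, but keep only the low 32 bits of the product first (the pitfall)."""
    return ((a * b) & MASK32) % p

def mulmod_wide(a, b, p):
    """Multiply mod p with the full product kept, as a 64-bit intermediate would."""
    return (a * b) % p

def double_mod(shift, p):
    """One modular doubling; 2*shift stays below 2**32 whenever p < 2**31."""
    return (2 * shift) % p

p = 60000757       # exponent from the log line in this thread
shift = 45732555   # offset from the same log line
factor = 1024      # hypothetical multiplier, i.e. 2**10

# The two results differ because shift * factor does not fit in 32 bits.
print(mulmod_narrow(shift, factor, p), mulmod_wide(shift, factor, p))
```

A per-iteration doubling never needs the wide product for exponents below 2^31, which is why storing the running shift sidesteps the issue.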
#150 | ||
|
"Mihai Preda"
Apr 2015
3×457 Posts |
Quote:
I do attempt to compute the Res64 correctly. In fact, -selftest checks that the random offset does not affect the residues.
Quote:
OTOH what I save in the checkpoint file is only the "initial" offset, not the running offset. The initial offset is needed for writing to the results file, so it has to be saved anyway. On checkpoint load, a modular exponentiation is done to find the "offset at current iteration".
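A rough sketch of recovering the running offset from the saved initial one, assuming the offset simply doubles modulo the exponent p on every iteration (as the modular doublings mentioned earlier in the thread suggest). The function names are hypothetical and this is not gpuOwL's code.

```python
def offset_at_iteration(initial_offset, iteration, p):
    """Running offset after `iteration` squarings: initial * 2**iteration mod p."""
    return (initial_offset * pow(2, iteration, p)) % p

def offset_by_doubling(initial_offset, iteration, p):
    """Same value obtained the slow way, one modular doubling per iteration."""
    offset = initial_offset % p
    for _ in range(iteration):
        offset = (2 * offset) % p
    return offset

p = 60000757        # exponent from the log line in this thread
initial = 45732555  # offset from the same log line

# The closed form and the step-by-step doubling agree.
print(offset_at_iteration(initial, 1000, p) == offset_by_doubling(initial, 1000, p))
```

Python's three-argument pow does the modular exponentiation in a number of steps that grows with the bit length of the iteration count, so a checkpoint load stays cheap even deep into a test.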
#151 | |
|
"Mr. Meeseeks"
Jan 2012
California, USA
2³·271 Posts |
Quote:
I'm playing around with v0.3 atm. I'm getting 5.15 ms/iter, compared to 5 ms/iter with v0.2 without the offset.
#152 | |
|
Serpentine Vermin Jar
Jul 2014
110011110001₂ Posts |
Quote:
If the shift was smaller than the exponent (do I have that right?) it would cause a problem. It was rare, especially once the exponent sizes got larger, but we did find a few cases where it was an issue. Specifically I noticed it when doing triple checks of every exponent below 3M or whatever, and the shift count in some cases was smaller than 3e6 so I was getting residues that didn't match. It might not apply to your algorithm... I forget what the exact problem was (if I ever even knew the details)
#153 | |
|
∂2ω=0
Sep 2002
República de California
10110101111111₂ Posts |
Quote:
#154 |
|
Romulan Interpreter
Jun 2011
Thailand
3²×29×37 Posts |
1. No change needed! We are good. Thanks a billion. There is no reason to change the shift in the middle of the work (on the contrary, I see that as detrimental to the success of testing: some guy may run a single test to 99%, then split it in two, change the offset for one, and report both LL and DC, for credit reasons or whatever).
Only some more tests are needed before we can say whether the shift works as expected.
2. The shift should never be larger than the exponent (as Ernst says, it is just a rotation of a value with a single bit set). In fact, we do not even need the whole range of p bits, just a few different starting points to give the FFT different data to play with for LL and for DC. If it is more convenient for you to limit the initial shift to 16 or even 8 bits, then do so.
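To make the "rotation" picture concrete, here is a toy sketch of a shifted Lucas-Lehmer test on small exponents. It uses plain Python integers rather than an FFT, the function names are hypothetical, and it only illustrates the arithmetic, not how gpuOwL or prime95 implement it. The assumption is the usual one: multiplying by 2^k modulo 2^p - 1 is a left rotation of a p-bit word by k bits, so the shift doubles (mod p) at each squaring and the subtracted 2 must be rotated along with it.

```python
def rotl(x, k, p):
    """Rotate the p-bit value x left by k bits (multiplication by 2**k mod 2**p - 1)."""
    k %= p
    mask = (1 << p) - 1
    return ((x << k) | (x >> (p - k))) & mask

def ll_residue(p, shift=0):
    """Final Lucas-Lehmer residue for 2**p - 1 (p >= 3), computed on shifted data."""
    mp = (1 << p) - 1
    k = shift % p
    x = rotl(4, k, p)                    # shifted starting value 4
    for _ in range(p - 2):
        k = (2 * k) % p                  # squaring doubles the shift
        x = (x * x - rotl(2, k, p)) % mp # the "-2" is a single bit, rotated by k
    return rotl(x, p - k, p) % mp        # rotate back to the unshifted residue

# Every starting shift gives the same final residue for a given exponent.
print(all(ll_residue(11, s) == ll_residue(11) for s in range(11)))
```

A shift of p or more adds nothing new, since it reduces to the same rotation as the shift taken mod p.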