#1167 | |
|
Nov 2010
Germany
3×199 Posts |
Quote:
It must be a difference in rounding that leads to a higher-than-expected error. I cannot reproduce this error on my H/W. Can you reproduce it? The factor is not particularly close to the limit of this kernel, and the exponent does not have a long run of ones in its binary form ... I don't see why this test should fail. I'll provide you with a special test version so that we can analyze this failure. |
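The two properties mentioned in this post (how large the factor is, and whether the exponent has a long run of ones in binary) are easy to check by hand. The following is only a minimal sketch: the helper names are made up for illustration, and the values p = 11, f = 23 are a textbook example (2^11 - 1 = 23 × 89), not the failing test case from this thread.

```python
def longest_run_of_ones(p: int) -> int:
    """Length of the longest run of consecutive 1 bits in p (hypothetical helper)."""
    return max((len(run) for run in bin(p)[2:].split("0")), default=0)


def divides_mersenne(p: int, f: int) -> bool:
    """True if f divides 2**p - 1, i.e. if 2**p is congruent to 1 modulo f."""
    return pow(2, p, f) == 1


# Illustrative values only, not taken from the selftest that fails here.
p, f = 11, 23
print("longest run of ones in exponent:", longest_run_of_ones(p))
print("factor size in bits:", f.bit_length())
print("f divides 2^p - 1:", divides_mersenne(p, f))
```

The factor's bit length is what would be compared against the bit range a given kernel supports.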
|
|
|
|
|
|
#1168 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23×271 Posts |
Yep, just ran part of -st2 again on my HD4600. Same failure.
|
|
|
|
|
|
#1169 | |
|
Jun 2010
Pennsylvania
2·467 Posts |
Quote:
Can I enter, say "45000000" to "47000000", or do I have to enter specific starting/ending exponents? (I was able to get manual assignments in the normal manner if no range was indicated.) Rodrigo |
|
|
|
|
|
|
#1170 |
|
Romulan Interpreter
Jun 2011
Thailand
3×3,221 Posts |
|
|
|
|
|
|
#1171 | |
|
If I May
"Chris Halsall"
Sep 2002
Barbados
10011000110110₂ Posts |
Quote:
But such work is easily available from GPU72 -- both low DCTF'ing and low LLTF'ing. Please note that LLTF'ing is the most needed at the moment, and the deeper the better. But those whose cards are better suited to lower bit levels are encouraged to work there. |
|
|
|
|
|
|
#1172 | |
|
Jun 2010
Pennsylvania
3A6₁₆ Posts |
Thanks, LaurV and Chris.
I guess I got sidetracked by this part: Quote:
Rodrigo |
|
|
|
|
|
|
#1173 | |
|
Jun 2005
USA, IL
193 Posts |
I get the same failure on my HD2500.
|
|
|
|
|
|
|
#1174 |
|
Nov 2010
Germany
1125₈ Posts |
Thanks for the confirmation.
I'm troubleshooting this with kracker, but I don't have enough time to make good progress on it. The next step would be to build a version where tracing shows exactly where the calculations go wrong ... |
|
|
|
|
|
#1175 | |
|
Dec 2012
2×139 Posts |
Quote:
Feel free to say no or to put it at the end of your to-do list. I can stick with the GPU sieve for a while longer. I seem to be the only one wanting it, and I don't expect you to go out of your way or anything.
Last fiddled with by Jayder on 2014-08-11 at 13:44 |
|
|
|
|
|
|
#1176 | |
|
Nov 2010
Germany
255₁₆ Posts |
Quote:
Just a quick update about the HD4600/2500 selftest failure: I analyzed kracker's data and the code. The reason is that the HD4600 has slightly different rounding behavior. I also noticed that, on AMD devices too, the code walks dangerously close to the limit of the available precision. Even though all tests succeed, the warning lights that Oliver once built into the code (CHECKS_MODBASECASE) do light up in the 15_82/15_83 and 15_88 kernels.

To fix that, I finally made a long-overdue attempt: base the initial division on double instead of float. This allows doing the div_180_90 in two steps instead of five, with only one big multiplication in between instead of four. Result: no more CHECKS_MODBASECASE issues (plenty of safety bits), and 1.5% faster overall (on HD7950), even though the processing speed for doubles is just 4:1.

I will run a few tests overnight and then probably send out 0.15pre2 for testing - it should at least fix the IntelHD issue. I'll then test whether the smaller kernels would also benefit from using doubles, and what the performance looks like on mid-range and lower-end GPUs, where the performance for doubles is just 16:1. Well, maybe after my vacation
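To illustrate why a double-based estimate needs fewer division steps than a float-based one, here is a minimal sketch in Python. It is not mfakto's div_180_90: the function names, the safety margins (2^-20 and 2^-48) and the sample operands are assumptions chosen for this illustration. Each step estimates the remaining quotient from the top bits of the remainder and a floating-point reciprocal of the divisor, deliberately underestimates it a little so the remainder never goes negative, and subtracts.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def divide_by_estimates(n: int, d: int, use_f32: bool):
    """Divide n by d using repeated floating-point quotient estimates.

    Returns (quotient, remainder, steps). Illustrative only; the safety
    margins are assumptions sized to cover the rounding of each format.
    """
    safety = 1.0 - 2.0 ** (-20 if use_f32 else -48)
    sd = max(d.bit_length() - 53, 0)       # keep at most 53 top bits of d
    d_top = float(d >> sd)
    recip = to_f32(1.0 / to_f32(d_top)) if use_f32 else 1.0 / d_top

    q, r, steps = 0, n, 0
    while r >= d:
        sr = max(r.bit_length() - 53, 0)   # keep at most 53 top bits of r
        r_top = float(r >> sr)
        est = to_f32(to_f32(r_top) * recip) if use_f32 else r_top * recip
        # Turn the slightly reduced estimate back into an exact integer.
        m, e = math.frexp(est * safety)
        mant = int(m * 2.0 ** 53)
        shift = e - 53 + sr - sd
        qi = mant << shift if shift >= 0 else mant >> -shift
        qi = max(qi, 1)                    # r >= d, so at least 1 is safe
        q += qi
        r -= qi * d
        steps += 1
    return q, r, steps


# Hypothetical operands: a 180-bit dividend and a 90-bit divisor.
n = (1 << 179) + 987654321987654321987654321
d = (1 << 89) + 12345678901234567
for name, use_f32 in (("float", True), ("double", False)):
    q, r, steps = divide_by_estimates(n, d, use_f32)
    print(name, "steps:", steps, "correct:", (q, r) == divmod(n, d))
```

Both variants arrive at the exact quotient, but the float path resolves only about 20 quotient bits per step, while the double path resolves nearly 50, so it needs noticeably fewer steps and fewer big multiply-and-subtract operations in between.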
|
|
|
|
|
|
|
#1177 |
|
Nov 2010
Germany
3×199 Posts |
Dear mfakto-testers,
I have now put the Windows 64-bit version of mfakto-0.15pre2 on the FTP server. I'd appreciate it if you could test it on the various systems you have access to:
Additional performance testing: as the new division algorithm is based on double precision, I need performance results from as many different devices as possible:
|
|
|
|