Win 8.1, i7-3930K, GTX1080, gpuowl 0.5
Doing a self-test I got an error -9999, and then the exe crashed. [QUOTE]LL FFT 4096K (1024*2048*2) of 77002949 (18.36 bits/word) offset 2000 iteration 0 error -9999[/QUOTE] Any ideas? I couldn't find that error in this thread using a quick search. |
I'm not sure about this, but after looking at the source code on GitHub, in
clwrap.h, lines 11-12: [CODE]
#define CHECK(err) { int e = err; if (e != CL_SUCCESS) { fprintf(stderr, "error %d\n", e); assert(false); }}
#define CHECK2(err, mes) { int e = err; if (e != CL_SUCCESS) { fprintf(stderr, "error %d (%s)\n", e, mes); assert(false); }}
[/CODE]
These must be the lines printing the message (I searched all the sources for anything that prints "error ..."). For CL_SUCCESS see [URL]https://www.khronos.org/registry/OpenCL/sdk/1.1/docs/man/xhtml/clGetEventInfo.html[/URL]. So -9999 is an OpenCL error code. From [URL]https://streamhpc.com/blog/2013-04-28/opencl-error-codes/[/URL]: [CODE]Errors thrown by vendors
Code    Vendor   Function(s)              Description
-9999   NVidia   clEnqueueNDRangeKernel   Illegal read or write to a buffer[/CODE] |
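As a rough illustration of how the bare "error -9999" line could be made easier to read, here is a minimal sketch that maps a few status codes to names and formats them in the style of the CHECK2 message. The helpers `cl_error_name` and `format_cl_error` are hypothetical and not part of gpuowl. The numeric values are copied in by hand (the standard ones as defined in CL/cl.h, the -9999 entry from the vendor table quoted above) so that the sketch compiles without the OpenCL headers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in gpuowl): name a handful of OpenCL status codes.
   Standard values are those defined in CL/cl.h; -9999 is the undocumented
   NVidia code from the vendor table quoted in this thread. */
const char *cl_error_name(int e) {
    switch (e) {
    case 0:     return "CL_SUCCESS";
    case -4:    return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:    return "CL_OUT_OF_RESOURCES";
    case -6:    return "CL_OUT_OF_HOST_MEMORY";
    case -11:   return "CL_BUILD_PROGRAM_FAILURE";
    case -30:   return "CL_INVALID_VALUE";
    case -9999: return "NVidia: illegal read or write to a buffer";
    default:    return "unknown";
    }
}

/* Hypothetical helper: build a message shaped like the one CHECK2 prints,
   "error <code> (<text>)". Returns what snprintf returns. */
int format_cl_error(char *buf, size_t n, int e) {
    return snprintf(buf, n, "error %d (%s)", e, cl_error_name(e));
}
```

A CHECK-style macro could pass `cl_error_name(e)` as the message, so a crash in clEnqueueNDRangeKernel would at least name the likely cause.
|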
Vega timings. I had to use the legacy kernels; otherwise every residue came out as 0xfffff... This was done on a stack known to still be pretty raw and unoptimized (ROCm 1.6.127).
(Testing on 77002949)
[B]RX64 Air (1630 MHz, but quickly dropped to 1536 MHz due to heat):[/B] ms/iter: 2.048
[B]FE Liquid (1600 MHz):[/B] ms/iter: 1.898
Similar to Fury at present. |
Thank you David for the info. Yes I can repro this now (just got my RX Vega moments ago). The "-2" residue is clearly a problem, I'll investigate.
|
Updated timings from the Windows driver (which I'm told is still using the previous closed compiler, while Linux has transitioned to LLVM/ROCm, apparently regrettably).
RX64 Air: ms/iter: 1.639
FE: unknown, it isn't recognized by the same driver as the RX.
This is with stock clocks. If I use the "stable" 1000 MHz memory clocks that pass self-test I get 1.600.
Both kernels work as expected in Windows, so this is likely an LLVM regression. |
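To put the ms/iter figures in perspective, here is a small sketch that converts them into wall-clock time for a full test. It assumes an LL test of exponent p takes p-2 squaring iterations and that the per-iteration time stays constant. The function name `ll_test_hours` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimated hours for one LL test of 2^p - 1,
   assuming p - 2 iterations at a constant ms_per_iter. */
double ll_test_hours(uint32_t exponent, double ms_per_iter) {
    assert(exponent > 2 && ms_per_iter > 0.0);
    double iterations = (double)exponent - 2.0;
    return iterations * ms_per_iter / 3.6e6;   /* 3.6e6 ms per hour */
}
```

For 77002949, the 1.639 ms/iter Windows figure works out to roughly 35 hours, against roughly 44 hours at 2.048 ms/iter.
|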
[QUOTE=airsquirrels;465777]Updated timings from the Windows driver (which I'm told is still using the previous closed compiler, while Linux has transitioned to LLVM/ROCm, apparently regrettably).
RX64 Air: ms/iter: 1.639
FE: unknown, it isn't recognized by the same driver as the RX.
This is with stock clocks. If I use the "stable" 1000 MHz memory clocks that pass self-test I get 1.600.
Both kernels work as expected in Windows, so this is likely an LLVM regression.[/QUOTE] I've been investigating this for a while now on Linux. I'm mystified by the behavior I observe. Clearly it is something related to memory/cache and timing, but I can't track it down to a cause in my code; frustrating. |
[QUOTE=preda;466027]I've been investigating this for a while now on Linux. I'm mystified by the behavior I observe. Clearly it is something related to memory/cache and timing, but I can't track it down to a cause in my code; frustrating.[/QUOTE]
I saw your post over on ROCm. I've seen numerous regressions since the switch to LLVM from the older proprietary compiler. In my opinion, switching amdgpu-pro to ROCm was premature, and I'm disappointed that they essentially made a beta compiler the only option in their production driver. Clearly they have the compiler code for Vega: Windows uses it. |
Windows binary from the latest commit (676be1c).
[B]NOTE: Untested![/B] |
[QUOTE=kracker;466482]Windows binary from the latest commit (676be1c).
[B]NOTE: Untested![/B][/QUOTE] I would hold off a bit before using 1.0; it's still a work in progress and I'd like to do more testing myself. Also, it's not backwards compatible with savefiles. PS: I'll make a post with a summary of the changes when it's good to go. |
gpuOwl 1.0
[QUOTE=preda;466484]I'll make a post with a summary of the changes when it's good to go.[/QUOTE]
As promised, here's the note with recent changes to gpuOwl:

gpuOwl 1.0 ( [url]https://github.com/preda/gpuowl[/url] )

The "1.0" is not an indication of stability or freedom from bugs, only of radical changes from the previous version (0.6), so expect the usual amount of bugs (or more) and rough edges.

gpuOwl does not implement LL (the Lucas-Lehmer primality test) anymore. Instead, it does the "PRP-3" probable prime test, which is: for M = 2^p-1, compute R := 3^(2^(p-1)) mod M, i.e. start from 3 and square p-1 times modulo M. If R == -3 (mod M), then M is a probable prime. In this situation, an LL test should be run to verify that the *probable* prime is actually prime. (In practice though, somebody lucky enough to find a PRP-3 probable Mersenne prime should probably simply claim a prime, and the community would do the required LL validation.)

The change to PRP-3 is motivated by a nice self-validating side computation described by Robert Gerbicz: [url]http://www.mersenneforum.org/showpost.php?p=465431&postcount=88[/url] . This affords confidence in the correctness of the result, at small additional cost.

Other main changes:
- The residue computed is not an LL residue anymore. It should not be compared, reported or mixed with LL residues. To make the distinction easier, the new residue looks different, e.g. "P3-0223c56fae4cfdef6b" (you may notice it's a bit longer too; this is because the last 2 hex digits are check digits).
- While the end residue for an LL prime is 0, the end residue for this PRP-3 "probable prime" is "-3", which shows up as 0xfffffffffffffffc. Not as nice as 0, I agree :). E.g. P3-fffffffffffffffc0d with the check digits.
- All LL error protections have been removed: loop detection, Jacobi check, rounding error, etc. (as the new check is stronger).

1.0 looks OK to me for general use now. Looking for bug reports & feedback.

PS: on RX Vega, use the "-legacy" flag. |
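The test and the side check described above can be sketched for toy exponents (p up to 31, so everything fits in 64-bit integers). This is only an illustration of the arithmetic, not gpuOwl's FFT-based code; the function names and the `fault_at` parameter are made up here. The check follows my reading of the Gerbicz post: keep a running product d of every L-th value of the squaring sequence, and after each block of L squarings verify that the new d equals 3 * d_old^(2^L) mod M. The sketch verifies after every block, which roughly doubles the work; a real implementation would verify far less often to keep the cost small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* (a * b) mod m. Safe here because m < 2^32, so a * b fits in 64 bits. */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t m) {
    return (a * b) % m;
}

/* PRP-3 residue for M = 2^p - 1: start from 3 and square p-1 times. */
uint64_t prp3_residue(unsigned p) {
    assert(p >= 3 && p <= 31);
    const uint64_t m = (1ULL << p) - 1;
    uint64_t u = 3;
    for (unsigned i = 1; i < p; i++) u = mulmod(u, u, m);
    return u;
}

/* M is a base-3 probable prime when the residue is -3 mod M, i.e. M - 3. */
bool is_prp3(unsigned p) {
    return prp3_residue(p) == ((1ULL << p) - 1) - 3;
}

/* Same squarings, with a running product d of every L-th value.
   After each block of L squarings, d_new must equal 3 * d_old^(2^L) mod M.
   fault_at != 0 flips the low bit at that iteration to simulate a hardware
   error. Returns false as soon as a check fails. Iterations after the last
   multiple of L are not covered by a check in this sketch. */
bool prp3_checked(unsigned p, unsigned L, unsigned fault_at, uint64_t *residue) {
    assert(p >= 3 && p <= 31 && L >= 1);
    const uint64_t m = (1ULL << p) - 1;
    uint64_t u = 3, d = 3;
    for (unsigned i = 1; i < p; i++) {
        u = mulmod(u, u, m);
        if (i == fault_at) u = (u ^ 1) % m;          /* injected fault */
        if (i % L == 0) {
            uint64_t expected = d;
            for (unsigned k = 0; k < L; k++) expected = mulmod(expected, expected, m);
            expected = mulmod(expected, 3, m);       /* 3 * d_old^(2^L) */
            d = mulmod(d, u, m);                     /* d_new = d_old * u */
            if (d != expected) return false;
        }
    }
    *residue = u;
    return true;
}
```

For example, p = 7 (M = 127) gives the residue 124, which is M - 3, while p = 11 (M = 2047 = 23 * 89) does not pass. Calling `prp3_checked(7, 2, 1, &r)` injects a fault at the first squaring, and the check at iteration 2 rejects the run.
|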
gpuOwl 1.0 does not load savefiles from previous versions (a PRP run cannot load or continue an LL run).
So please stay with the LL version until the ongoing exponent is done, then switch and start a new exponent from the beginning. |