[QUOTE=kriesel;525123]See [M]100002337[/M]. Factor was found in stage 1. B1 and B2 were included in the report. It is customary to report B1 and B2 as the B1 value when the factor is found in stage 1 and stage 2 is not fully performed. That way, the recorded bounds reflect actual factoring limits completed, and processing credit is properly computed, not overestimated.[/QUOTE]
OK, I think I fixed this in a recent commit: if a factor is found in stage1, do not include B2 in the report. |
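As a rough illustration of the reporting rule discussed above (this is not GpuOwl's actual code; the struct, its field names, and the output format are all made up for the example), a report writer could drop B2 whenever the factor came out of stage 1:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical P-1 result record; names are illustrative, not GpuOwl's. */
typedef struct {
    unsigned exponent;
    unsigned long long b1;
    unsigned long long b2;
    bool factor_found;
    int factor_stage; /* 1 or 2; only meaningful when factor_found is true */
} pm1_result;

/* Write the bounds part of a report into buf.
 * B2 is left out when the factor was found in stage 1, because stage 2
 * was not completed and should not be credited.
 * Returns the number of characters written, as snprintf does. */
int format_bounds(const pm1_result *r, char *buf, size_t n) {
    assert(r != NULL && buf != NULL);
    if (r->factor_found && r->factor_stage == 1) {
        return snprintf(buf, n, "B1=%llu", r->b1);
    }
    return snprintf(buf, n, "B1=%llu, B2=%llu", r->b1, r->b2);
}
```

Both the "no factor" case and the "factor in stage 2" case keep B2, since stage 2 was actually run with that bound.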
[QUOTE=preda;525076]Hi, I just realized that P-1 stage2 in GpuOwl was broken since v6.5-51-gefc3c9f[/QUOTE]
Preda is being kind in not mentioning this is my fault. I made several small optimizations in gpuowl and, not knowing the code very well, assumed my Gerbicz PRP testing would catch any bugs. However, I made a typo in a code path only used by P-1. Sorry about that. |
[QUOTE=kriesel;525113]The news is not as good on NVIDIA. This was an attempt on a GTX 1080 Ti.
[/QUOTE] I was using an AMD-specific extension to get the amount of free RAM on the GPU. I updated the code to also work when said extension is not available. For such a situation (no free GPU RAM info) I added a flag which specifies a limit on the GPU RAM GpuOwl can use: -maxAlloc <size in MB> Feel free to try again on NVIDIA, to see what problem we hit next. BTW, -maxAlloc also works in general, hard-limiting the amount of GPU memory used by a GpuOwl instance. |
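One way to picture the fallback described above is the sketch below. It is only an assumption about how such a limit could be combined with a free-memory query, not GpuOwl's real logic; the function name and the convention that 0 means "unknown" or "not set" are invented for the example.

```c
#include <assert.h>

#define BYTES_PER_MB (1024ULL * 1024ULL)

/* Decide how many bytes of GPU memory may be used.
 *
 * free_bytes:   free GPU memory reported by the driver, or 0 when the
 *               driver offers no way to query it (hypothetical convention).
 * max_alloc_mb: user-supplied limit in MB, as with -maxAlloc, or 0 when
 *               the flag was not given (hypothetical convention).
 */
unsigned long long alloc_budget(unsigned long long free_bytes,
                                unsigned long long max_alloc_mb) {
    unsigned long long limit = max_alloc_mb * BYTES_PER_MB;
    if (free_bytes == 0) {
        return limit;      /* no free-memory info: rely on the user limit */
    }
    if (limit == 0) {
        return free_bytes; /* no user limit: use what the driver reports */
    }
    /* Both known: the user limit acts as a hard cap. */
    return free_bytes < limit ? free_bytes : limit;
}
```

With this shape, the user limit is the only source of information when the query is unavailable, and it still acts as a hard cap when the query works.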
memory query etc.
[QUOTE=preda;525155]I was using an AMD-specific extension to get the amount of free RAM on the GPU. I updated the code to also work when said extension is not available. For such a situation (no free GPU RAM info) I added a flag which specifies a limit on the GPU RAM GpuOwl can use:
-maxAlloc <size in MB> Feel free to try again on NVIDIA, to see what problem we hit next. BTW, -maxAlloc also works in general, hard-limiting the amount of GPU memory used by a GpuOwl instance.[/QUOTE]Will give it a try. It's very surprising that querying available memory is available in OpenGL but not a standard part of OpenCL. Seems like a glaring omission for performance. [URL]https://stackoverflow.com/questions/3568115/how-do-i-determine-available-device-memory-in-opencl[/URL] GPU-Z and nvidia-smi have ways of querying gpu memory usage. I have a few command-line runs of gpuowl P-1 going on my RX480 as benchmarks versus widely spaced exponent values and tests of how high it can go. Presumably the new commit's -maxAlloc value should leave some space uncommitted, perhaps a GB, for other usage of the gpu ram. Thanks to you and George for your efforts, which also unavoidably result in a few bugs escaping. I confirmed by a quick test near the beginning that a halted P-1 run starts over from the beginning; there's no save file. With run time of more than a day on an RX480 for M300M, that's a drawback. |
[QUOTE=preda;493560]Yep, that means that the LLVM GCN backend does not know how to translate some operation to GCN. In this case it may be about a 128-bit SUB.
Are you using ROCm or amdgpu-pro? if ROCm, which version? If you're on most recent ROCm (i.e. 1.8.2), we should let them know ("ROCm issues") about it.[/QUOTE] I know this is a bit old, but I have no other reference regarding this same problem. Can you somehow update the LLVM version that OpenCL uses to build the kernel on Windows while using the same OpenCL APP SDK 2.0? |
well the problem happens when I try to use an uint64_t type in a kernel
|
[QUOTE=preda;525076]Hi, I just realized that P-1 stage2 in GpuOwl was broken since v6.5-51-gefc3c9f
P-1 should be fixed starting with v6.6 (pending more validation) If somebody had the bad luck of doing P-1 with an affected version, the stage2 part of the "no factor" results is not valid. (stage1 is good though). Any factor-found results are good, too. Independently, I recently changed the memory allocation in stage2, which was a problem in the past (reported by Ken) (this was how I realized there's a bug). While this shows lack of testing on my part, it's also an advice to self-validate: please do a couple of P-1 on known results (that can be found in the folder test-pm1 in GpuOwl source) before starting serious P-1 work.[/QUOTE] George, any idea how many potentially bad P-1 results that need to be redone? |
[QUOTE=ixfd64;525176]George, any idea how many potentially bad P-1 results that need to be redone?[/QUOTE]
Madpoo found about 50. |
[QUOTE=Prime95;525185]Madpoo found about 50.[/QUOTE]
Perhaps he and uncwilly could put together a thread for cleaning that up. |
How much memory do those P-1 tasks need? Can they be run without a GIMPS account? I can dedicate a couple of cores to the cause.
|
[QUOTE=PontiacGTX;525171]well the problem happens when I try to use an uint64_t type in a kernel[/QUOTE]
uint64_t is not a type in OpenCL. Use "unsigned long" instead, which is guaranteed (in OpenCL) to be 64 bits. |
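For anyone hitting the same error, here is a minimal sketch of what that advice looks like in practice. The kernel name, its arguments, and the `u64` typedef are made up for illustration; the kernel text is held in a C string the way a host program would hand it to `clCreateProgramWithSource`. On the host side `uint64_t` is fine, it is only inside the kernel source that `ulong` (the OpenCL C shorthand for `unsigned long`) should be used.

```c
#include <assert.h>
#include <stdint.h>

/* Host buffers must match the device's 64-bit ulong element size. */
static_assert(sizeof(uint64_t) == 8, "host element must be 64 bits");

/* OpenCL C source for an illustrative kernel. Note there is no uint64_t
 * here: ulong is used instead, via a local typedef. */
const char *const kernel_src =
    "typedef ulong u64; /* ulong is 64 bits in OpenCL C */\n"
    "__kernel void add_u64(__global const u64 *a,\n"
    "                      __global const u64 *b,\n"
    "                      __global u64 *out) {\n"
    "    size_t i = get_global_id(0);\n"
    "    out[i] = a[i] + b[i];\n"
    "}\n";

/* Host-side reference for the same element-wise operation, useful for
 * checking results read back from the device. Wraps modulo 2^64. */
uint64_t add_u64_host(uint64_t a, uint64_t b) {
    return a + b;
}
```

If existing kernel code already uses `uint64_t` everywhere, adding `typedef ulong uint64_t;` at the top of the kernel source may be a smaller change than renaming, though that is a workaround and not something suggested in the thread.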