gpuOwL: an OpenCL program for Mersenne primality testing (https://www.mersenneforum.org/showthread.php?t=22204)

Fan Ming 2020-03-04 15:23

Any plan to add LL back? It would be quicker than CUDALucas and thus really helpful for LL double checks.

kriesel 2020-03-04 18:05

[QUOTE=Fan Ming;538866]Any plan to add LL back? It would be quicker than CUDALucas and thus really helpful for LL double checks.[/QUOTE]
Preda mused about doing so a while back. One can run gpuowl v0.5 or v0.6 for 4M FFT length LL DC on AMD in the meantime. I think wringing more performance out of the FFTs is a good use of development time. Those gains should be applicable to LL also if/when it is reimplemented in gpuOwL.
[URL]https://www.mersenneforum.org/showpost.php?p=489083&postcount=7[/URL]
Prime95/mprime are also quite good at LL DC and have the Jacobi check, which CUDALucas lacks.

kriesel 2020-03-04 23:08

[QUOTE=Prime95;538829]Yes. The GPU is no longer recognized at boot. Tomorrow, I'll try moving the card to a different machine. If that machine does not recognize the card, I will have to RMA it.[/QUOTE]
What was the rated power of the supply driving the card that's now bad?
I had a GTX480 fail, a Quadro 2000 fail, and two PCIe extender pads become questionable or dead, and now an ASRock Radeon VII is causing systems to fail to start and bring up Windows Startup Repair instead. All of these were first seen on the same mining frame, powered by a Rosewill Tokamak 1200W (Platinum). I have initiated an RMA on the ASRock, which lasted less than 3 weeks.

Prime95 2020-03-05 00:20

The rig has a 1000W power supply. Today I took the card over to another machine that has a speaker. I put it in as the only GPU. Powered on and got one long, three short beeps -- bad GPU.

My guess is that it died in one of the power off / power on cycles in the upgrading / system reinstall process.

PhilF 2020-03-05 00:33

[QUOTE=Prime95;538760]The dmesg error on the bricked GPU is "Direct firmware load for amdgpu/vega20_ta.bin failed with error -2".[/QUOTE]

I think this message is very telling. An interruption during any firmware update can be (and usually is) fatal.

But that isn't your fault. I can't imagine that a failed firmware update that was silently forced upon you during an AMD software upgrade would void any warranty.

Prime95 2020-03-05 00:39

[QUOTE=PhilF;538910]I think this message is very telling. An interruption during any firmware update can be (and usually is) fatal.

But that isn't your fault. I can't imagine that a failed firmware update that was silently forced upon you during an AMD software upgrade would void any warranty.[/QUOTE]

I posted an update somewhere saying that the firmware error relates to the two working GPUs.

preda 2020-03-06 11:11

[QUOTE=Prime95;538908]The rig has a 1000W power supply. Today I took the card over to another machine that has a speaker. I put it in as the only GPU. Powered on and got one long, three short beeps -- bad GPU.

My guess is that it died in one of the power off / power on cycles in the upgrading / system reinstall process.[/QUOTE]

I had one R7 die similarly. I didn't understand exactly why, but it did happen when I was rebooting repeatedly while moving GPUs around. Luckily, it was under warranty and was not one of the GPUs that I "enhanced" by removing the logo, changing the thermal pad, etc., so I was able to RMA it.

Fan Ming 2020-03-06 13:17

[QUOTE=kriesel;538885]Preda mused about doing so a while back. One can run gpuowl v0.5 or v0.6 for 4M FFT length LL DC on AMD in the meantime. I think wringing more performance out of the FFTs is a good use of development time. Those gains should be applicable to LL also if/when it is reimplemented in gpuOwL.
[URL]https://www.mersenneforum.org/showpost.php?p=489083&postcount=7[/URL]
Prime95/mprime are also quite good at LL DC and have the Jacobi check, which CUDALucas lacks.[/QUOTE]

I know it's not hard to modify the code of gpuowl to do LL tests. I've already made an LL version from a relatively new version of gpuowl (not the newest, and without the Jacobi check; however, when running on Google Colab the result is reliable, so that is not a significant problem), and successfully reproduced the LL residue of M1000003. I'm running a real DC at ~50M now. However, the program works well only when merged middle is [B]not[/B] used. Once the merged middle option is used, it gives wrong results. I don't know what happened, since I haven't learned the details of merged middle, and I would also like an official (new) gpuowl with LL test.
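For readers who have not seen the test being discussed, here is a minimal sketch of the Lucas-Lehmer iteration in plain Python. It is not gpuowl code: the function names `lucas_lehmer_residue` and `res64` are made up for illustration, and it uses Python big integers instead of FFT multiplication, so it is only practical for small exponents (something like M1000003 would take far too long this way). The `res64` helper assumes the usual convention of reporting the low 64 bits of the final residue in hex.

```python
def lucas_lehmer_residue(p: int) -> int:
    """Return s_(p-2) mod (2**p - 1), with s_0 = 4 and s_(i+1) = s_i**2 - 2.

    A result of 0 means 2**p - 1 is prime (for odd prime p).
    """
    if p < 3:
        raise ValueError("this sketch expects an odd prime exponent p >= 3")
    m = (1 << p) - 1          # the Mersenne number M_p
    s = 4
    for _ in range(p - 2):    # p - 2 squarings in total
        s = (s * s - 2) % m
    return s


def res64(residue: int) -> str:
    """Low 64 bits of a residue as 16 hex digits (assumed reporting format)."""
    return f"{residue & 0xFFFFFFFFFFFFFFFF:016x}"


if __name__ == "__main__":
    for p in (7, 11, 13):
        r = lucas_lehmer_residue(p)
        print(p, "prime" if r == 0 else "composite", res64(r))
```

A double check amounts to a second, independent run arriving at the same final residue, which is why reproducing a known residue is a reasonable first sanity test for a modified program.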

Fan Ming 2020-03-06 14:10

[QUOTE=Fan Ming;539006]I know it's not hard to modify the code of gpuowl to do LL tests. I've already made an LL version from a relatively new version of gpuowl (not the newest, and without the Jacobi check; however, when running on Google Colab the result is reliable, so that is not a significant problem), and successfully reproduced the LL residue of M1000003. I'm running a real DC at ~50M now. However, the program works well only when merged middle is [B]not[/B] used. Once the merged middle option is used, it gives wrong results. I don't know what happened, since I haven't learned the details of merged middle, and I would also like an official (new) gpuowl with LL test.[/QUOTE]

Reason found. When Middle = 1 the program does not use middle (useMiddle == false and useMergedMiddle == false); however, the macro MERGED_MIDDLE is still defined in the OpenCL program. I don't know whether the newest commit fixes this problem.
Now the right result is produced when using merged middle. It is significantly faster than CUDALucas.
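To illustrate the kind of mismatch described here, the following is a rough sketch, in Python rather than gpuowl's actual C++, of deriving the OpenCL preprocessor defines from the same flags the host code uses. Everything except the names MERGED_MIDDLE, useMiddle and useMergedMiddle is an assumption: `build_defines`, `to_compiler_options`, `want_merged_middle` and the `MIDDLE` macro name are hypothetical, and this is not the fix that was actually applied.

```python
def build_defines(middle: int, want_merged_middle: bool) -> dict:
    """Derive kernel defines from the host-side decisions (hypothetical sketch)."""
    use_middle = middle != 1                               # Middle = 1 means no middle step
    use_merged_middle = use_middle and want_merged_middle  # merged middle needs a middle step
    defines = {"MIDDLE": middle}                           # assumed macro name
    if use_merged_middle:
        # Define the macro only when the host really takes the merged path,
        # so the kernels and the host code cannot disagree.
        defines["MERGED_MIDDLE"] = 1
    return defines


def to_compiler_options(defines: dict) -> str:
    """Format the defines as -Dname=value options for the OpenCL compiler."""
    return " ".join(f"-D{name}={value}" for name, value in defines.items())


if __name__ == "__main__":
    print(to_compiler_options(build_defines(1, True)))   # no MERGED_MIDDLE here
    print(to_compiler_options(build_defines(4, True)))
```

The point of the sketch is only that the macro and the host-side booleans come from one decision, instead of being set independently.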

paulunderwood 2020-03-06 17:11

kworker hell fix

After running gpuOwl for a week or so, the Linux "kworker" creeps up and uses nearly a full core, which detracts from LLR crunching on the CPU. A simple fix is to stop gpuOwl and immediately resume it. :cool:

kriesel 2020-03-06 19:02

[QUOTE=Fan Ming;539006]I know it's not hard to modify the code of gpuowl to do LL tests. I've already made an LL version from a relatively new version of gpuowl (not the newest, and without the Jacobi check[/QUOTE]
Mihai showed a way to add the Jacobi check back at v0.6: "Addition of Jacobi check to LL flavor of gpuOwL", [URL="http://www.mersenneforum.org/showpost.php?p=465145&postcount=46"]http://www.mersenneforum.org/showpost.php?p=465145&postcount=46[/URL] (2017-08-08)
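As background on what the Jacobi check tests, here is a small Python sketch. It is not the code from the linked post. It relies on the property that for LL residues s_i with i >= 1, the Jacobi symbol of (s_i - 2) with respect to M_p is -1, so a residue that fails this has been corrupted. The check is usually described as catching about half of random errors, not all of them. The names `jacobi`, `ll_with_jacobi_check` and `check_every` are illustrative, and a real program would roll back to the last good savefile instead of raising an exception.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, by the standard binary algorithm."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:          # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def ll_with_jacobi_check(p: int, check_every: int = 4) -> int:
    """Lucas-Lehmer iteration that validates the residue every check_every steps."""
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):      # iterations 1 .. p-2
        s = (s * s - 2) % m
        if i % check_every == 0 and jacobi(s - 2, m) != -1:
            raise ArithmeticError(f"Jacobi check failed at iteration {i}")
    return s


if __name__ == "__main__":
    print(ll_with_jacobi_check(13, check_every=1))
```

Because the symbol costs far less to compute than the squarings between checks, the test can run occasionally on saved residues with little overhead.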


All times are UTC.
