#1838 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1CC6₁₆ Posts |
All 0 residues in P-1 stage 1, and it marches blindly on.
Worktodo line:
Code:
B1=1040000,B2=28080000;PFactor=0,1,2,99998441,-1,77,2
Code:
-device 0 -user kriesel -cpu condorella/rx480 -use NO_ASM,UNROLL_HEIGHT,UNROLL_WIDTH,MERGED_MIDDLE,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,CARRY32,MORE_SQUARES_MIDDLEMUL1,CHEBYSHEV_MIDDLEMUL2,NEW_SLOWTRIG
Code:
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-134-g1e0ce1d\rx480>gpuowl-win
2020-02-11 16:35:15 gpuowl v6.11-134-g1e0ce1d
2020-02-11 16:35:15 config: -device 0 -user kriesel -cpu condorella/rx480 -use NO_ASM,UNROLL_HEIGHT,UNROLL_WIDTH,MERGED_MIDDLE,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,CARRY32,MORE_SQUARES_MIDDLEMUL1,CHEBYSHEV_MIDDLEMUL2,NEW_SLOWTRIG
2020-02-11 16:35:15 config:
2020-02-11 16:35:15 config: 4.5m fft NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1,CARRY32,CHEBYSHEV_METHOD_FMA,CHEBYSHEV_MIDDLEMUL2,LESS_ACCURATE
2020-02-11 16:35:15 config: :5m fft NO_ASM,UNROLL_HEIGHT,UNROLL_WIDTH,MERGED_MIDDLE,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,CARRY32,MORE_SQUARES_MIDDLEMUL1,CHEBYSHEV_MIDDLEMUL2,NEW_SLOWTRIG
2020-02-11 16:35:15 device 0, unique id ''
2020-02-11 16:35:15 condorella/rx480 99998441 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.34 bits/word
2020-02-11 16:35:17 condorella/rx480 OpenCL args "-DEXP=99998441u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xc.a5a9d5baf7a18p-3 -DIWEIGHT_STEP=0xa.1ef1eeb123f08p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DAMDGPU=1 -DCARRY32=1 -DCHEBYSHEV_MIDDLEMUL2=1 -DMERGED_MIDDLE=1 -DMORE_SQUARES_MIDDLEMUL1=1 -DNEW_SLOWTRIG=1 -DNO_ASM=1 -DT2_SHUFFLE_HEIGHT=1 -DT2_SHUFFLE_WIDTH=1 -DUNROLL_HEIGHT=1 -DUNROLL_WIDTH=1 -DWORKINGIN1=1 -DWORKINGOUT1=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-11 16:35:20 condorella/rx480 OpenCL compilation in 3.11 s
2020-02-11 16:35:20 condorella/rx480 99998441 P1 B1=1040000, B2=28080000; 1500153 bits; starting at 0
2020-02-11 16:35:58 condorella/rx480 99998441 P1 10000 0.67%; 3761 us/it; ETA 0d 01:33; 0000000000000000
2020-02-11 16:36:36 condorella/rx480 99998441 P1 20000 1.33%; 3768 us/it; ETA 0d 01:33; 0000000000000000
2020-02-11 16:37:13 condorella/rx480 99998441 P1 30000 2.00%; 3767 us/it; ETA 0d 01:32; 0000000000000000
2020-02-11 16:37:51 condorella/rx480 99998441 P1 40000 2.67%; 3769 us/it; ETA 0d 01:32; 0000000000000000
2020-02-11 16:38:29 condorella/rx480 99998441 P1 50000 3.33%; 3767 us/it; ETA 0d 01:31; 0000000000000000
2020-02-11 16:39:06 condorella/rx480 99998441 P1 60000 4.00%; 3771 us/it; ETA 0d 01:31; 0000000000000000
2020-02-11 16:39:44 condorella/rx480 99998441 P1 70000 4.67%; 3764 us/it; ETA 0d 01:30; 0000000000000000
2020-02-11 16:40:21 condorella/rx480 saved
2020-02-11 16:40:22 condorella/rx480 99998441 P1 80000 5.33%; 3781 us/it; ETA 0d 01:30; 0000000000000000
2020-02-11 16:41:00 condorella/rx480 99998441 P1 90000 6.00%; 3768 us/it; ETA 0d 01:29; 0000000000000000
2020-02-11 16:41:37 condorella/rx480 99998441 P1 100000 6.67%; 3768 us/it; ETA 0d 01:28; 0000000000000000
2020-02-11 16:42:15 condorella/rx480 99998441 P1 110000 7.33%; 3765 us/it; ETA 0d 01:27; 0000000000000000
2020-02-11 16:42:53 condorella/rx480 99998441 P1 120000 8.00%; 3768 us/it; ETA 0d 01:27; 0000000000000000
2020-02-11 16:43:30 condorella/rx480 99998441 P1 130000 8.67%; 3763 us/it; ETA 0d 01:26; 0000000000000000
2020-02-11 16:44:08 condorella/rx480 99998441 P1 140000 9.33%; 3770 us/it; ETA 0d 01:25; 0000000000000000
2020-02-11 16:44:46 condorella/rx480 99998441 P1 150000 10.00%; 3766 us/it; ETA 0d 01:25; 0000000000000000
2020-02-11 16:45:21 condorella/rx480 saved
2020-02-11 16:45:23 condorella/rx480 99998441 P1 160000 10.67%; 3784 us/it; ETA 0d 01:25; 0000000000000000
2020-02-11 16:46:01 condorella/rx480 99998441 P1 170000 11.33%; 3769 us/it; ETA 0d 01:24; 0000000000000000
2020-02-11 16:46:39 condorella/rx480 99998441 P1 180000 12.00%; 3765 us/it; ETA 0d 01:23; 0000000000000000
2020-02-11 16:47:16 condorella/rx480 99998441 P1 190000 12.67%; 3771 us/it; ETA 0d 01:22; 0000000000000000
2020-02-11 16:47:54 condorella/rx480 99998441 P1 200000 13.33%; 3767 us/it; ETA 0d 01:22; 0000000000000000
2020-02-11 16:48:32 condorella/rx480 99998441 P1 210000 14.00%; 3769 us/it; ETA 0d 01:21; 0000000000000000
2020-02-11 16:49:10 condorella/rx480 99998441 P1 220000 14.67%; 3768 us/it; ETA 0d 01:20; 0000000000000000
2020-02-11 16:49:47 condorella/rx480 99998441 P1 230000 15.33%; 3769 us/it; ETA 0d 01:20; 0000000000000000
2020-02-11 16:50:22 condorella/rx480 saved
2020-02-11 16:50:25 condorella/rx480 99998441 P1 240000 16.00%; 3779 us/it; ETA 0d 01:19; 0000000000000000
2020-02-11 16:51:03 condorella/rx480 99998441 P1 250000 16.66%; 3764 us/it; ETA 0d 01:18; 0000000000000000
2020-02-11 16:51:40 condorella/rx480 99998441 P1 260000 17.33%; 3767 us/it; ETA 0d 01:18; 0000000000000000
2020-02-11 16:52:18 condorella/rx480 99998441 P1 270000 18.00%; 3766 us/it; ETA 0d 01:17; 0000000000000000
2020-02-11 16:52:56 condorella/rx480 99998441 P1 280000 18.66%; 3770 us/it; ETA 0d 01:17; 0000000000000000
2020-02-11 16:53:33 condorella/rx480 99998441 P1 290000 19.33%; 3769 us/it; ETA 0d 01:16; 0000000000000000
2020-02-11 16:54:11 condorella/rx480 99998441 P1 300000 20.00%; 3769 us/it; ETA 0d 01:15; 0000000000000000
2020-02-11 16:54:49 condorella/rx480 99998441 P1 310000 20.66%; 3764 us/it; ETA 0d 01:15; 0000000000000000
2020-02-11 16:55:22 condorella/rx480 saved
2020-02-11 16:55:27 condorella/rx480 99998441 P1 320000 21.33%; 3780 us/it; ETA 0d 01:14; 0000000000000000
2020-02-11 16:56:04 condorella/rx480 99998441 P1 330000 22.00%; 3765 us/it; ETA 0d 01:13; 0000000000000000
2020-02-11 16:56:42 condorella/rx480 99998441 P1 340000 22.66%; 3759 us/it; ETA 0d 01:13; 0000000000000000
2020-02-11 16:57:19 condorella/rx480 99998441 P1 350000 23.33%; 3770 us/it; ETA 0d 01:12; 0000000000000000
2020-02-11 16:57:57 condorella/rx480 99998441 P1 360000 24.00%; 3767 us/it; ETA 0d 01:12; 0000000000000000
2020-02-11 16:58:35 condorella/rx480 99998441 P1 370000 24.66%; 3767 us/it; ETA 0d 01:11; 0000000000000000
Code:
C:\msys64\home\ken\gpuowl-compile\gpuowl-v6.11-134-g1e0ce1d\rx480>gpuowl-win
2020-02-11 17:07:30 gpuowl v6.11-134-g1e0ce1d
2020-02-11 17:07:30 config: -device 0 -user kriesel -cpu condorella/rx480 -use NO_ASM
2020-02-11 17:07:30 config:
2020-02-11 17:07:30 config: :,UNROLL_HEIGHT,UNROLL_WIDTH,MERGED_MIDDLE,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,CARRY32,MORE_SQUARES_MIDDLEMUL1,CHEBYSHEV_MIDDLEMUL2,NEW_SLOWTRIG
2020-02-11 17:07:30 config:
2020-02-11 17:07:30 config: :4.5m fft NO_ASM,MERGED_MIDDLE,WORKINGIN5,WORKINGOUT1,T2_SHUFFLE_WIDTH,T2_SHUFFLE_HEIGHT,UNROLL_MIDDLEMUL2,UNROLL_MIDDLEMUL1,CARRY32,CHEBYSHEV_METHOD_FMA,CHEBYSHEV_MIDDLEMUL2,LESS_ACCURATE
2020-02-11 17:07:30 config: :5m fft NO_ASM,UNROLL_HEIGHT,UNROLL_WIDTH,MERGED_MIDDLE,WORKINGIN1,WORKINGOUT1,T2_SHUFFLE_HEIGHT,T2_SHUFFLE_WIDTH,CARRY32,MORE_SQUARES_MIDDLEMUL1,CHEBYSHEV_MIDDLEMUL2,NEW_SLOWTRIG
2020-02-11 17:07:30 device 0, unique id ''
2020-02-11 17:07:30 condorella/rx480 99998441 FFT 5632K: Width 256x4, Height 64x4, Middle 11; 17.34 bits/word
2020-02-11 17:07:32 condorella/rx480 OpenCL args "-DEXP=99998441u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=11u -DWEIGHT_STEP=0xc.a5a9d5baf7a18p-3 -DIWEIGHT_STEP=0xa.1ef1eeb123f08p-4 -DWEIGHT_BIGSTEP=0xd.744fccad69d68p-3 -DIWEIGHT_BIGSTEP=0x9.837f0518db8a8p-4 -DAMDGPU=1 -DNO_ASM=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-11 17:07:35 condorella/rx480 OpenCL compilation in 3.20 s
2020-02-11 17:07:36 condorella/rx480 99998441 P1 B1=1040000, B2=28080000; 1500153 bits; starting at 0
2020-02-11 17:08:14 condorella/rx480 99998441 P1 10000 0.67%; 3887 us/it; ETA 0d 01:37; 5412ff3dd7337b62
2020-02-11 17:08:53 condorella/rx480 99998441 P1 20000 1.33%; 3897 us/it; ETA 0d 01:36; 67401cf04590fe9e
Last fiddled with by kriesel on 2020-02-11 at 23:28 |
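Since the program itself marches on, an external check on the log can catch this early. Below is a minimal sketch, assuming the progress-line layout shown in the logs above (timestamp, host, exponent, `P1`, iteration, percent, timing, ETA, 16 hex digits of residue). The regex and the function name `zero_residue_iterations` are illustrative only and are not part of gpuowl.

```python
import re

# Matches P-1 stage 1 progress lines as they appear in the logs above, e.g.
# "2020-02-11 16:35:58 condorella/rx480 99998441 P1 10000 0.67%; 3761 us/it; ETA 0d 01:33; 0000000000000000"
# The layout is taken from this post; other gpuowl versions may print differently.
PROGRESS = re.compile(
    r"^\S+ \S+ \S+ (?P<exponent>\d+) P1\s+(?P<iteration>\d+)\s+[\d.]+%;"
    r".*;\s*(?P<res64>[0-9a-f]{16})\s*$"
)


def zero_residue_iterations(lines):
    """Return (exponent, iteration) for each progress line whose residue is all zeros."""
    hits = []
    for line in lines:
        match = PROGRESS.match(line.strip())
        if match and int(match.group("res64"), 16) == 0:
            hits.append((int(match.group("exponent")), int(match.group("iteration"))))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_log.py < gpuowl.log
    for exponent, iteration in zero_residue_iterations(sys.stdin):
        print(f"zero residue: exponent {exponent} at iteration {iteration}")
```

Header lines such as `P1 B1=1040000, ...` and the `saved` lines do not match the pattern and are skipped, so only real progress lines are examined.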
#1839 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·29·127 Posts |
The latest gpuowl commit still misses the known factor at 15M (M15000031). This run was on Google Colaboratory, which should be a very reliable GPU and underlying system.
Code:
{"exponent":"2000081", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:35:23 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":131072, "B1":15015, "factors":["2700109974025273"]}
{"exponent":"4444091", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:35:47 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":229376, "B1":15015, "factors":["1809798096458971047321927127"]}
{"exponent":"10000831", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:40:05 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":524288, "B1":120000, "B2":2200000, "factors":["646560662529991467527"]}
{"exponent":"15000031", "worktype":"PM1", "status":"NF", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:54:51 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":786432, "B1":180000, "B2":3780000}
{"exponent":"18000137", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:55:29 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":1048576, "B1":15015, "factors":["2479169845866581244380961527"]}
{"exponent":"19000013", "worktype":"PM1", "status":"F", "program":{"name":"gpuowl", "version":"v6.11-145-g6146b6d-dirty"}, "timestamp":"2020-02-12 20:56:14 UTC", "user":"kriesel", "computer":"colab3/TeslaP4", "fft-length":1048576, "B1":15015, "factors":["4674003199"]}
Code:
CUDAPm1 v0.20
------- DEVICE 0 -------
name GeForce GTX 1080
Compatibility 6.1
clockRate (MHz) 1797
memClockRate (MHz) 5005
totalGlobalMem zu
totalConstMem zu
l2CacheSize 2097152
sharedMemPerBlock zu
regsPerBlock 65536
warpSize 32
memPitch zu
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 20
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment zu
deviceOverlap 1
CUDA reports 7968M of 8192M GPU memory free.
Using threads: norm1 256, mult 256, norm2 1024.
No stage 2 checkpoint.
Using up to 4992M GPU memory.
Selected B1=2360000, B2=59590000, 3.83% chance of finding a factor
Using B1 = 2360000 from savefile.
Continuing stage 2 from a partial result of M282000073 fft length = 16384K
batch wrapper reports (re)launch at Wed 02/12/2020 15:13:40.06 reset count 0 of max 3
CUDAPm1 v0.20
------- DEVICE 0 -------
name GeForce GTX 1080
Compatibility 6.1
clockRate (MHz) 1797
memClockRate (MHz) 5005
totalGlobalMem zu
totalConstMem zu
l2CacheSize 2097152
sharedMemPerBlock zu
regsPerBlock 65536
warpSize 32
memPitch zu
maxThreadsPerBlock 1024
maxThreadsPerMP 2048
multiProcessorCount 20
maxThreadsDim[3] 1024,1024,64
maxGridSize[3] 2147483647,65535,65535
textureAlignment zu
deviceOverlap 1
CUDA reports 7968M of 8192M GPU memory free.
Index 25
Using threads: norm1 256, mult 32, norm2 64.
Using up to 4137M GPU memory.
Selected B1=275000, B2=8112500, 5.53% chance of finding a factor
Starting stage 1 P-1, M15000031, B1 = 275000, B2 = 8112500, fft length = 800K
Doing 396818 iterations
Iteration 100000 M15000031, 0x7a8e085ca931e223, n = 800K, CUDAPm1 v0.20 err = 0.14453 (1:19 real, 0.7873 ms/iter, ETA 3:53)
Iteration 200000 M15000031, 0xad072f2e5fc4eb76, n = 800K, CUDAPm1 v0.20 err = 0.14844 (1:19 real, 0.7901 ms/iter, ETA 2:35)
Iteration 300000 M15000031, 0x82162c462572c64d, n = 800K, CUDAPm1 v0.20 err = 0.14063 (1:19 real, 0.7920 ms/iter, ETA 1:16)
M15000031, 0xe80933bd37a9f9a9, n = 800K, CUDAPm1 v0.20
Stage 1 complete, estimated total time = 5:14
Starting stage 1 gcd.
M15000031 Stage 1 found no factor (P-1, B1=275000, B2=8112500, e=0, n=800K CUDAPm1 v0.20)
Starting stage 2.
Using b1 = 275000, b2 = 8112500, d = 2310, e = 12, nrp = 480
Zeros: 348644, Ones: 429436, Pairs: 93347
Processing 1 - 480 of 480 relative primes.
Inititalizing pass... done. transforms: 31221, err = 0.14453, (13.64 real, 0.4369 ms/tran, ETA NA)
Transforms: 229552 M15000031, 0x6760f107920d3922, n = 800K, CUDAPm1 v0.20 err = 0.14844 (1:35 real, 0.4140 ms/tran, ETA 4:36)
Transforms: 218518 M15000031, 0x45ab02ca00c98138, n = 800K, CUDAPm1 v0.20 err = 0.14063 (1:31 real, 0.4142 ms/tran, ETA 3:06)
Transforms: 214190 M15000031, 0x1f7fed9dfd61de18, n = 800K, CUDAPm1 v0.20 err = 0.14844 (1:28 real, 0.4145 ms/tran, ETA 1:37)
Transforms: 235492 M15000031, 0xbff7fea3340e621f, n = 800K, CUDAPm1 v0.20 err = 0.15625 (1:38 real, 0.4160 ms/tran, ETA 0:00)
Stage 2 complete, 928973 transforms, estimated total time = 6:25
Starting stage 2 gcd.
M15000031 has a factor: 1178543237739460982839 (P-1, B1=275000, B2=8112500, e=12, n=800K CUDAPm1 v0.20)
Code:
[Wed Feb 12 15:47:41 2020]
P-1 found a factor in stage #2, B1=255000, B2=5737500, E=12.
UID: Kriesel/peregrine, M15000031 has a factor: 1178543237739460982839 (P-1, B1=255000, B2=5737500, E=12) |
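For comparing runs like these, a small checker can verify each reported factor independently and list the exponents that came back "NF". This is a minimal sketch, assuming one JSON object per line with the `exponent`, `status` and `factors` fields shown in the gpuowl results above. The helper names `divides_mersenne` and `check_results` are made up for illustration.

```python
import json


def divides_mersenne(p, f):
    """True if f divides 2**p - 1, i.e. 2**p is congruent to 1 modulo f."""
    return pow(2, p, f) == 1


def check_results(json_lines):
    """Split result lines into exponents with no factor found and bad factor claims.

    Returns (no_factor, bad) where no_factor is a list of exponents reported
    with status "NF" and bad is a list of (exponent, factor) pairs that do
    not actually divide 2**exponent - 1.
    """
    no_factor, bad = [], []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        result = json.loads(line)
        p = int(result["exponent"])
        if result["status"] == "NF":
            no_factor.append(p)
        for factor in result.get("factors", []):
            if not divides_mersenne(p, int(factor)):
                bad.append((p, int(factor)))
    return no_factor, bad


if __name__ == "__main__":
    # The factor CUDAPm1 and prime95 report for M15000031 in this post.
    print(divides_mersenne(15000031, 1178543237739460982839))
```

An exponent that shows up in `no_factor` here but has a factor that passes `divides_mersenne` elsewhere, as M15000031 does, is a missed factor rather than a wrong one.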
#1840 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2·29·127 Posts |
Here it is, tested only as far as the help output.
#1841 |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1CC6₁₆ Posts |
P-1 ran on exponents below 20M with -maxAlloc 7500, but it crashes on 20M.
Code:
2020-02-12 22:15:08 gpuowl v6.11-145-g6146b6d-dirty
2020-02-12 22:15:09 config: -user kriesel -cpu colab3/TeslaP4 -yield -maxAlloc 7000 -use NO_ASM
2020-02-12 22:15:09 device 0, unique id ''
2020-02-12 22:15:09 colab3/TeslaP4 20000023 FFT 1152K: Width 8x8, Height 256x4, Middle 9; 16.95 bits/word
2020-02-12 22:15:09 colab3/TeslaP4 OpenCL args "-DEXP=20000023u -DWIDTH=64u -DSMALL_HEIGHT=1024u -DMIDDLE=9u -DWEIGHT_STEP=0x1.0840814dcafb8p+0 -DIWEIGHT_STEP=0x1.f002ed51e880ap-1 -DWEIGHT_BIGSTEP=0x1.172b83c7d517bp+0 -DIWEIGHT_BIGSTEP=0x1.d5818dcfba487p-1 -DNO_ASM=1 -I. -cl-fast-relaxed-math -cl-std=CL2.0"
2020-02-12 22:15:09 colab3/TeslaP4
2020-02-12 22:15:09 colab3/TeslaP4 OpenCL compilation in 0.01 s
2020-02-12 22:15:09 colab3/TeslaP4 20000023 P1 B1=240000, B2=5760000; 346123 bits; starting at 346122
2020-02-12 22:15:09 colab3/TeslaP4 20000023 P1 346123 100.00%; 43626 us/it; ETA 0d 00:00; 6a2e08e14df5900e
2020-02-12 22:15:09 colab3/TeslaP4 P-1 (B1=240000, B2=5760000, D=30030): primes 376241, expanded 390008, doubles 69210 (left 241927), singles 237821, total 307031 (82%)
2020-02-12 22:15:09 colab3/TeslaP4 20000023 P2 using blocks [8 - 192] to cover 307031 primes
2020-02-12 22:15:09 colab3/TeslaP4 20000023 P2 using 759 buffers of 9.0 MB each
2020-02-12 22:15:21 colab3/TeslaP4 Exception gpu_error: MEM_OBJECT_ALLOCATION_FAILURE clEnqueueCopyBuffer(queue, src, dst, 0, 0, size, 0, NULL, NULL) at clwrap.cpp:344 copyBuf
2020-02-12 22:15:21 colab3/TeslaP4 Bye |
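The buffer numbers in that log can be sanity-checked with a little arithmetic. This is a rough sketch, assuming each stage 2 buffer holds one 8-byte double per FFT word; that assumption is consistent with the "9.0 MB each" reported for the 1152K FFT, but it is inferred from the log rather than taken from the gpuowl source. The function names are illustrative.

```python
MIB = 1024 * 1024


def buffer_mib(fft_words, bytes_per_word=8):
    """Size of one buffer in MiB, assuming one double (8 bytes) per FFT word."""
    return fft_words * bytes_per_word / MIB


def stage2_footprint_mib(fft_words, n_buffers):
    """Total MiB taken by the stage 2 buffers alone (no other allocations counted)."""
    return n_buffers * buffer_mib(fft_words)


if __name__ == "__main__":
    fft_words = 1152 * 1024  # "FFT 1152K" from the log
    n_buffers = 759          # "P2 using 759 buffers" from the log
    max_alloc = 7000         # "-maxAlloc 7000" from the config line
    total = stage2_footprint_mib(fft_words, n_buffers)
    print(f"{buffer_mib(fft_words):.1f} MiB per buffer, {total:.0f} MiB total, "
          f"{max_alloc - total:.0f} MiB below -maxAlloc")
```

Under that assumption the buffers alone come to 6831 MiB, which is below the requested limit, so the allocation failure is not simply a case of asking for more than -maxAlloc allows.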
#1842 | |
"Mihai Preda"
Apr 2015
5·17² Posts |
Try with a smaller -maxAlloc. Can you check the free memory on the GPU -- how much is free before and during the gpuowl run?
#1843 |
"Mihai Preda"
Apr 2015
5A5₁₆ Posts |
Did you try with -use ORIG_SLOWTRIG?
Last fiddled with by preda on 2020-02-13 at 07:34 |
#1844 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
2×29×127 Posts |
T4 | 0/15079 MiB |
P100 | 0/16280 MiB |
P4 | ? |
K80 | ? |
Getting those last two is a matter of waiting to hit them in the Colab GPU model lottery. Models are listed in probability order, most frequent recently first. I've added logging of idle and active nvidia-smi output to Google Drive files to the Colab script. Colab "screen" output to the browser is lost when a new session is launched, the page is closed, or the data scrolls out of the 5000-line buffer. Based on recent experience with my first Colab accounts, it could take ~5 weeks to get all 4 models allocated. Perhaps it will be quicker on this newer account.
Last fiddled with by kriesel on 2020-02-13 at 09:21 |
#1845 |
"Mihai Preda"
Apr 2015
5A5₁₆ Posts |
OK, I checked myself: the problem with M15000031 is that by default it gets too small an FFT size for P-1 (it's at the border). If the FFT size is manually increased, the factor is found. I'll keep an eye on improving the default FFT size.
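To see how close to the border that is, compare bits per FFT word, which is just the exponent divided by the FFT length in words (the same figure gpuowl prints as "bits/word"). A minimal sketch follows, taking the FFT lengths from the logs earlier in the thread and assuming CUDAPm1's "800K" means 800 × 1024 words.

```python
K = 1024


def bits_per_word(exponent, fft_words):
    """Average number of bits of the exponent carried by each FFT word."""
    return exponent / fft_words


if __name__ == "__main__":
    runs = [
        # (label, exponent, FFT length in words), all taken from logs in this thread
        ("gpuowl  M15000031, fft-length 786432 (missed the factor)", 15000031, 786432),
        ("CUDAPm1 M15000031, 800K (found the factor)", 15000031, 800 * K),
        ("gpuowl  M99998441, 5632K", 99998441, 5632 * K),
        ("gpuowl  M20000023, 1152K", 20000023, 1152 * K),
    ]
    for label, exponent, words in runs:
        print(f"{bits_per_word(exponent, words):6.2f} bits/word  {label}")
```

The 786432-word FFT carries noticeably more bits per word for M15000031 than the other runs listed, which fits the "at the border" description.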
#1846 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
1CC6₁₆ Posts |
GPU model | Idle | Active |
T4 | 0/15079 MiB | 5939 gpuowl |
P100 | 0/16280 MiB | 293 mfaktc |
P4 | ? | ? |
K80 | ? | ? |
Getting those last two is a matter of waiting to hit them in the Colab GPU model lottery. Models are listed in probability order, most frequent recently first. I've added logging of idle and active nvidia-smi output to Google Drive files to the Colab script. Colab "screen" output to the browser is lost when a new session is launched, the page is closed, or the data scrolls out of the 5000-line buffer. Based on recent experience with my first Colab accounts, it could take ~5 weeks to get all 4 models allocated. Perhaps it will be quicker on this newer account.
(Moderator please delete my previous similar post and this line; this post replaces the previous post.)
Last fiddled with by kriesel on 2020-02-13 at 09:30 |
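The kind of logging described could look something like the following. This is only a sketch and not the actual Colab script: the function names and the log path are hypothetical, and it assumes `nvidia-smi` is on the PATH and supports the `--query-gpu` and `--format=csv,noheader,nounits` options.

```python
import subprocess
from datetime import datetime, timezone

# Ask nvidia-smi for one CSV row per GPU: model name, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_row(row):
    """Parse a row such as 'Tesla T4, 5939, 15079' into (name, used_mib, total_mib)."""
    name, used, total = [field.strip() for field in row.split(",")]
    return name, int(used), int(total)


def log_gpu_memory(path, tag):
    """Append one timestamped line per GPU to a log file; tag is e.g. 'idle' or 'active'."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(path, "a") as log:
        for row in output.strip().splitlines():
            name, used, total = parse_row(row)
            log.write(f"{stamp} {tag} {name} {used}/{total} MiB\n")
```

Calling `log_gpu_memory(path, "idle")` before launching gpuowl and `log_gpu_memory(path, "active")` while it runs, with `path` pointing at a file on the mounted Google Drive, would keep the figures after the browser output is gone.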
#1847 | |
"Mihai Preda"
Apr 2015
5·17² Posts |
Ken, I understand it's not easy to get all this information, and maybe it's not even needed. The situation is that I'm not yet convinced that there is a problem with the way GpuOwl handles maxAlloc or buffer allocation, because I can imagine an alternative explanation for the observed behavior: maybe the GPU, even if it is reporting 8GB, does not have all of that actually available. Maybe it has less than 7.5GB actually available to be allocated in contiguous blocks of 9MB (for some reason). Thus, GpuOwl will fail if run with maxAlloc 7.5G, but that's not necessarily a bug in the program.
(All that because OpenCL does not offer a normal/reliable way to query actual free GPU memory.)
Last fiddled with by preda on 2020-02-13 at 10:24 |
#1848 | |
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest
16306₈ Posts |
I have two Colab accounts "instrumented" now to catch idle and active nvidia-smi output, which includes allocated and total MiB of GPU RAM, provided the script additions are not too buggy. I agree that it seems unlikely it's a maxAlloc problem at the 20M test exponent; a smaller exponent succeeded on the P4 with 7272MB allocated and -maxAlloc 7500, while the 20M failed with maxAlloc set at 7500, 7300, and 7000. But we'll see.
FYI: on a gtx1080, with gpuowl v6.11-134 in P-1 stage 2 on a 443M exponent using 18 buffers x 224MB, nvidia-smi shows 7527/8192MiB active, vs 107 idle.
Last fiddled with by kriesel on 2020-02-13 at 11:07 |