|
|
#2630 |
|
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
2²×23 Posts |
p102-100.
- core count is 3200 vs 3584.
Code:
.\gpuowl-win.exe -device 1 -time -prp 57885161
2020-12-17 11:29:09 GpuOwl VERSION v7.2-21-g28dbf88
2020-12-17 11:29:14 P102-100-1 57885161 OK 800 0.00% 5727fe6a7225c273 2026 us/it + check 0.85s + save 0.07s; ETA 1d 08:35
2020-12-17 11:29:33 P102-100-1 57885161 10000 0.02% 91565f36715e33e3 2031 us/it
2020-12-17 11:29:54 P102-100-1 57885161 20000 0.03% f2c610087d02c3ea 2043 us/it
2020-12-17 11:30:14 P102-100-1 57885161 30000 0.05% fe1565094c7f7b47 2057 us/it
2020-12-17 11:30:19 P102-100-1 57885161 Stopping, please wait..
2020-12-17 11:30:20 P102-100-1 57885161 OK 32400 0.06% ce572a2ae80045f5 2057 us/it + check 0.86s + save 0.08s; ETA 1d 09:04
2020-12-17 11:30:20 P102-100-1 57885161 38.61% carryFused : 784 us/call x 31197 calls
2020-12-17 11:30:20 P102-100-1 57885161 34.01% tailFusedSquare : 691 us/call x 31200 calls
2020-12-17 11:30:20 P102-100-1 57885161 13.61% fftMiddleOut : 276 us/call x 31277 calls
2020-12-17 11:30:20 P102-100-1 57885161 13.54% fftMiddleIn : 274 us/call x 31281 calls
2020-12-17 11:30:20 P102-100-1 57885161 0.12% tailFusedMul : 970 us/call x 77 calls
2020-12-17 11:30:20 P102-100-1 57885161 0.05% fftP : 386 us/call x 84 calls
2020-12-17 11:30:20 P102-100-1 57885161 0.04% fftW : 340 us/call x 80 calls
2020-12-17 11:30:20 P102-100-1 57885161 0.01% carryA : 110 us/call x 80 calls
Last fiddled with by Ethan (EO) on 2020-12-17 at 19:53 |
|
|
|
|
|
#2631 | |
|
Jul 2009
Germany
607 Posts |
Quote:
|
|
|
|
|
|
|
#2632 | |
|
"Ethan O'Connor"
Oct 2002
GIMPS since Jan 1996
92₁₀ Posts |
Quote:
Code:
2020-12-17 12:16:51 P102-100-1 77936867 10000 0.01% fc4f135f7cf4ad29 2718 us/it
2020-12-17 12:17:18 P102-100-1 77936867 20000 0.03% 3cd1bd9d5e09cbc5 2737 us/it
2020-12-17 12:17:46 P102-100-1 77936867 30000 0.04% c4e0ff35e3290d98 2736 us/it
2736/2473 = 1.11
3584/3200 = 1.12
352/320 = 1.1 |
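The three ratios above can be rechecked in a few lines of Python. This is only a sketch: the names are illustrative, 2473 us/it is assumed to be the timing of the 3584-core card from the quoted post (not shown here), and the post gives 352/320 without units, so it is left unlabeled.

```python
# Ratios quoted in the post above; labels are illustrative, not from gpuowl.
pairs = {
    "us/it (2736 vs 2473)": (2736, 2473),
    "core count (3584 vs 3200)": (3584, 3200),
    "352 vs 320 (units not stated in the post)": (352, 320),
}


def ratio(a: int, b: int) -> float:
    """Return a/b rounded to two decimals, as in the post."""
    return round(a / b, 2)


for label, (a, b) in pairs.items():
    print(f"{label}: {ratio(a, b):.2f}")
```

All three land close to 1.1, which seems to be the point of the post: the per-iteration time tracks the core-count difference fairly closely.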
|
|
|
|
|
|
#2633 |
|
Jul 2009
Germany
607 Posts |
Thx.
Here is another benchmark value for James Heinrich for the Nvidia Tesla K80 (best value of 5 different cards) with gpuowl 6.11.380 on Ubuntu.
Code:
2020-12-17 20:15:57 config: -carry short -use CARRY32,ORIG_SLOWTRIG,IN_WG=128,IN_SIZEX=16,IN_SPACING=4,OUT_WG=128,OUT_SIZEX=16,OUT_SPACING=4 -nospin -block 100 -maxAlloc 10000 -B1 750000 -rB2 20 -prp 57885161
2020-12-17 20:15:57 device 0, unique id ''
2020-12-17 20:15:57 Tesla K80-0 57885161 FFT: 3M 1K:6:256 (18.40 bpw)
2020-12-17 20:15:57 Tesla K80-0 Expected maximum carry32: 42500000
2020-12-17 20:15:58 Tesla K80-0 OpenCL args "-DEXP=57885161u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=6u -DPM1=0 -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.07673850f37p-1 -DIWEIGHT_STEP_MINUS_1=-0x1.5bd9e39e14a3dp-2 -DCARRY32=1 -DIN_SIZEX=16 -DIN_SPACING=4 -DIN_WG=128 -DORIG_SLOWTRIG=1 -DOUT_SIZEX=16 -DOUT_SPACING=4 -DOUT_WG=128 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-12-17 20:17:10 Note: not found 'config.txt'
2020-12-17 20:17:10 config: -carry short -use CARRY32,ORIG_SLOWTRIG,IN_WG=128,IN_SIZEX=16,IN_SPACING=4,OUT_WG=128,OUT_SIZEX=16,OUT_SPACING=4 -nospin -block 100 -maxAlloc 10000 -B1 750000 -rB2 20 -prp 57885161
2020-12-17 20:17:10 device 0, unique id ''
2020-12-17 20:17:10 Tesla K80-0 57885161 FFT: 3M 1K:6:256 (18.40 bpw)
2020-12-17 20:17:10 Tesla K80-0 Expected maximum carry32: 42500000
2020-12-17 20:17:10 Tesla K80-0 OpenCL args "-DEXP=57885161u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=6u -DPM1=0 -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.07673850f37p-1 -DIWEIGHT_STEP_MINUS_1=-0x1.5bd9e39e14a3dp-2 -DCARRY32=1 -DIN_SIZEX=16 -DIN_SPACING=4 -DIN_WG=128 -DORIG_SLOWTRIG=1 -DOUT_SIZEX=16 -DOUT_SPACING=4 -DOUT_WG=128 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-12-17 20:17:13 Tesla K80-0
2020-12-17 20:17:13 Tesla K80-0 OpenCL compilation in 2.66 s
2020-12-17 20:17:14 Tesla K80-0 57885161 OK 0 loaded: blockSize 100, 0000000000000003
2020-12-17 20:17:14 Tesla K80-0 validating proof residues for power 8
2020-12-17 20:17:14 Tesla K80-0 Proof using power 8
2020-12-17 20:17:14 Tesla K80-0 57885161 OK 200 0.00%; 2099 us/it; ETA 1d 09:45; 08e8268acbd436a3 (check 0.31s)
2020-12-17 20:24:02 Tesla K80-0 57885161 OK 200000 0.35%; 2038 us/it; ETA 1d 08:39; de62d6db1ad5092d (check 0.30s)
2020-12-17 20:30:51 Tesla K80-0 57885161 OK 400000 0.69%; 2043 us/it; ETA 1d 08:38; 45e043b36f3556e1 (check 0.31s)
2020-12-17 20:32:04 Tesla K80-0 Stopping, please wait..
2020-12-17 20:32:04 Tesla K80-0 57885161 OK 435800 0.75%; 2042 us/it; ETA 1d 08:35; 22c6f0efd61bd8b9 (check 0.31s)
2020-12-17 20:32:04 Tesla K80-0 Exiting because "stop requested"
2020-12-17 20:32:04 Tesla K80-0 Bye
Last fiddled with by moebius on 2020-12-17 at 21:26 |
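The ETA column in these logs looks like nothing more than the remaining iterations multiplied by the current us/it. A minimal sketch, assuming a PRP test of exponent p needs about p iterations; the function name `eta` is made up for this example and is not part of gpuowl:

```python
def eta(exponent: int, iteration: int, us_per_it: int) -> str:
    """Estimate remaining time as 'Nd HH:MM', assuming ~exponent iterations in total."""
    remaining_us = (exponent - iteration) * us_per_it
    total_minutes = remaining_us // 60_000_000  # whole minutes, rounded down
    days, rem = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rem, 60)
    return f"{days}d {hours:02d}:{minutes:02d}"


# Last checkpoint of the K80 run: iteration 435800 at 2042 us/it.
print(eta(57885161, 435800, 2042))  # 1d 08:35
```

This reproduces the K80 figures. Some other lines in the thread come out a minute off with this formula, presumably because the log prints us/it rounded to whole microseconds.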
|
|
|
|
|
#2634 |
|
Jul 2009
Germany
607₁₀ Posts |
Can gpuowl compute on both GK210 chips of the K80 together, or would two instances have roughly the same throughput? Each chip seems to be able to use 12 GB of memory! The benchmarks above are for one instance.
https://wccftech.com/nvidia-tesla-k8...ision-compute/
Last fiddled with by moebius on 2020-12-17 at 22:19 |
|
|
|
|
|
#2635 | |
|
"mrh"
Oct 2018
Temecula, ca
2²×3×5 Posts |
Quote:
-mike
Last fiddled with by mrh on 2020-12-17 at 22:54 |
|
|
|
|
|
|
#2636 |
|
Jul 2009
Germany
1137₈ Posts |
I adopted the exponent from another Mike: https://mersenneforum.org/showpost.p...postcount=2498
This is for my own gpuowl benchmark list, https://mersenneforum.org/showpost.p...postcount=2603, because I think that CUDALucas and gpuowl values cannot necessarily be compared. The exponent lies roughly between the DC and first-time test wavefronts. |
|
|
|
|
|
#2637 | |
|
"mrh"
Oct 2018
Temecula, ca
111100₂ Posts |
Quote:
Code:
2020-12-17 17:00:22 GpuOwl VERSION v7.2-16-g1a50f11-dirty
2020-12-17 17:00:22 GpuOwl VERSION v7.2-16-g1a50f11-dirty
2020-12-17 17:00:22 config: -maxAlloc 8G
2020-12-17 17:00:22 config: -prp 77936867
2020-12-17 17:00:22 device 0, unique id ''
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867 FFT: 4M 1K:8:256 (18.58 bpw)
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867 OpenCL args "-DEXP=77936867u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=8u -DMM_CHAIN=1u -DMM2_CHAIN=2u -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0.33644726404543274 -DIWEIGHT_STEP_MINUS_1=-0.25174750481886216 -DIWEIGHTS={0,-0.25174750481886216,-0.44011820345520131,-0.16213409745771243,-0.37306474779553728,-0.061788266441989627,-0.29798072935699788,-0.47471232907613115,-0.21390437908665341,-0.41180199020062258,-0.11975874301407295,-0.3413572830988989,-0.014337887291734644,-0.26247586476052853,-0.44814572555075455,-0.17414732433395128,} -cl-std=CL2.0 -cl-finite-math-only "
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867 OpenCL compilation in 0.00 s
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867 maxAlloc: 8.0 GB
2020-12-17 17:00:22 GeForce GTX 1070 Ti-0 77936867 P1(0) 0 bits
2020-12-17 17:00:24 GeForce GTX 1070 Ti-0 77936867 OK 72400 on-load: blockSize 400, 70200e82a481024e
2020-12-17 17:00:24 GeForce GTX 1070 Ti-0 77936867 validating proof residues for power 8
2020-12-17 17:00:24 GeForce GTX 1070 Ti-0 77936867 Proof using power 8
2020-12-17 17:00:28 GeForce GTX 1070 Ti-0 77936867 OK 73200 0.09% bcb51c9036cb04da 3438 us/it + check 1.45s + save 0.12s; ETA 3d 02:21
2020-12-17 17:00:52 GeForce GTX 1070 Ti-0 77936867 80000 0.10% 8d76071d27ee4221 3448 us/it
2020-12-17 17:01:27 GeForce GTX 1070 Ti-0 77936867 90000 0.12% 0bacff453b2f470e 3464 us/it
2020-12-17 17:02:01 GeForce GTX 1070 Ti-0 77936867 100000 0.13% 6d7296b9e2830f50 3476 us/it
2020-12-17 17:02:04 GeForce GTX 1070 Ti-0 77936867 Stopping, please wait..
2020-12-17 17:02:06 GeForce GTX 1070 Ti-0 77936867 OK 100800 0.13% 1a5ad3d1c442af96 3501 us/it + check 1.46s + save 0.11s; ETA 3d 03:42
2020-12-17 17:02:06 GeForce GTX 1070 Ti-0 Exiting because "stop requested"
2020-12-17 17:02:06 GeForce GTX 1070 Ti-0 Bye
|
|
|
|
|
|
|
#2638 |
|
"Mike"
Aug 2002
2²×29×71 Posts |
5600 XT
Code:
2020-12-18 13:37:03 gfx1010-0 OpenCL compilation in 2.58 s
2020-12-18 13:37:04 gfx1010-0 77936867 OK 0 loaded: blockSize 400, 0000000000000003
2020-12-18 13:37:06 gfx1010-0 77936867 OK 800 0.00%; 1928 us/it; ETA 1d 17:44; 1579c241dc63eca6 (check 0.82s)
2020-12-18 13:43:38 gfx1010-0 77936867 OK 200000 0.26%; 1962 us/it; ETA 1d 18:22; f0b04b45b0855bd2 (check 0.83s)
2020-12-18 13:50:11 gfx1010-0 77936867 OK 400000 0.51%; 1961 us/it; ETA 1d 18:14; c03f94396a5aa29e (check 0.82s)
2020-12-18 13:56:45 gfx1010-0 77936867 OK 600000 0.77%; 1964 us/it; ETA 1d 18:11; b9decd65ca71b629 (check 0.82s)
2020-12-18 14:03:18 gfx1010-0 77936867 OK 800000 1.03%; 1964 us/it; ETA 1d 18:05; 21ebf3636148f663 (check 0.82s)
2020-12-18 14:09:52 gfx1010-0 77936867 OK 1000000 1.28%; 1964 us/it; ETA 1d 17:59; 9bf9d9e6bff4286e (check 0.83s) |
|
|
|
|
|
#2639 |
|
"Mike"
Aug 2002
202C₁₆ Posts |
RX 560
Code:
2020-12-18 18:13:48 Baffin-0 OpenCL compilation in 1.74 s
2020-12-18 18:13:59 Baffin-0 77936867 OK 800 0.00%; 6983 us/it; ETA 6d 07:10; 1579c241dc63eca6 (check 2.79s)
2020-12-18 18:15:05 Baffin-0 77936867 OK 10000 0.01%; 6833 us/it; ETA 6d 03:55; fc4f135f7cf4ad29 (check 2.78s)
2020-12-18 18:16:16 Baffin-0 77936867 OK 20000 0.03%; 6832 us/it; ETA 6d 03:52; 3cd1bd9d5e09cbc5 (check 2.78s)
2020-12-18 18:17:27 Baffin-0 77936867 OK 30000 0.04%; 6832 us/it; ETA 6d 03:51; c4e0ff35e3290d98 (check 2.77s)
2020-12-18 18:18:38 Baffin-0 77936867 OK 40000 0.05%; 6831 us/it; ETA 6d 03:49; dffe1b1b0d748128 (check 2.78s)
2020-12-18 18:19:49 Baffin-0 77936867 OK 50000 0.06%; 6831 us/it; ETA 6d 03:48; 52e286945371ed29 (check 2.78s)
2020-12-18 18:21:01 Baffin-0 77936867 OK 60000 0.08%; 6832 us/it; ETA 6d 03:47; 0945da4dc08bdd95 (check 2.78s) |
|
|
|
|
|
#2640 |
|
Jul 2009
Germany
607 Posts |
Nvidia Tesla T4
Code:
2020-12-19 02:22:54 config: -carry short -use CARRY32,ORIG_SLOWTRIG,IN_WG=128,IN_SIZEX=16,IN_SPACING=4,OUT_WG=128,OUT_SIZEX=16,OUT_SPACING=4 -nospin -block 100 -maxAlloc 10000 -B1 750000 -rB2 20 -prp 57885161
2020-12-19 02:22:54 device 0, unique id ''
2020-12-19 02:22:54 Tesla T4-0 57885161 FFT: 3M 1K:6:256 (18.40 bpw)
2020-12-19 02:22:54 Tesla T4-0 Expected maximum carry32: 42500000
2020-12-19 02:22:54 Tesla T4-0 OpenCL args "-DEXP=57885161u -DWIDTH=1024u -DSMALL_HEIGHT=256u -DMIDDLE=6u -DPM1=0 -DMAX_ACCURACY=1 -DWEIGHT_STEP_MINUS_1=0x1.07673850f37p-1 -DIWEIGHT_STEP_MINUS_1=-0x1.5bd9e39e14a3dp-2 -DCARRY32=1 -DIN_SIZEX=16 -DIN_SPACING=4 -DIN_WG=128 -DORIG_SLOWTRIG=1 -DOUT_SIZEX=16 -DOUT_SPACING=4 -DOUT_WG=128 -cl-unsafe-math-optimizations -cl-std=CL2.0 -cl-finite-math-only "
2020-12-19 02:22:54 Tesla T4-0
2020-12-19 02:22:54 Tesla T4-0 OpenCL compilation in 0.01 s
2020-12-19 02:22:55 Tesla T4-0 57885161 OK 106400 loaded: blockSize 100, 4df015b749d81753
2020-12-19 02:22:55 Tesla T4-0 validating proof residues for power 8
2020-12-19 02:22:55 Tesla T4-0 Proof using power 8
2020-12-19 02:22:56 Tesla T4-0 57885161 OK 106600 0.18%; 3074 us/it; ETA 2d 01:20; 09d009921e54293a (check 0.42s)
2020-12-19 02:27:59 Tesla T4-0 57885161 OK 200000 0.35%; 3243 us/it; ETA 2d 03:58; de62d6db1ad5092d (check 0.41s)
2020-12-19 02:39:03 Tesla T4-0 57885161 OK 400000 0.69%; 3319 us/it; ETA 2d 04:59; 45e043b36f3556e1 (check 0.41s)
2020-12-19 02:39:55 Tesla T4-0 Stopping, please wait..
2020-12-19 02:39:56 Tesla T4-0 57885161 OK 415600 0.72%; 3318 us/it; ETA 2d 04:59; 82ffeed94bb310b9 (check 0.42s)
2020-12-19 02:39:56 Tesla T4-0 Exiting because "stop requested"
2020-12-19 02:39:56 Tesla T4-0 Bye |
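For anyone collecting these numbers into a benchmark list, the us/it figures can be pulled out of the logs with a script instead of by eye. A rough sketch follows; the regex is a guess at the two progress-line layouts seen in this thread (6.11.380 and v7.2), not something taken from gpuowl, and `parse_progress` and `Progress` are made-up names.

```python
import re
from typing import NamedTuple, Optional


class Progress(NamedTuple):
    device: str
    exponent: int
    iteration: int
    us_per_it: int


# timestamp, device, exponent, optional "OK", iteration, percent, ..., "<n> us/it"
_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} (.+?) (\d+) (?:OK )?(\d+) +\d+\.\d+%.*? (\d+) us/it"
)


def parse_progress(line: str) -> Optional[Progress]:
    """Return the fields of a progress line, or None for any other log line."""
    m = _LINE.match(line)
    if m is None:
        return None
    device, exponent, iteration, us = m.groups()
    return Progress(device, int(exponent), int(iteration), int(us))


samples = [
    "2020-12-19 02:39:56 Tesla T4-0 57885161 OK 415600 0.72%; 3318 us/it; ETA 2d 04:59; 82ffeed94bb310b9 (check 0.42s)",
    "2020-12-17 12:17:46 P102-100-1 77936867 30000 0.04% c4e0ff35e3290d98 2736 us/it",
    "2020-12-19 02:39:55 Tesla T4-0 Stopping, please wait..",
]
for s in samples:
    print(parse_progress(s))
```

Lines without a us/it figure (startup, profile, and shutdown messages) return None, so a whole log can be fed through line by line.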
|
|
|