mersenneforum.org (https://www.mersenneforum.org/index.php)
-   GPU Computing (https://www.mersenneforum.org/forumdisplay.php?f=92)
-   -   CUDALucas (a.k.a. MaclucasFFTW/CUDA 2.3/CUFFTW) (https://www.mersenneforum.org/showthread.php?t=12576)

axn 2015-03-23 14:25

[QUOTE=Karl M Johnson;398407]Got some fresh info about GTX 780 Ti.
[CODE]
69M exp
1045 MHZ GPU clock, 3196 MHz mem clock, iteration time: 4.22 ms/iter
1176 MHz, 3570 MHz mem clock, iteration time: 3.7 ms/iter
[/CODE]

With my GTX Titan, I get 2.76ms/iteration on the same exponent.

Promising data :smile:[/QUOTE]

Hmmm... Mersenne.ca has been updated with new numbers, and now things have swung the other way. The 780 Ti has apparently suffered a reduction in performance (how?) and the Titan is now second. I think Titan Black and Titan Z also need to be rebenchmarked.
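
For scale, the iteration times in the quoted post can be turned into a rough total run time, since a Lucas-Lehmer test of M_p performs p - 2 squarings. The sketch below assumes a round exponent of 69,000,000 (the post only says "69M") and a constant iteration time; the function name is made up for illustration.

```python
# Rough wall-clock estimate for a Lucas-Lehmer test from a measured ms/iter.
# An LL test of M_p performs p - 2 squarings. The exponent is an assumed
# round value standing in for the "69M exp" in the quoted post.

def ll_test_hours(exponent: int, ms_per_iter: float) -> float:
    """Estimated hours for a full LL test at a constant iteration time."""
    iterations = exponent - 2
    return iterations * ms_per_iter / 1000.0 / 3600.0


ASSUMED_EXPONENT = 69_000_000  # hypothetical; the post only says "69M"

for label, ms in [
    ("GTX 780 Ti @ 1045 MHz", 4.22),
    ("GTX 780 Ti @ 1176 MHz", 3.70),
    ("GTX Titan", 2.76),
]:
    hours = ll_test_hours(ASSUMED_EXPONENT, ms)
    print(f"{label}: about {hours:.1f} hours")
```

Real runs will drift from these figures as clocks, temperature, and background load change.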

James Heinrich 2015-03-23 14:42

[QUOTE=Karl M Johnson;398407]Got some fresh info about GTX 780 Ti.[/QUOTE]If you have access to that 780 Ti I would much appreciate a benchmark.

[QUOTE=axn;398408]I think Titan Black and Titan Z also need to be rebenchmarked.[/QUOTE][QUOTE=James Heinrich;398294]What I need now is more benchmarks from both 780 and Titan Black to figure that part out too. Benchmarks from Titan Z and Titan X would be great too[/QUOTE]I would tend to agree with you :smile:

stars10250 2015-03-24 04:00

[CODE]
Device              GeForce GTX 780 Ti
Compatibility       3.5
clockRate (MHz)     1019
memClockRate (MHz)  3574

 fft   max exp   ms/iter
3136  58404433    3.0522
3200  59570449    3.2748
3240  60298969    3.5230
3584  66556463    3.6101
4096  75846319    3.6664
4608  85111207    4.5590
4800  88579669    5.1143
4860  89662967    5.5272
4900  90384989    5.7338
5000  92189509    5.8190

Threads
3500  256   32  3.77875
3528  256  512  3.77450
3584  256  512  3.60728
3600  256  256  3.85989
3645  256  128  4.17807
3675  128  128  4.60588
3750  256   64  4.63702
3780  256   64  4.21302
3840  256  512  4.23805
3888  256  128  4.03913
3920  256   64  4.29968
3969  256   64  4.51075
4000  256  128  3.93409
4032  256  512  4.17210
4050  256   64  4.53155
4096  256   64  3.66282
4116  256   32  5.14177
4200  256  128  4.98391
4320  128  512  4.65157
4374  128  256  5.18404
4375  128  128  5.32028
4410  256   32  5.00846
4480  256  256  4.77655
4500  256  128  4.82445
[/CODE]
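
One way to read the fft / max exp / ms/iter table above: for a given exponent, only rows whose max exp is at least that exponent are usable, and among those the lowest ms/iter wins. The sketch below applies that rule to the posted rows. The function and variable names are made up for illustration, and this is not necessarily how CUDALucas itself chooses an FFT length.

```python
# (fft length, max exponent, ms/iter) rows copied from the benchmark above.
FFT_TABLE = [
    (3136, 58404433, 3.0522),
    (3200, 59570449, 3.2748),
    (3240, 60298969, 3.5230),
    (3584, 66556463, 3.6101),
    (4096, 75846319, 3.6664),
    (4608, 85111207, 4.5590),
    (4800, 88579669, 5.1143),
    (4860, 89662967, 5.5272),
    (4900, 90384989, 5.7338),
    (5000, 92189509, 5.8190),
]


def pick_fft(exponent, table=FFT_TABLE):
    """Return the fastest row whose max exponent covers `exponent`, or None."""
    usable = [row for row in table if row[1] >= exponent]
    return min(usable, key=lambda row: row[2]) if usable else None


# Hypothetical exponent near the "69M" discussed earlier in the thread.
row = pick_fft(69_000_000)
if row is not None:
    fft, max_exp, ms = row
    print(f"fft {fft} (max exp {max_exp}) at {ms} ms/iter")
```

In this table the timings rise with FFT length, so the fastest usable row is also the smallest usable one.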

axn 2015-03-24 04:21

New numbers for Titan X have appeared at mersenne.ca: 30% more throughput compared to the 980.

The TDP for Titan Z is wrong at the site. It should be 375w, not 500w.

James Heinrich 2015-03-24 11:35

[QUOTE=James Heinrich;398410]If you have access to that 780 Ti I would much appreciate a benchmark.[/QUOTE]And by 780 Ti what I really meant was GTX 980.
(but thanks [i]stars10250[/i]).

[QUOTE=axn;398463]The TDP for Titan Z is wrong at the site. It should be 375w, not 500w.[/QUOTE]Fixed, thanks.

vsuite 2015-04-02 23:29

32-bit binary

Is there a recent 32-bit binary for CUDA 4.2, please?

flashjh 2015-04-02 23:53

[QUOTE=vsuite;399243]Is there a 32-bit recent binary for CUDA4.2 please?[/QUOTE]

[URL="http://downloads.sourceforge.net/project/cudalucas/CUDALucas.2.05.1-CUDA4.2-CUDA6.5-Windows-32.64.7z?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fcudalucas%2F&ts=1428018767&use_mirror=iweb"]Here[/URL] you go.

vsuite 2015-04-03 02:37

Thanks flashjh.

Now back in GIMPS. 2.03 appeared to have only 64-bit binaries, so I did not download 2.05. :ermm::whistle:

The GT 640 on Windows 7 64-bit (X2 processor) is slow, so it will run mfaktc only.
The GTX 460 on XP 32-bit (Core 2 Quad) will run both CUDALucas and mfaktc.

bpcvdhelm 2015-04-06 11:00

Compiler optimization
 
Hi guys,

My computer has been running mprime for some time, and for fun I started CUDALucas alongside it. It runs great on my GTX 970!

I compiled the CUDALucas source on my computer. After some inspection of the Makefile I see:
[CODE]
NAME = CUDALucas
VERSION = 2.05.1
OptLevel = 1
[/CODE]

I did some tests with OptLevel 3, and here are the results (a tiny test; by the way, the production CUDALucas was running in the background):

With OptLevel = 1:
[FONT=Courier New][SIZE=1]| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Apr 06 11:29:32 | M110503 10000 0xacb29fc05973d0a8 | 8K 0.00001 0.2062 2.06s | 0:20 9.04% |
| Apr 06 11:29:34 | M110503 20000 0x9cd7ca8aa594b33c | 8K 0.00001 0.2060 2.06s | 0:18 18.09% |
| Apr 06 11:29:36 | M110503 30000 0xba1ef4f09a7c955a | 8K 0.00001 0.2062 2.06s | 0:16 27.14% |
| Apr 06 11:29:38 | M110503 40000 0x827b27dad4e98554 | 8K 0.00001 0.2060 2.06s | 0:14 36.19% |
| Apr 06 11:29:40 | M110503 50000 0x9e6c039053cc2c17 | 8K 0.00001 0.2061 2.06s | 0:12 45.24% |
| Apr 06 11:29:42 | M110503 60000 0xdb48afced9ebd397 | 8K 0.00001 0.2060 2.06s | 0:10 54.29% |
| Apr 06 11:29:44 | M110503 70000 0xd650094b406761ed | 8K 0.00001 0.2061 2.06s | 0:08 63.34% |
| Apr 06 11:29:46 | M110503 80000 0xa4d69c031cb0caa2 | 8K 0.00001 0.2060 2.06s | 0:06 72.39% |
| Apr 06 11:29:48 | M110503 90000 0xf1427358e52c1458 | 8K 0.00001 0.2060 2.06s | 0:04 81.44% |
| Apr 06 11:29:50 | M110503 100000 0x0f4385fec05eb193 | 8K 0.00001 0.2060 2.06s | 0:02 90.49% |
| Apr 06 11:29:52 | M110503 110000 0xc5bb3186236db9db | 8K 0.00001 0.2061 2.06s | 0:00 99.54% |[/SIZE][/FONT]

But with OptLevel = 3:
[FONT=Courier New][SIZE=1]| Date Time | Test Num Iter Residue | FFT Error ms/It Time | ETA Done |
| Apr 06 11:30:19 | M110503 10000 0xacb29fc05973d0a8 | 8K 0.00001 0.2058 2.05s | 0:20 9.04% |
| Apr 06 11:30:21 | M110503 20000 0x9cd7ca8aa594b33c | 8K 0.00001 0.2058 2.05s | 0:18 18.09% |
| Apr 06 11:30:23 | M110503 30000 0xba1ef4f09a7c955a | 8K 0.00001 0.2058 2.05s | 0:16 27.14% |
| Apr 06 11:30:25 | M110503 40000 0x827b27dad4e98554 | 8K 0.00001 0.2057 2.05s | 0:14 36.19% |
| Apr 06 11:30:27 | M110503 50000 0x9e6c039053cc2c17 | 8K 0.00001 0.2059 2.05s | 0:12 45.24% |
| Apr 06 11:30:29 | M110503 60000 0xdb48afced9ebd397 | 8K 0.00001 0.2057 2.05s | 0:10 54.29% |
| Apr 06 11:30:31 | M110503 70000 0xd650094b406761ed | 8K 0.00001 0.2046 2.04s | 0:08 63.34% |
| Apr 06 11:30:34 | M110503 80000 0xa4d69c031cb0caa2 | 8K 0.00001 0.2057 2.05s | 0:06 72.39% |
| Apr 06 11:30:36 | M110503 90000 0xf1427358e52c1458 | 8K 0.00001 0.2059 2.05s | 0:04 81.44% |
| Apr 06 11:30:38 | M110503 100000 0x0f4385fec05eb193 | 8K 0.00001 0.2058 2.05s | 0:02 90.49% |
| Apr 06 11:30:40 | M110503 110000 0xc5bb3186236db9db | 8K 0.00001 0.2059 2.05s | 0:00 99.54% |[/SIZE][/FONT]

Why isn't the default setting 3? See the man pages for gcc!

PS: For now I am leaving my (production) CUDALucas on OptLevel=1.
PS2: I have some experience with C programming. I was one of the hercules-390 (mainframe emulator) developers for 12 years. My expertise was performance; maybe I can help?

Kind regards,

Bernard van der Helm
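
As a rough check on the two logs above, the ms/It columns can be averaged and compared. The sketch below does only that arithmetic on the posted values. Each run covers about 20 seconds and another CUDALucas instance was running in the background, so the result is indicative at best; the function name is made up for illustration.

```python
from statistics import mean

# ms/It values copied from the two logs above, in order of appearance.
OPT1 = [0.2062, 0.2060, 0.2062, 0.2060, 0.2061, 0.2060,
        0.2061, 0.2060, 0.2060, 0.2060, 0.2061]
OPT3 = [0.2058, 0.2058, 0.2058, 0.2057, 0.2059, 0.2057,
        0.2046, 0.2057, 0.2059, 0.2058, 0.2059]


def relative_gain(baseline, candidate):
    """Fractional reduction in mean iteration time versus the baseline."""
    return (mean(baseline) - mean(candidate)) / mean(baseline)


gain = relative_gain(OPT1, OPT3)
print(f"OptLevel 1 mean: {mean(OPT1):.4f} ms/iter")
print(f"OptLevel 3 mean: {mean(OPT3):.4f} ms/iter")
print(f"Relative gain:   {gain * 100:.2f}%")
```

On these numbers the difference is a fraction of a percent, so a longer run on an otherwise idle GPU would be needed before drawing conclusions.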

MacFactor 2015-07-13 02:29

No help here? Is ANYONE running CUDALucas under Linux Mint?
 
I haven't compiled anything outside of an IDE, but if someone can give me a clue as to what a 'make' command (which switches and options) would look like, I'll give it a try. I hate to have a couple of GPUs underutilized just because the OS can't find a file that is sitting right there.

MF

Dubslow 2015-07-13 04:38

Could you be more specific about what the error is and what you've already tried?


All times are UTC.