mersenneforum.org  

Old 2018-09-03, 20:39   #2685
ECPilot
 
"Eric Clements"
Mar 2017
United Kingdom

110 Posts
Default

tdulcet, the install script works beautifully for CUDALucas on Ubuntu laptops. Thank you for this.
Old 2018-09-27, 11:15   #2686
Lorenzo
 
Lorenzo's Avatar
 
Aug 2010
Republic of Belarus

2·89 Posts
Default

Hello! Is the result at http://www.mersenne.ca/cudalucas.php?model=745 true? To me it looks doubtful, because it puts this card's performance near the Titan V, which has a 1/2 DP rate (unlike the 2080 Ti, which has only 1/32).

Can someone confirm this?
Old 2018-09-27, 14:24   #2687
tServo
 
tServo's Avatar
 
"Marv"
May 2009
near the Tannhäuser Gate

3·269 Posts
Default

Quote:
Originally Posted by Lorenzo
Hello! Is the result at http://www.mersenne.ca/cudalucas.php?model=745 true? To me it looks doubtful, because it puts this card's performance near the Titan V, which has a 1/2 DP rate (unlike the 2080 Ti, which has only 1/32).

Can someone confirm this?

It doesn't make sense to me either. The best predictor for performance should be the Gflops(DP) column. Thus, the entry for GTX 1080 Ti also looks suspect.
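
As a rough illustration of the Gflops(DP) argument, nominal double-precision throughput can be estimated as 2 FLOPs (one fused multiply-add) per core per clock, scaled by the card's DP:SP ratio. This is only a back-of-the-envelope sketch: the 1/2 and 1/32 ratios come from the posts above, while the core counts and clocks are nominal published figures entered here as assumptions, and the function name is made up for the example.

```python
def nominal_dp_gflops(cores, clock_mhz, dp_ratio):
    """Nominal FP64 GFLOPS: 2 FLOPs (one FMA) per core per clock, scaled by the DP:SP ratio."""
    return 2 * cores * clock_mhz / 1000.0 * dp_ratio

# Core counts and clocks are ASSUMED nominal specifications, not measurements.
cards = {
    "Titan V":     nominal_dp_gflops(cores=5120, clock_mhz=1455, dp_ratio=1 / 2),
    "RTX 2080 Ti": nominal_dp_gflops(cores=4352, clock_mhz=1545, dp_ratio=1 / 32),
}

for name, gflops in cards.items():
    print(f"{name:12s} {gflops:8.1f} GFLOPS (DP)")

ratio = cards["Titan V"] / cards["RTX 2080 Ti"]
print(f"Titan V / RTX 2080 Ti: {ratio:.1f}x")
```

On paper the gap is more than an order of magnitude, which is why a benchmark entry close to the Titan V looked implausible.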
Old 2018-09-27, 14:24   #2688
xx005fs
 
"Eric"
Jan 2018
USA

223 Posts
Default

Quote:
Originally Posted by Lorenzo
Hello! Is the result at http://www.mersenne.ca/cudalucas.php?model=745 true? To me it looks doubtful, because it puts this card's performance near the Titan V, which has a 1/2 DP rate (unlike the 2080 Ti, which has only 1/32).

Can someone confirm this?

I noticed that too, and I'm fairly sure this is a placeholder: with such a large DP deficit relative to the Titan V, there is no way the 2080 Ti is within 20% of it. We will have to see.
Old 2018-09-27, 14:27   #2689
xx005fs
 
"Eric"
Jan 2018
USA

337₈ Posts
Default

Quote:
Originally Posted by tServo
It doesn't make sense to me either. The best predictor for performance should be the Gflops(DP) column. Thus, the entry for GTX 1080 Ti also looks suspect.

The AMD cards on the site seem really slow. Is that because they are running clLucas rather than gpuowl? I think the AMD cards should be listed with their gpuowl speed, which would reflect real-world performance better because gpuowl is significantly faster. For example, for 85M exponents the site says a Vega 64 Liquid gets 3.5 ms/it, whereas my Vega 56 undervolted to 1480/1080 runs at 2.05 ms/it on gpuowl, which is about 40% less time per iteration.
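
For anyone checking the arithmetic, "faster" can be read two ways: less time per iteration, or more iterations per second. A small sketch using only the two ms/it figures quoted in the post (the function name is just for illustration):

```python
def compare_ms_per_iter(slow_ms, fast_ms):
    """Return (percent less time per iteration, percent more iterations per second)."""
    less_time = (1 - fast_ms / slow_ms) * 100
    more_throughput = (slow_ms / fast_ms - 1) * 100
    return less_time, more_throughput

# 3.5 ms/it is the site's Vega 64 Liquid figure; 2.05 ms/it is the Vega 56 on gpuowl.
less_time, more_throughput = compare_ms_per_iter(slow_ms=3.5, fast_ms=2.05)
print(f"{less_time:.1f}% less time per iteration")
print(f"{more_throughput:.1f}% more iterations per second")
```

So the roughly 40% figure corresponds to time per iteration; measured as throughput the difference is larger.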
Old 2018-10-05, 22:25   #2690
TheJudger
 
TheJudger's Avatar
 
"Oliver"
Mar 2005
Germany

5·223 Posts
Default

CUDA 10.0.130, CUDA driver 410.57, CUDALucas 2.05.1 (SVN rev. 99)

Benchmark FFT sizes './CUDALucas -cufftbench 2048 32768 20'
Code:
Device              GeForce RTX 2080 Ti
Compatibility       7.5
clockRate (MHz)     1635
memClockRate (MHz)  7000

  fft    max exp  ms/iter
 2048   38492887   1.0627
 2304   43194913   1.2857
 2592   48471289   1.5457
 2700   50446621   1.7518
 2744   51250889   1.8380
 2880   53735041   1.8585
 3200   59570449   1.8635
 3456   64229677   1.9027
 4096   75846319   2.0373
 4608   85111207   2.6288
 5184   95507747   3.0856
 5400   99399967   3.5222
 5760  105879517   3.5851
 5832  107174381   3.6607
 6400  117377567   3.8294
 6912  126558077   3.9194
 7168  131142761   4.5456
 8192  149447533   4.8361
 8748  159365399   5.5607
 9216  167703023   5.5871
10368  188188471   5.8853
10584  192023851   7.2167
11520  208624903   7.3206
11664  211176269   7.6460
12544  226753511   7.8723
12800  231280639   7.9574
13824  249369863   8.1618
16384  294471259   9.3144
17496  314013451  11.1805
18432  330441847  11.7295
20736  370806323  12.0643
21952  392070229  14.8799
22400  399897793  15.1170
23040  411074273  15.1771
24192  431175197  15.5346
25088  446794913  16.3743
26244  466929581  17.0977
27648  491358173  17.8123
32768  580225813  18.1072
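
A table like this can be used to pick an FFT length for a given exponent: keep the rows whose "max exp" covers the exponent and take the one with the lowest ms/iter. The sketch below does that for a handful of rows copied from the table; the helper name is hypothetical, and the bits-per-word figure assumes that "K" means 1024 words.

```python
# (FFT length in K, max exponent, ms/iter): a few rows copied from the table above.
BENCH = [
    (2880, 53735041, 1.8585),
    (3200, 59570449, 1.8635),
    (3456, 64229677, 1.9027),
    (18432, 330441847, 11.7295),
    (20736, 370806323, 12.0643),
    (21952, 392070229, 14.8799),
]

def fastest_fft(exponent, bench=BENCH):
    """Pick the lowest ms/iter among FFT lengths whose max exponent covers `exponent`."""
    usable = [row for row in bench if row[1] >= exponent]
    if not usable:
        raise ValueError("exponent exceeds every benchmarked FFT length")
    return min(usable, key=lambda row: row[2])

for p in (57885161, 332192879):
    fft_k, max_exp, ms = fastest_fft(p)
    bits_per_word = p / (fft_k * 1024)  # assumes K = 1024 words
    print(f"M{p}: {fft_k}K FFT, {ms} ms/iter, {bits_per_word:.2f} bits per FFT word")
```

For these two exponents the selection gives 3200K and 20736K, the same FFT lengths that appear in the runs that follow.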
And benchmark for mersenne.ca: './CUDALucas 57885161'
Code:
|   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
|  Oct 05  23:53:14  |  M57885161     10000  0x76c27556683cd84d  |  3200K  0.10156   1.8599   18.59s  |   1:05:54:02   0.01%  |
|  Oct 05  23:53:33  |  M57885161     20000  0xfd8e311d20ffe6ab  |  3200K  0.10156   1.8648   18.64s  |   1:05:56:06   0.03%  |
|  Oct 05  23:53:52  |  M57885161     30000  0xce0d85ab0065a232  |  3200K  0.10156   1.8695   18.69s  |   1:05:58:05   0.05%  |
[...]
|  Oct 05  23:57:39  |  M57885161    150000  0x8e9733fee4029132  |  3200K  0.09375   1.8939   18.93s  |   1:06:17:10   0.25%  |
|  Oct 05  23:57:58  |  M57885161    160000  0x0b5dadf12ed96a4d  |  3200K  0.10156   1.8932   18.93s  |   1:06:17:09   0.27%  |
|  Oct 05  23:58:17  |  M57885161    170000  0x69754eac9cc190a5  |  3200K  0.10938   1.8932   18.93s  |   1:06:17:05   0.29%  |
~220W and 1860-1875MHz on average once GPU is heated up.
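
The ETA column can be sanity-checked from ms/It alone. A minimal sketch, assuming a Lucas-Lehmer test needs p - 2 iterations in total and that the rate stays constant (both function names are invented for the example):

```python
def eta_seconds(exponent, iterations_done, ms_per_iter):
    """Remaining wall time, assuming p - 2 total iterations at a constant rate."""
    remaining = (exponent - 2) - iterations_done
    return remaining * ms_per_iter / 1000.0

def format_eta(seconds):
    """Format as days:hours:minutes:seconds, the layout used in the ETA column."""
    total = int(seconds)
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}:{hours:02d}:{minutes:02d}:{secs:02d}"

# First logged row of the M57885161 run: 10000 iterations done at 1.8599 ms/iter.
print(format_eta(eta_seconds(57885161, 10000, 1.8599)))
```

With the rounded figures from the first row this lands on the ETA printed there. Later rows drift from this naive estimate, so CUDALucas evidently derives its rate differently; how exactly is not shown in the thread.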

And 100M digits (manually set FFT size): './CUDALucas -f 20736K 332192879'
Code:
|   Date     Time    |   Test Num     Iter        Residue        |    FFT   Error     ms/It     Time  |       ETA      Done   |
|  Oct 06  00:16:45  | M332192879     10000  0xa19043095e213f4c  | 20736K  0.01953  12.1000  121.00s  |  46:12:30:39   0.00%  |
|  Oct 06  00:18:47  | M332192879     20000  0xcb7bc66ac81b24be  | 20736K  0.01758  12.1699  121.69s  |  46:15:42:00   0.00%  |
|  Oct 06  00:20:48  | M332192879     30000  0x38e4cc517de8fda3  | 20736K  0.01758  12.1660  121.66s  |  46:16:37:11   0.00%  |
Oliver
Old 2018-10-06, 00:47   #2691
xx005fs
 
"Eric"
Jan 2018
USA

DF₁₆ Posts
Default Wrong Result for Volta Architecture?

Hi Guys. I noticed that I am consistently getting 0x0000000000000000 for the residue at every iteration output on Nvidia Volta hardware. The error output for CUDALucas also says 0.00. This is reproducible on both a Tesla V100 instance and a Titan V GPU. Is there any problem with the settings I use or do I have to do something else to fix it?

Last fiddled with by xx005fs on 2018-10-06 at 00:55
Old 2018-10-06, 01:16   #2692
James Heinrich
 
James Heinrich's Avatar
 
"James Heinrich"
May 2004
ex-Northern Ontario

4277₁₀ Posts
Default

Quote:
Originally Posted by TheJudger
And benchmark for mersenne.ca: './CUDALucas 57885161'

Thanks, the benchmark page is now updated with the single result. Does this look about right, with the 2080 significantly slower than the V100?
https://www.mersenne.ca/cudalucas.php?filter=V100|2080
Old 2018-10-06, 08:48   #2693
Lorenzo
 
Lorenzo's Avatar
 
Aug 2010
Republic of Belarus

2·89 Posts
Default

Hello, Oliver!

Thank you very much for the benchmark!!!

Ehhh. Performance is lower than the GTX 1080 Ti. A really bad choice for today (for LL).
Old 2018-10-06, 12:54   #2694
kriesel
 
kriesel's Avatar
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

1E90₁₆ Posts
Default

Quote:
Originally Posted by xx005fs
Hi Guys. I noticed that I am consistently getting 0x0000000000000000 for the residue at every iteration output on Nvidia Volta hardware. The error output for CUDALucas also says 0.00. This is reproducible on both a Tesla V100 instance and a Titan V GPU. Is there any problem with the settings I use or do I have to do something else to fix it?

"Is there any problem with the settings I use or do I have to do something else to fix it?"
It's hard to say without knowing the CUDALucas version, CUDA level, exponent(s), FFT length(s), or settings in use when you see this behavior, or whether you get any correct results on Volta. Please provide some specifics of when you see this, and also of when you don't. Please also say whether replication on other models is by continuation, by restart from scratch, or something else.

Yes, 0x0 for any printed residue before the last iteration (or before the second-to-last, when exponent p>127) is a problem.
You could look through https://www.mersenneforum.org/showpo...24&postcount=3 for 0x0 cases.
It could be a bug for which there's a workaround patch available, a known bug with no known fix, a previously undocumented (newly found) bug, or a setting issue.
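
To see why an early zero residue cannot be right, here is a minimal Lucas-Lehmer sketch in plain Python with full-width residues (CUDALucas prints only the low 64 bits, which is presumably why the second-to-last iteration is also excused for large exponents). The function names are made up for the illustration, and the loop is far too slow for real exponents.

```python
def ll_residues(p):
    """Yield (iteration, residue) for the Lucas-Lehmer test of 2**p - 1, iterations 1..p-2."""
    m = (1 << p) - 1
    s = 4
    for i in range(1, p - 1):
        s = (s * s - 2) % m
        yield i, s

def first_zero_iteration(p):
    """Return the first iteration whose residue is 0, or None if there is none."""
    for i, s in ll_residues(p):
        if s == 0:
            return i
    return None

# 2**11 - 1 is composite; 2**13 - 1, 2**17 - 1 and 2**19 - 1 are prime.
for p in (11, 13, 17, 19):
    print(p, first_zero_iteration(p))
```

For the prime cases the zero shows up only at iteration p - 2, and for p = 11 it never shows up. If a residue were zero earlier, the following ones would be M - 2, then 2, then 2 forever, so a genuine run can never report zero at every checkpoint.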

Last fiddled with by kriesel on 2018-10-06 at 12:59
Old 2018-10-06, 17:46   #2695
TheJudger
 
TheJudger's Avatar
 
"Oliver"
Mar 2005
Germany

5×223 Posts
Default

Quote:
Originally Posted by xx005fs
Hi Guys. I noticed that I am consistently getting 0x0000000000000000 for the residue at every iteration output on Nvidia Volta hardware. The error output for CUDALucas also says 0.00. This is reproducible on both a Tesla V100 instance and a Titan V GPU. Is there any problem with the settings I use or do I have to do something else to fix it?

I had the same issue yesterday when running benchmarks on the RTX 2080 Ti. In my case it was a combination of user error and lack of error checking. I had compiled CUDALucas for a quick & dirty benchmark for sm75 (Turing) only. Then I decided to check the performance of Volta with CUDA 10.0, too. The binary ran without any warnings or error messages, and the benchmarks showed an improvement of nearly 50% over my previous benchmark. But during the first 30000 iterations of M57885161 for James I noticed those 0x0000000000000000 residues and knew something was wrong. Recompiling CUDALucas for sm70 solved the issue, and performance was back at the same level as with CUDA 9.1/9.2.

I know it is not perfect but in mfaktc I do
Code:
cudaError = cudaGetLastError();
if(cudaError != cudaSuccess)
  printf("ERROR: cudaGetLastError() returned %d: %s\n", cudaError, cudaGetErrorString(cudaError));
every now and then. Seen from the host, those CUDA kernel launches are asynchronous, so there is no return value and you have to ask for errors later. Such a check would catch this type of error easily, and it is the main (but not the only) source of the famous
Code:
ERROR: cudaGetLastError() returned 8: invalid device function
in mfaktc, for example when running old binaries on an RTX 2080.

Oliver

Last fiddled with by TheJudger on 2018-10-06 at 17:47
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
