#12
Ralf Recker
Oct 2010
191 Posts
Quote:

The GPU ran at its standard (factory-overclocked) settings @ 725 MHz core / 1450 MHz shaders / 900 MHz RAM (GDDR5) @ 1.012 V core (there is currently no working overclocking under Linux, but the card crunched without any problems @ 850 MHz / 1700 MHz / 1000 MHz under Windows). The temperatures during the test were around 45-48 degrees Celsius with the case closed.

The Q9550 runs with 1.1625 V at its 2.83 GHz stock clock and with 1.20 V at 3.4 GHz. It is possible to run 3.4 GHz at lower voltages, but I wanted to eliminate any risk of returning wrong results while crunching on the PRP or LLR net.

Last fiddled with by Ralf Recker on 2011-01-09 at 11:32

#13
Ralf Recker
Oct 2010
191 Posts
I have two questions: Which OS and drivers did you use? Was the CPU otherwise idle or under load? The runtime differences (compared to my GTX 460) are significant. If your CPU was under load, the slowdown could be caused by GPU starvation...
Last fiddled with by Ralf Recker on 2011-01-09 at 11:31 |

#14
msft
Jul 2009
Tokyo
2·5·61 Posts
Dear Mr. Jean Penné,

Happy New Year to you! I hope this year will be the happiest and best for you.

Yours sincerely,
Shoichiro Yamada

Last fiddled with by msft on 2011-01-09 at 13:31

#15
msft
Jul 2009
Tokyo
2·5·61 Posts
Hi, Ralf Recker

Quote:

Driver: devdriver_3.2_linux_64_260.19.26.run. mprime was executing at the same time.

#16
Ralf Recker
Oct 2010
191 Posts
I installed the 260.19.26 dev drivers and the 260.19.29 drivers (together with CUDA SDK 3.2) after my initial test (256.53 drivers and CUDA SDK 3.1) and saw a significant slowdown of the (BOINC/PrimeGrid) CWPSieve (CUDA) work units, from 2.5 minutes/WU to around 3.75 minutes/WU (CPU idle) or even 4.5 minutes (CPU under load, sieving for NFS@Home). Since (older?) versions of Ken-g6's PPSieve (CUDA) and TPSieve (CUDA) also work much better with the 256 drivers, I downgraded to the 256.53 drivers (and SDK 3.1) again.
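For reference, which driver and toolkit are actually active can be checked from the shell (generic NVIDIA Linux commands, assuming the NVIDIA kernel module is loaded and the toolkit's bin directory is on the PATH; nothing here is specific to this box):

$ cat /proc/driver/nvidia/version   # reports the loaded kernel driver, e.g. 256.53
$ nvcc --version                    # reports the installed CUDA toolkit release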
Last fiddled with by Ralf Recker on 2011-01-09 at 13:56 |

#17
Karl M Johnson
Mar 2010
3×137 Posts
Quote:

NV messed something up in 260.xx. OpenCL and CUDA apps (at least the Driver API ones) compiled with 260.xx don't work on older drivers (like 258.96). OpenCL suffers a major slowdown (up to 25%) in performance, while CUDA loses up to 5%. Those are my observations on the applications I use. What's curious is that using 260.xx with the 3.1 toolkit/SDK doesn't solve anything, so it's definitely the drivers. 258.96 are the last proper drivers for Windows; I use them. For Linux, yeah, 256.53 are the last proper ones too.

Last fiddled with by Karl M Johnson on 2011-01-09 at 14:18

#18
msft
Jul 2009
Tokyo
2·5·61 Posts
$ time ./llrCUDA -q5*2^1282755+1

real    19m42.907s
user    8m58.070s
sys     7m21.650s

$ cat lresults.txt
5*2^1282755+1 is prime! Time : 1182.847 sec.

#19
Ralf Recker
Oct 2010
191 Posts
Looks like it reached the speed of a single CPU core.

ralf@quadriga ~/llrcuda.0.11 $ time ./llrCUDA -q5*2^1282755+1 -d
Starting Proth prime test of 5*2^1282755+1, FFTLEN = 131072 ; a = 3
5*2^1282755+1, bit: 40000 / 1282757 [3.11%]. Time per bit: 0.819 ms.

To quote myself:

Quote:

Update:

ralf@quadriga ~/llrcuda.0.11 $ time ./llrCUDA -q5*2^1282755+1 -d
Starting Proth prime test of 5*2^1282755+1, FFTLEN = 131072 ; a = 3
5*2^1282755+1 is prime! Time : 1056.916 sec.

real    17m37.006s
user    4m51.894s
sys     9m27.603s

ralf@quadriga ~/llrcuda.0.11 $ cat lresults.txt
Bit 32105 / 1282757
5*2^1282755+1 is prime! Time : 1056.916 sec.

Last fiddled with by Ralf Recker on 2011-01-09 at 16:29

#20
Ralf Recker
Oct 2010
191 Posts
Quote:

Code compiled for sm_20 shows no significant difference:

ralf@quadriga ~/llrcuda.0.11 $ time ./llrCUDA -q5*2^1282755+1 -d
Starting Proth prime test of 5*2^1282755+1, FFTLEN = 131072 ; a = 3
5*2^1282755+1 is prime! Time : 1053.793 sec.

while code compiled for sm_21 is noticeably slower: Time per bit: 0.865 ms.

Last fiddled with by Ralf Recker on 2011-01-09 at 17:20
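For anyone rebuilding llrCUDA for a specific card, the compute architecture is normally selected through nvcc's -arch flag; a minimal sketch, assuming a Fermi-class card and using a placeholder source file name (the actual llrCUDA build files and Makefile variables may differ):

$ nvcc -O2 -arch=sm_20 -c llr_kernels.cu   # GF100 parts such as the GTX 470/480 (llr_kernels.cu is a placeholder name)
$ nvcc -O2 -arch=sm_21 -c llr_kernels.cu   # GF104/GF114 parts such as the GTX 460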

#21
em99010pepe
Sep 2004
2830 Posts
The next step is to be 4x faster than the CPU.

EDIT: Where can I find a list of all Nvidia GTX TDPs?

Last fiddled with by em99010pepe on 2011-01-09 at 19:02

#22
em99010pepe
Sep 2004
2830 Posts
For:

GPU: GTX 470, TDP of 215 W
CPU: Q9550, TDP of 121 W (4 cores at 3.6 GHz)
CPU timing for 5*2^1282755+1: 1056 secs

the GPU client should be 7.1 times faster than the CPU version to achieve the same watts-per-candidate ratio as the CPU LLR client over 24 hours. That is the optimum for this pair, Q9550 and GTX 470.

Last fiddled with by em99010pepe on 2011-01-09 at 20:00
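A plausible reconstruction of the 7.1 figure, assuming all four CPU cores each finish one candidate in 1056 s and counting only the TDP values above: the CPU spends 121 W × 1056 s / 4 ≈ 31.9 kJ per candidate, so at 215 W the GPU would have to finish a candidate in roughly 31,900 J / 215 W ≈ 149 s, which is 1056 / 149 ≈ 7.1 times faster than a single core (equivalently, 215/121 × 4 ≈ 7.1).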

Similar Threads

| Thread | Thread Starter | Forum | Replies | Last Post |
| LLRcuda | shanecruise | Riesel Prime Search | 8 | 2014-09-16 02:09 |
| LLRCUDA - getting it to work | diep | GPU Computing | 1 | 2013-10-02 12:12 |