Hi sanzo,
your GeForce 210 is based on the GT218 chip. GT218 has compute capability 1.2, which means [B]no[/B] double precision, so this chip can't run msft's code. I also haven't noticed any Windows binary of this code.

This GPU doesn't have much computing power compared to high-end GPUs. The gap between a current low-end GPU and a current high-end GPU is much larger than it is for CPUs.

[B]Simplified comparison:[/B]
CPU low-end: 2 cores @ 2GHz
CPU high-end: 6 cores @ 3.3GHz
(6 * 3.3GHz) / (2 * 2GHz) = ~5 times

GPU low-end (GeForce 205): 8 CUDA cores @ 1402MHz
GPU high-end (GeForce GTX 480): 480 CUDA cores @ 1401MHz
difference: ~60 times

Oliver
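For what it's worth, the back-of-the-envelope ratios above (cores times clock, using exactly the figures quoted in the post) can be checked with a one-liner:

```shell
# Rough throughput ratio: cores * clock (MHz). Figures as quoted above;
# this ignores architecture differences, so treat it as a rough estimate only.
# CPU: high-end (6 cores @ 3300MHz) vs low-end (2 cores @ 2000MHz) -> ~5x
awk 'BEGIN { printf "CPU: %.2fx\n", (6*3300)/(2*2000) }'
# GPU: GTX 480 (480 cores @ 1401MHz) vs GeForce 205 (8 cores @ 1402MHz) -> ~60x
awk 'BEGIN { printf "GPU: %.2fx\n", (480*1401)/(8*1402) }'
```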
Thanks [URL="http://www.mersenneforum.org/member.php?u=1696"]TheJudger [/URL]
In that case I'll wait for software that supports my ATI 5870 :) Thanks, all!
[QUOTE=TheJudger;226993]Hi frmky,
[QUOTE=frmky]I don't think I can. This is a Linux compute node with no X installed. nvidia-settings complains about the lack of libX. nvidia-smi doesn't seem to be able to adjust the memory clock. Do you know of a Linux command line utility that will adjust it?[/QUOTE] No X is my problem, too. :smile: I can try on my private computer next weekend (GTX 470). Oliver[/QUOTE]

I forgot to mention that I was unable to change the clock rates of my GF100 with the 256.35 and 256.40 drivers... Google says others can't either. :sad:

Oliver
[QUOTE=frmky;226773]Really? I was expecting much better than that. From the GTX 260 all the way up to the GTX 480, the speed has scaled linearly with the frequency and number of DP units with no sign of being bandwidth limited. On the GTX 480, I'm getting nearly the same speed as you have posted using a 64-bit binary.
Edit: To rule out a weird compiler issue, can you try the binary at [URL="http://physics.fullerton.edu/gchilders/verS.tar.gz"]http://physics.fullerton.edu/gchilders/verS.tar.gz[/URL]? I've included the CUDA library files, so you can run with, for example, LD_LIBRARY_PATH=. ./MacLucasFFTW 24036583 to test the 2M FFT.[/QUOTE]

With Greg's binary, our NVIDIA C2050 (Fermi) reports 4.371 ms/iteration for the 2M FFT with ECC off, starting with 24036583, which would appear to confirm TheJudger's timing. A test with 4M crashed, but the machine has just been unpacked, so we may need to load some more libraries. Two boards.

And the view here is that GPUs won't be useful for Mersenne primes? The BOINC projects with GPU applications run circles around CPU apps (such as sieving under NFS@Home). I'm just over 28M credits of CPU computing on sieving from almost a year, while people running GPU apps are getting 1M/day. Not that I'm seeing anything useful from Collatz et al.; I'm interested in NFS polynomial selection with msieve. -Bruce
Hi Bruce,
you're still faster than a current quad-core desktop with a single GTX 480, but the speedup is smaller than in other projects. Perhaps that's because George's CPU implementation is tweaked very, very well while msft's GPU code is "just using a generic FFT implementation" ([B]no[/B] offense, msft!). Or perhaps an LL test isn't well suited to GPUs.

Oliver
[QUOTE=frmky;194992]Version k runs at .0141 sec/iter for the 2048K FFT and .0264 sec/iter for the 4096K FFT on the C1060.[/QUOTE]
Uhm, I looked up the sec <---> ms conversion: 4.371 ms/iter translates to 0.004371 sec/iter. Not sure about the other comparisons, but Tesla 2 (Fermi) seems to be doing well relative to Tesla 1, yes? (That's C1060 vs. C2050.) About 3 times quicker?

-bd ("Mr Obvious")
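As a quick sanity check on the "about 3 times" figure (using the two timings quoted in this thread, 0.0141 s/iter on the C1060 and 4.371 ms/iter on the C2050, both for the 2048K FFT):

```shell
# Convert 4.371 ms/iter to seconds and divide: C1060 time / C2050 time.
# Prints the speedup of the C2050 (Fermi) over the C1060 (Tesla 1), ~3.2x.
awk 'BEGIN { printf "speedup: %.1fx\n", 0.0141 / (4.371 / 1000) }'
```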
[QUOTE=frmky;226773]Really? I was expecting much better than that. From the GTX 260 all the way up to the GTX 480, the speed has scaled linearly with the frequency and number of DP units with no sign of being bandwidth limited. On the GTX 480, I'm getting nearly the same speed as you have posted using a 64-bit binary.
Edit: To rule out a weird compiler issue, can you try the binary at [URL="http://physics.fullerton.edu/gchilders/verS.tar.gz"]http://physics.fullerton.edu/gchilders/verS.tar.gz[/URL]? I've included the CUDA library files, so you can run with, for example, LD_LIBRARY_PATH=. ./MacLucasFFTW 24036583 to test the 2M FFT.[/QUOTE]

10.4 ms/iteration on my GTX 275. :smile:

Luigi
But only 11.283 ms/iteration for 35000293! :smile:
Luigi |
[QUOTE=bdodson;228298]
The BOINC projects with GPU applications run circles around CPU apps (such as sieving under NFS@Home). I'm just over 28M credits of CPU computing on sieving from almost a year, while people running GPU apps are getting 1M/day. Not that I'm seeing anything useful from Collatz et al.; I'm interested in NFS polynomial selection with msieve. -Bruce[/QUOTE]

Note that with Paul Zimmermann's help I've figured out a lot more about how Kleinjung's improved algorithm works, and if I ever get the time to overhaul the CPU code, polynomial selection with msieve can be made much more efficient. The problem is that the changes I'm considering (i.e. using big hashtables) will not work well on a GPU.
[QUOTE=bdodson;228298]And the view here is that the GPUs won't be useful
for Mersenne primes?[/QUOTE] I was hoping for a bit more speed, but realistically it is "only" the fastest single LL test that I'm aware of right now. Not to mention that the CPU usage is tiny, so it can be run in parallel with calculations on the CPU. Granted, it's not 50x or 100x a CPU, but it's still fast! :smile:

I don't have time to put into it right now, but with George's blessing it wouldn't be difficult to incorporate this into a BOINC project. PrimeGrid would probably be the best home if they're interested. It would probably give GIMPS quite a boost.
Well, my friend finally received and installed his GTX 460. I've installed the CUDA SDK tools and successfully compiled MacLucasFFTW. The problem, though, is when I try to run it:
[code]gary@Buttford:~/Desktop/gpu-stuff/MacLucasFFTW$ ./MacLucasFFTW
./MacLucasFFTW: error while loading shared libraries: libcudart.so.3: cannot open shared object file: No such file or directory[/code]

(Yes, ignore the computer name... it's a long story. :razz:)

I tried pulling the libcudart.so.3.1.9 file (which libcudart.so.3 links to) out of the /usr/local/cuda/lib directory, putting it in the MacLucasFFTW directory, and renaming it libcudart.so.3, yet I still get the error. Anybody have an idea what's going on here?
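A likely explanation: the Linux dynamic loader does not search the current directory by default, so a library copied next to the binary is simply ignored. A minimal sketch of the usual fix, assuming the default /usr/local/cuda install location mentioned in the post:

```shell
# The loader only searches the directories listed in LD_LIBRARY_PATH plus the
# system defaults from /etc/ld.so.conf -- never the current directory.
# Prepend the CUDA toolkit's lib directory before launching the binary:
export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH
# then run, e.g.:
#   ./MacLucasFFTW 24036583
```

Alternatively, adding /usr/local/cuda/lib to /etc/ld.so.conf and running ldconfig as root makes the fix permanent without the environment variable.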
Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.