TPSieve-CUDA v0.2.1 is out, in the usual download location. (See the first post in the thread.) It should be a little faster than 0.2.0 on the GPU, and a [b]lot[/b] lighter on the CPU, particularly on Windows, and even more so on Linux 32-bit. Hopefully this will allow a return to the higher speeds of the past on this sieve.
P.S. No, the meaning of -m didn't change. |
I've just re-thought the CPU multiplication, and I think I'm going to have to pull this build. I'll try again tomorrow. :sad:
|
Alright, v0.2.1a is uploaded. It may use 15-30% more CPU, but at least it won't print incorrect results. :smile: (I hope. :blush:)
|
And now v0.2.2 is out. It ought to be faster than v0.2.1a, if only because it's more efficient with the CPU. But it's also more efficient with GPUs, particularly older ones. :smile:
|
And I've released v0.2.2b, which fixes some bugs that could cause factors to be missed. This is why I asked for testing!
|
And one more bugfix, 0.2.2c, to restore about 1/10000 missing factors.
|
Can anyone tell me what the optimal settings would be for a GTX 465?
I keep getting 202.6M p/sec, 0.69 CPU cores, but my GPU is barely used at 15%. How can I get it utilized closer to 60%-70% or higher? Also, is there no 64-bit CUDA build for Windows? Thanks. |
[QUOTE=cipher;238492]Can anyone tell me what the optimal settings would be for a GTX 465?
I keep getting 202.6M p/sec, 0.69 CPU cores, but my GPU is barely used at 15%. How can I get it utilized closer to 60%-70% or higher?[/QUOTE]
Let's see. Start with: -m 16384 -Q 10e6

Then try doubling -m until it either slows down or crashes!

Then try running two processes at once. On a GTX 465 and higher, that should improve performance. You might have to lower -m to do this, though.

And one more thing that's probably [b]better than all these other suggestions combined[/b]: Try reserving the same range across a larger N range. The main reason your GPU isn't used more is that it finishes its N range, for all given P's, before the next set of P's can be computed! Reserving the entire 480000-500000 range, and using two processes without any new flags, should in theory use 100% of your GPU!

[QUOTE=cipher;238492]Also, is there no 64-bit CUDA build for Windows?[/QUOTE]
No. There's no free 64-bit compiler from Microsoft. And on most other projects it doesn't matter. |
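For anyone who wants to automate the "double -m until it slows down or crashes" step above, here is a rough Python sketch. Only the -m and -Q flags and the starting values come from the post. The names `sieve_flags` and `tune_m`, the `benchmark` callback, and the `max_m` cap are all hypothetical: you would have to write `benchmark` yourself so that it launches the sieve with the given -m, reads the reported p/sec figure, and returns None if the run crashes.

```python
from typing import Callable, List, Optional


def sieve_flags(m: int, q: str = "10e6") -> List[str]:
    """Build the tuning flags from the post: -m <m> -Q <q>.

    The executable name and the range flags are deliberately left out;
    prepend/append whatever your own command line already uses.
    """
    return ["-m", str(m), "-Q", q]


def tune_m(
    benchmark: Callable[[int], Optional[float]],
    start: int = 16384,
    max_m: int = 1 << 24,  # hypothetical safety cap, not from the post
) -> Optional[int]:
    """Double m until throughput stops improving or the run fails.

    `benchmark(m)` is a user-supplied (hypothetical) function returning
    the measured p/sec for that m, or None if the run crashed.
    Returns the best m seen, or None if even the first run failed.
    """
    best_m: Optional[int] = None
    best_rate = 0.0
    m = start
    while m <= max_m:
        rate = benchmark(m)
        if rate is None or rate <= best_rate:
            break  # crashed or got slower: keep the previous best
        best_m, best_rate = m, rate
        m *= 2
    return best_m


if __name__ == "__main__":
    # Made-up numbers purely to exercise the loop; not real measurements.
    fake_rates = {16384: 100.0, 32768: 180.0, 65536: 210.0, 131072: 190.0}
    best = tune_m(fake_rates.get)
    print("best -m:", best, "flags:", sieve_flags(best))
```

This only covers the -m search for a single process. When running two processes at once, the search would presumably need repeating, since the post notes that -m may have to be lowered in that case.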
I have a twin sieve file created with NewPGen. I'd like to continue sieving on my GPU, but it says "invalid header in input file". How can I convert one format to the other?
Thanks Peter |