#309 | |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23×271 Posts |
Quote:
(btw, it only uses about 65-80% of my gpu when I run 1 instance... that might be normal?)
OpenCL device info
  name                      BeaverCreek (Advanced Micro Devices, Inc.)
  device (driver) version   OpenCL 1.1 AMD-APP (851.4) (CAL 1.4.1646 (VM))
  maximum threads per block 256
  maximum threads per grid  16777216
  number of multiprocessors 5 (400 compute elements (estimate for ATI GPUs))
  clock rate                600MHz |
|
|
|
|
|
|
#310 | |
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
Quote:
[offtopic]Welcome to the GPU to 72 team! Except it seems you haven't actually gotten work from the tool. You can find more info in the GPU to 72 subforum somewhere around here. Happy crunching![/offtopic] |
|
|
|
|
|
|
#311 |
|
Oct 2011
Maryland
290₁₀ Posts |
It is probably the issue that Bdot mentioned above. You should be at around 90% or so with one instance in my opinion, since it is obvious that your CPU can sieve way faster than your GPU can process.
|
|
|
|
|
|
#312 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
23×271 Posts |
Ah, ok thanks :)
Oh, and also, I was wondering: is there any way to reduce its priority? I have to pause it every time I run a GPU-intensive program or game. Thanks :) (P.S.: Is there a way to automatically pull assignments? I just realized I'll have to manually get more once it finishes.)
|
|
|
|
|
|
#313 | ||
|
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
16065₈ Posts |
Quote:
Edit: As a holdover: Quote:
Last fiddled with by Dubslow on 2012-01-11 at 21:57 |
||
|
|
|
|
|
#314 | |
|
Nov 2010
Germany
1001010101₂ Posts |
Thanks for the device info, I'll put that on the wishlist ;-)
Quote:
While you can lower the priority as mentioned (it is not built in), it may not give you what you want. The reason is that the priority setting applies to the CPU part only. On the GPU there is no such thing as priorities; it's all round-robin on the same level. mfakto tries to keep 5 blocks (tasks) in the GPU queue, which can make the UI laggy if, for instance, window movements have to wait until these 5 tasks are processed.
You can try two settings in mfakto.ini:
GridSize defines how many factor candidates one block consists of. Lowering this value should already increase responsiveness a lot, at the expense of a little more CPU overhead.
NumStreams is the number of blocks being scheduled. Lowering it to 3 or 2 causes other tasks to be served quicker, but mfakto will have a smaller buffer to cover fluctuations in available CPU power.
BTW, the relatively low GPU utilization can also occur if the CPU cores are rather busy. Sometimes the auto-adjusting of the SievePrimes value gets confused if there is no CPU available to serve the GPU queue: the time it took to get the required CPU power is then wrongly interpreted as CPU idle time spent waiting for the GPU to finish. Try setting SievePrimesAdjust=0 and SievePrimes=100000 (which value works well still has to be tested). Alternatively, set up two copies of mfakto to run in parallel. Then they can cover each other's gaps in GPU utilisation. |
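For anyone who wants to script these tweaks, here is a minimal Python sketch. It assumes mfakto.ini consists of plain KEY=VALUE lines; the helper name `apply_settings` and the sample file contents are made up for illustration. Only the setting names and the values NumStreams=2, SievePrimesAdjust=0 and SievePrimes=100000 come from the post above. GridSize is left out because no concrete value was suggested.

```python
def apply_settings(ini_text, overrides):
    """Return ini_text with each KEY=VALUE line named in overrides replaced.

    Settings that are not present in the text are appended at the end.
    Assumes a flat KEY=VALUE file without section headers (an assumption
    about mfakto.ini, not something confirmed in this thread).
    """
    remaining = dict(overrides)
    out = []
    for line in ini_text.splitlines():
        stripped = line.strip()
        # Lines without '=' (blank lines, comments) are passed through untouched.
        key = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any requested setting that the file did not already contain.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Hypothetical starting contents; a real mfakto.ini will look different.
sample = "NumStreams=5\nSievePrimesAdjust=1\nSievePrimes=25000\n"

tweaked = apply_settings(sample, {
    "NumStreams": 2,          # fewer queued blocks, so other GPU tasks are served sooner
    "SievePrimesAdjust": 0,   # switch off auto-adjusting of SievePrimes
    "SievePrimes": 100000,    # starting point from the post, still to be tuned
})
print(tweaked)
```

To use it on a real file, read the file into a string, pass it through `apply_settings`, and write the result back while mfakto is not running.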
|
|
|
|
|
|
#315 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23×149 Posts |
Can I ask mfakto users to help me out with some benchmark data? I want to update my mfaktc table to include AMD GPUs as well, but I need some data to base it on. Please send me benchmarks on a wide variety of GPUs (very old to very new, and very slow to very fast) so that I can get as accurate a picture as possible of how GFLOPS scales into GHz-days/day performance across the various products. For each GPU, I need the following 4 bits of data from a single running instance (even if you normally run multiple instances, please just run one for this test):
Please PM or email me the results as opposed to posting in this thread. I'll post back when I have enough data to make a reasonable chart. |
|
|
|
|
|
#316 | |
|
Oct 2011
Maryland
2·5·29 Posts |
Quote:
Put another way: I can always lower sieve primes and destroy my wall clock performance by increasing the number of candidates, but the reason performance is bad would be missed by your metrics, since the GPU usage would be the same. I will try to get you some values tomorrow. I have 6950's modded with shaders unlocked, and a 6570 I can hopefully get you tomorrow! |
|
|
|
|
|
|
#317 |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23·149 Posts |
It's nice to have so I can see whether to expect any given benchmark to be on the high or low side, but I don't need it per se for the calculations. There's enough (too much!) variance in the data (based on what I've seen of mfaktc data) that it doesn't really make much difference overall; my chart will just provide rough guidelines, +/-10% at best.
|
|
|
|
|
|
#318 | |
|
Oct 2011
Maryland
2×5×29 Posts |
Quote:
|
|
|
|
|
|
|
#319 | |
|
"James Heinrich"
May 2004
ex-Northern Ontario
23×149 Posts |
Quote:
You'll notice that the Radeon+mfakto combination is considerably less efficient at turning theoretical GFLOPS into GHz-days/day TF results than GeForce+mfaktc. Right now I'm using a divider of 18 for mfakto (for mfaktc, I'm using 14 for older v1.x GPUs, 5 for v2.0 and 7.5 for v2.1). So that's why you see a Radeon 6990 and a GeForce GTX 570 both expecting ~282 GHz-days/day, even though the 6990 has 5100 GFLOPS to the 570's 1400. More benchmark data is still welcome, especially from older/slower GPUs. |
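The divider arithmetic can be checked with a short Python sketch. The dividers and GFLOPS figures are the rounded ones quoted in the post; the function name and the category labels are made up for illustration, and the sketch assumes the GTX 570 falls under the v2.0 divider of 5.

```python
# Dividers quoted in the post: theoretical GFLOPS / divider ~ GHz-days/day.
DIVIDERS = {
    "mfakto": 18.0,        # Radeon GPUs running mfakto
    "mfaktc v1.x": 14.0,   # older GeForce GPUs
    "mfaktc v2.0": 5.0,
    "mfaktc v2.1": 7.5,
}


def estimate_ghz_days_per_day(gflops, category):
    """Rough throughput estimate: theoretical GFLOPS divided by the divider."""
    return gflops / DIVIDERS[category]


# Rounded GFLOPS figures from the post.
radeon_6990 = estimate_ghz_days_per_day(5100, "mfakto")
# Assumption: the GTX 570 is counted in the v2.0 group.
gtx_570 = estimate_ghz_days_per_day(1400, "mfaktc v2.0")

print(f"Radeon 6990: {radeon_6990:.0f} GHz-days/day")
print(f"GTX 570:     {gtx_570:.0f} GHz-days/day")
```

With these rounded inputs both estimates land within a few GHz-days/day of the ~282 figure in the chart, even though the raw GFLOPS differ by a factor of more than three.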
|
|
|
|