mersenneforum.org  

Old 2011-03-18, 09:13   #221
zyzhan
 
Feb 2011

38 Posts

@ltd:
Try this VS wizard to create a CUDA project:
http://sourceforge.net/projects/cudavswizard/

Last fiddled with by zyzhan on 2011-03-18 at 09:14
Old 2011-03-18, 10:36   #222
ltd
 
Apr 2003

2^2×193 Posts

@zyzhan: Thanks for the link. I will give it a try.

@pschoefer: On my machine (i7 920, W7 64-bit, GTX260, driver V267.24) I observed the following. With seven threads doing other DC projects (4 threads on different BOINC projects and 3 threads of normal CPU-bound LLR), each of these threads shows 13% load in the task manager. After the starting phase, llrcuda drops to around 2% and GPU-Z shows that the GPU runs at 94-96%. I will run some tests to see what happens if I add another CPU-bound LLR thread.

What priority are the other tasks/threads running at?
To see if it makes a difference, you could try increasing the priority of llrcuda from within Task Manager to "Above Normal".
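
As an alternative to changing the priority in Task Manager after the fact, the process could be started at the higher priority directly. A minimal Python sketch, assuming Python 3.7+ on Windows (where `subprocess.ABOVE_NORMAL_PRIORITY_CLASS` exists); the helper names are hypothetical, and on other platforms the code falls back to default priority:

```python
import subprocess
import sys


def above_normal_flags():
    """Return Popen creation flags for "Above Normal" priority.

    subprocess.ABOVE_NORMAL_PRIORITY_CLASS only exists on Windows
    (Python 3.7+); elsewhere fall back to 0, i.e. default priority.
    """
    return getattr(subprocess, "ABOVE_NORMAL_PRIORITY_CLASS", 0)


def run_with_raised_priority(cmd):
    """Start cmd (a list of arguments), wait for it, return its exit code."""
    proc = subprocess.Popen(cmd, creationflags=above_normal_flags())
    return proc.wait()


# Hypothetical usage, with the command line posted in this thread:
# run_with_raised_priority(["llrcuda.exe", "-d", "-q46157*2^698207+1"])
```

Whether this helps at all depends on what the other threads are doing, so treat it as something to try rather than a fix.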

Last fiddled with by ltd on 2011-03-18 at 10:38
Old 2011-03-18, 14:06   #223
S34960zz
 
Feb 2011

2^2×13 Posts

Quote:
Originally Posted by ltd
@pschoefer: On my machine (i7 920, W7 64-bit, GTX260, driver V267.24) I observed the following. With seven threads doing other DC projects (4 threads on different BOINC projects and 3 threads of normal CPU-bound LLR), each of these threads shows 13% load in the task manager. After the starting phase, llrcuda drops to around 2% and GPU-Z shows that the GPU runs at 94-96%. I will run some tests to see what happens if I add another CPU-bound LLR thread.

What priority are the other tasks/threads running at?
To see if it makes a difference, you could try increasing the priority of llrcuda from within Task Manager to "Above Normal".

With 8 threads on a 4-core machine, you may be seeing cache contention (the two threads on each core share L1/L2 cache). Your overall throughput may improve if you reduce the number of active threads, especially if you can set each thread's affinity to a particular core. (I looked at this on my i7-840QM using Prime95 v26.5; it turns out 4 workers with 1 thread each gave the highest throughput, followed by 1 worker with 4 threads. My little parametric study: http://www.mersenneforum.org/showpos...0&postcount=83)

I'm not sure whether the applications you are running are limited more by memory/cache throughput or by the CPU, but you may see a similar trend.
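
To make the affinity idea concrete, here is a small Python sketch that picks one logical CPU per physical core and prints the matching affinity masks. It assumes hyper-threaded siblings are numbered consecutively (0 and 1 on core 0, 2 and 3 on core 1, and so on), which should be checked on the actual machine; the function names are hypothetical.

```python
def one_cpu_per_core(logical_cpus, threads_per_core=2):
    """Pick the first logical CPU of every physical core.

    ASSUMPTION: sibling hardware threads are numbered consecutively
    (0 and 1 share core 0, 2 and 3 share core 1, ...).
    """
    return list(range(0, logical_cpus, threads_per_core))


def affinity_mask(cpu):
    """Bit mask with only the given logical CPU set."""
    return 1 << cpu


cpus = one_cpu_per_core(8)  # e.g. an i7 920: 4 cores, 8 logical CPUs
masks = [hex(affinity_mask(c)) for c in cpus]
print(cpus)   # [0, 2, 4, 6]
print(masks)  # ['0x1', '0x4', '0x10', '0x40']
```

Masks in this hex form are what the Windows `start /affinity` option takes; running four workers pinned this way leaves each core's cache to a single worker.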
Old 2011-03-18, 14:10   #224
S34960zz
 
Feb 2011

2^2×13 Posts

Quote:
Originally Posted by pschoefer
CPU load stayed that high.

After update to driver 267.24, CPU load went down to 1/8 of one core. Unfortunately, GPU load is only ~40%, if all CPU cores are under load. With one core idle, GPU load is at 98%.

Where are these latest drivers available (link please)?
Old 2011-03-18, 14:41   #225
pschoefer
 
Jan 2007
.de

2×3^2 Posts

Quote:
Originally Posted by ltd
What priority are the other tasks/threads running on?
To see if it makes a difference you could try to increase the priority of llrcuda from within the taskmanager to "aboveNormal".

The CPU was running BOINC (PG-PPS LLR) at the lowest priority according to Task Manager. Increasing the priority of llrcuda didn't help.

Quote:
Originally Posted by S34960zz
Where are these latest drivers available (link please) ?

http://www.nvidia.com/object/win7-wi...ta-driver.html. It's still beta.
Old 2011-03-18, 15:55   #226
ltd
 
Apr 2003

2^2·193 Posts

@zyzhan: Thanks again for the information. My build now runs as well. As I suspected, the problem was a wrong CUDA build configuration.
Old 2011-03-18, 23:17   #227
Mathew
 
Nov 2009

2·5^2·7 Posts

Using zyzhan's exe, I tested the first and last primes in this thread and got the following:

C:\Users\mathew\Desktop\llrcuda.0.60.win64>llrcuda.exe -d -q"46157*2^698207+1"
Starting Proth prime test of 46157*2^698207+1
Using complex irrational base DWT, FFT length = 131072, a = 3

46157*2^698207+1 is prime! Time : 686.588 sec.. Time per bit: 0.888 ms.

This is the same as ltd's result.

C:\Users\mathew\Desktop\llrcuda.0.60.win64>llrcuda.exe -d -q"5*2^23473+1"
too small Exponent...

This is not the same as msft's result.

Edit: but it is the same as in ltd's post.

What is the minimum exponent size that can be tested?

Thanks everyone
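
For readers unfamiliar with the "Proth prime test" named in the log above, here is a minimal Python sketch of Proth's theorem using plain big-integer arithmetic. It is not how llrcuda works internally (the log shows it uses an FFT-based DWT on the GPU); the function name and the list of trial bases are assumptions made for illustration.

```python
def proth_test(k, n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23)):
    """Try to prove N = k*2^n + 1 prime with Proth's theorem.

    If k is odd, k < 2^n, and some base a satisfies
    a^((N-1)/2) == -1 (mod N), then N is prime.
    Returns the witness base, or None if none of the tried bases
    works (N is then composite or simply unproven).
    """
    if k % 2 == 0 or k >= 2 ** n:
        raise ValueError("need odd k with k < 2^n")
    N = k * 2 ** n + 1
    for a in bases:
        if pow(a, (N - 1) // 2, N) == N - 1:
            return a
    return None


print(proth_test(3, 2))  # 13 = 3*2^2+1 is prime, prints 2
print(proth_test(5, 7))  # 641 = 5*2^7+1 is prime, prints 3
print(proth_test(5, 5))  # 161 = 7*23 is composite, prints None
```

The "a = 3" in the llrcuda output is presumably the base used for this kind of test, though that is an inference from the log rather than something stated in the thread.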

Last fiddled with by Mathew on 2011-03-18 at 23:57 Reason: Reread the thread
Old 2011-03-19, 08:43   #228
x3mEn
 
Feb 2011

2×3 Posts

Does llrcuda support GPU affinity?
It looks like not yet...
Old 2011-03-19, 10:04   #229
Brain
 
Dec 2009
Peine, Germany

331 Posts
GPU affinity

Quote:
Originally Posted by x3mEn
Does llrcuda supports gpuaffinity?
Looks like not yet...

See here:

Quote:
Originally Posted by msft
Support affinity.

This means yes..!?
Old 2011-03-19, 10:41   #230
x3mEn
 
Feb 2011

2×3 Posts

Quote:
Originally Posted by Brain
See here:

This means yes..!?

Hm... GeneferCUDA really does support GPU affinity, but llrcuda.0.60 doesn't... Any idea why?
Old 2011-03-19, 13:57   #231
nuggetprime
 
Mar 2007
Austria

2·151 Posts

This is a question for msft:
Is it possible to implement testing multiple candidates at the same time on one GPU? I think this would greatly improve throughput, just as on a quad-core CPU you get about 3x more throughput by testing 4 candidates on 4 cores (one per core) than by testing 1 candidate across all 4 cores.
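
The arithmetic behind the "about 3x" CPU figure can be sketched in a few lines of Python. The scaling model is an assumption made for illustration (independent tests scale perfectly, one multithreaded test does not), and nothing here says a GPU would behave the same way.

```python
def throughput_gain(cores, multithread_speedup):
    """Throughput of `cores` independent tests (one per core) relative
    to one test spread over all cores.

    ASSUMPTION: independent tests scale perfectly, while a single
    multithreaded test is only `multithread_speedup` times faster
    than a single-threaded one.
    """
    return cores / multithread_speedup


# A 4-thread test that is only ~1.33x faster than a 1-thread test
# corresponds to the "about 3x" figure quoted above.
print(round(throughput_gain(4, 4 / 3), 2))  # 3.0
```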
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.