#1200
(loop (#_fork))
Feb 2006
Cambridge, England
1100100010111₂ Posts
Quote:
Could you put up the result from 'grep apicid /proc/cpuinfo'?
#1201
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts
Hmm... Since I booted back to Windows to get back to the really fast rates, suddenly Windows only gets 75-80M/s... I didn't change anything!!!! What the hell?!?!? :(
Also, in Windows, it increases SievePrimes massively until the average wait is down to a decent number; that doesn't happen in Linux with the 75-80M/s rate. So confused x(

EDIT: Let me clarify. Currently, mfaktc-lin is running at ~90M/s with SievePrimes=5000 (adjust=1) and avg. wait <100 us. Most recently in Windows, mfaktc was running at a 90M/s rate, 5000 SP, and 3000+ us avg. wait at start, which would shift to 75M/s, 15000+ SP, and 300-400 us avg. wait after enough classes completed. I don't know why/how this behavior changed from the 170M/s+; as far as I know I didn't change anything. Imma boot to Linux now and run that command.

On the other hand, I remember testing it, and it seemed pretty clear to me that it was (12)(34)(56)(78), even though MPrime detects (15)(26)(37)(48). Huh. This computer hates me.

EDIT:
Code:
apicid		: 0
initial apicid	: 0
apicid		: 2
initial apicid	: 2
apicid		: 4
initial apicid	: 4
apicid		: 6
initial apicid	: 6
apicid		: 1
initial apicid	: 1
apicid		: 3
initial apicid	: 3
apicid		: 5
initial apicid	: 5
apicid		: 7
initial apicid	: 7
Also also, each OS seems incapable of reading the checkpoint files of the other. Should I be converting those too?

Last fiddled with by Dubslow on 2011-09-16 at 05:45
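As a side note, the apicid list above is enough to settle the sibling question: on Intel CPUs with two threads per core, the two hyperthreads of one physical core differ only in the lowest APIC-ID bit. A minimal sketch (hard-coding the cpu-to-apicid mapping exactly as pasted, which assumes the kernel listed cpus 0-7 in order):

```python
# Sketch: derive hyperthread sibling pairs from the apicid dump above.
# Logical CPUs whose (apicid >> 1) values match share a physical core,
# since siblings differ only in the lowest APIC-ID bit.
apicids = {0: 0, 1: 2, 2: 4, 3: 6, 4: 1, 5: 3, 6: 5, 7: 7}  # cpu -> apicid

cores = {}
for cpu, apic in apicids.items():
    cores.setdefault(apic >> 1, []).append(cpu)  # group by physical core

print(sorted(cores.values()))  # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

In 1-based numbering that's (15)(26)(37)(48), i.e. exactly the pairing MPrime detects, not (12)(34)(56)(78).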
#1202
"Oliver"
Mar 2005
Germany
11×101 Posts
Hi,
Quote:
I might change this in the future, but not now.

Oliver
#1203
(loop (#_fork))
Feb 2006
Cambridge, England
1917₁₆ Posts
Quote:
#1204
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
1110000110101₂ Posts
Urrgggg....
Now Windows is back to 165-170M/s again and I still haven't changed anything... In fact, I haven't even shut it down in any way since the last time it was at 75M/s...

Thanks fivemack, I'll look into modifying my MPrime settings.
#1205
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts
RRRRRRRAAAAAAAAAGGGGGGGGGEEEEEEEEEE
mfaktc was going at 165M/s. I closed and immediately reopened it, and then it went down to 70M/s again. WHAT CHANGED?! asdfnva'jub;gvsopofIZJX IOHNS io'j;JUB;;kbguxhdvc
#1206
"Oliver"
Mar 2005
Germany
1111₁₀ Posts
Dubslow: just to be sure,

Did you check whether your GPU runs in its highest performance mode? On Windows you can use GPU-Z to check the actual GPU speed (tab: Sensors). On Linux you can start nvidia-settings to view the current performance mode.

Oliver
#1207
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts
Precompiled, 1.7. I couldn't compile anything in Windows for the life of me. I think I had 4 streams, SP=5000.

Note: normally I keep SPAdjust=1, but when it's at 90M/s the avg. wait skyrockets, so the SP goes up, which winds up reducing the avg. rate to 75M/s, so I decided to keep SP stuck at 5000 until I can get back up to the previous level, i.e. 165-170M/s.
Code:
SievePrimes=5000

# Set this to 1 to enable automatically adjustments of SievePrimes during
# runtime based on the "average wait times".
SievePrimesAdjust=0

# Set the number of CUDA streams / data sets used by mfaktc.
# NumStreams must be >= 1. In this case mfaktc can process one stream /
# data set on the GPU while the GPU can preprocess the other one. When
# NumStreams is >= 2 than the time needed to upload (CPU->GPU transfer)
# the data sets can be hidden (if the hardware supports it (most GPUs are
# supporting this)).
# On Linux systems 2 or 3 seems a good numbers. There are comments that
# Windows systems need a greater number of streams.
# A greater number increases the memory consumed by mfaktc (host and GPU
# memory). The current limit for the number of streams is 10!
NumStreams=4

# Set the number of data sets which can be preprocessed on CPU. This allows
# to tolerate more jitter on runtime of preprocessing and GPU stream
# runtime.
CPUStreams=4

# The GridSize affects the number of threads per grid.
# Depending on the number of multiprocessors of your GPU, too, the
# automatic parameter threads per grid is set to:
#   GridSize = 0:  65536 < threads per grid <=  131072
#   GridSize = 1: 131072 < threads per grid <=  262144
#   GridSize = 2: 262144 < threads per grid <=  524288
#   GridSize = 3: 524288 < threads per grid <= 1048576 (default)
# A smaller GridSize has more overhead than a bigger GridSize for long
# running jobs. For really small jobs there can be a small benefit on
# computation time if the GridSize is small. A smaller GridSize directly
# reduces the runtime per kernel launch and might result in a better
# interactivity one older GPUs.
GridSize=3

# WorkFile: the name of the file which contains the factoring assignments.
# e.g.
# worktodo.ini (Prime95 v24 and earlier)
# worktodo.txt (Prime95 v25 and newer)
WorkFile=worktodo.txt

# Checkpoints = 0: disable checkpoints
# Checkpoints = 1: enable checkpoints
# Checkpoints are needed for resume capability, after a class is finished a
# checkpoint file is written. When mfaktc is interrupted during the run and
# restarted later it will begin at the last processed class.
Checkpoints=1

# Allow to split an assignment into multiple bit ranges.
# 0 = disabled
# 1 = enabled
# Enabled Stages make only sense when StopAfterFactor is 1 or 2.
Stages=0

# possible values for StopAfterFactor:
# 0: Do not stop the current assignment after a factor was found.
# 1: When a factor was found for the current assignment stop after the
#    current bitlevel. This makes only sense when Stages is enabled.
# 2: When a factor was found for the current assignment stop after the
#    current class.
StopAfterFactor=2

# possible values for PrintMode:
# 0: print a new line for each finished class
# 1: overwrite the current line (more compact output)
PrintMode=1

# allow the CPU to sleep if nothing can be preprocessed?
# 0: Do not sleep if the CPU must wait for the GPU
# 1: The CPU can sleep for a short time if it has to wait for the GPU
AllowSleep=0
I do know it's not related to GPU clock rate. (I use MSI Afterburner in Windows.) When it's at 165M/s, changing the clock rate directly affects the avg. rate. I just tested the 90M/s half rate now: I verified it was somewhat OC'ed, killed mfaktc, reset the clock to stock, restarted mfaktc, and got exactly the same avg. rate (90M/s); then I upped the clock back to the slight OC (with mfaktc running this time) and the avg. rate didn't budge. (I have previously tested that the avg. rate changes with clock rate changes while mfaktc is running.) This seems to indicate to me that it's something funky with the program. GPU load is 86-87%, same as it was when running at 165M/s (I got that reading sometime last week).
EDIT: Oh my god Windows/Linux, I swear there are no extra lines in mfaktc.cfg or whatever it's called. lol, that's what you guys said :P :)

Last fiddled with by Dubslow on 2011-09-17 at 09:13
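The feedback loop being described can be caricatured in a few lines. This is purely a toy model, not mfaktc's real code, and the wait threshold is a made-up number; it only illustrates why pinning SievePrimes (SievePrimesAdjust=0) avoids the auto-adjuster dragging the average rate down once the average wait gets high:

```python
# Toy model (NOT mfaktc's actual logic) of the SievePrimesAdjust behaviour:
# a high average wait makes the adjuster raise SievePrimes, which is what
# drops the 90M/s start rate to 75M/s; pinning skips that entirely.
def adjust(sieve_primes, avg_wait_us, pinned):
    """Return the next SievePrimes value."""
    if pinned:
        return sieve_primes            # SievePrimesAdjust=0: never move
    if avg_wait_us > 750:              # hypothetical threshold
        return min(sieve_primes * 3, 200000)
    return sieve_primes

print(adjust(5000, 3000, pinned=True))   # 5000 -- the SP=5000 workaround
print(adjust(5000, 3000, pinned=False))  # 15000 -- the "15000+ SP" drift
```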
#1208
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3·29·83 Posts
Hmm, just rebooted, now it's back to 170M/s, though as I said a few posts back, it's not just the rebooting...
How can I restart the graphics driver without rebooting?

Also, GPU-Z now shows 75-76% load, not the 85% I said earlier.
#1209
Oct 2010
191₁₀ Posts
Quote:
#1210
Mar 2010
3×137 Posts
Similar Threads
| Thread | Thread Starter | Forum | Replies | Last Post |
| mfakto: an OpenCL program for Mersenne prefactoring | Bdot | GPU Computing | 1676 | 2021-06-30 21:23 |
| The P-1 factoring CUDA program | firejuggler | GPU Computing | 753 | 2020-12-12 18:07 |
| gr-mfaktc: a CUDA program for generalized repunits prefactoring | MrRepunit | GPU Computing | 32 | 2020-11-11 19:56 |
| mfaktc 0.21 - CUDA runtime wrong | keisentraut | Software | 2 | 2020-08-18 07:03 |
| World's second-dumbest CUDA program | fivemack | Programming | 112 | 2015-02-12 22:51 |