mersenneforum.org  

Old 2012-12-26, 05:51   #606
flashjh
 
 
"Jerry"
Nov 2011
Vancouver, WA

2143₈ Posts

Quote:
Originally Posted by Rodrigo
Another question, about using mfakto on the GPU along with Prime95 on the CPU:

Is there any way to tell these two programs to use specific cores of the i7 3770? The reason is that yesterday I was using two CPU cores to do LLs while a third core was supporting mfakto, and everything was running smoothly. The time/class for mfakto was at 2.xxx seconds and one exponent was taking about 45 minutes to finish.

Then, I discovered a manually reserved LL that I had forgotten about for 174 days, so I decided to add it to Prime95 in a third worker window. But now, the time/class for mfakto is over 4.xxx seconds. (And yet, according to GPU-Z, the GPU load is at 1%.) The per-iteration times for the original two Prime95 worker windows have gone up from 0.020 and 0.019 to 0.025 and 0.026, respectively. Evidently, mfakto and Prime95 are stepping on each other.

A further complication is that when I selected CPUs 2 and 4 for Prime95, according to Task Manager (Windows 7) there are 8 available threads (or whatever the right designation is for that) in the quad-core system, and it was the second and the fourth of these eight that were busy, so I am not sure if Prime95 was actually using the second and fourth cores, or merely the second halves of the first two cores. (Have I put my question clearly enough?) Now with the third worker operating and mfakto doing its thing, I've ended up with five of the eight threads running at or near 100%. I would have guessed three for the Prime95 workers (one each) and one for mfakto (3 + 1 = 4, not 5).

FWIW, that third "emergency" LL is set to "Smart Assignment" CPU selection. When I had it selected to CPU 3, the per-iteration times on the other two shot up to 0.035 and 0.040. I ended up doing Smart Assignment, with ThreadsPerTest set at 2.

Bottom line: I would like to learn how to tell Prime95 to use (say) the first three physical cores (only), and mfakto to use the last core (only). And this while using just one thread, not two, per Prime95 worker.

Suggestions are very welcome...

Rodrigo
Prime95 has settings for the CPU cores in the 'Test' menu, under the 'Worker Windows...' option. As for mfakto, use the /affinity option of the start command in a cmd window, like this:
Code:
start /affinity 0x# "mfakto2" mfakto-win-64 -d 0
Replace the # with the hex code for the core; in your case, use 8:
Code:
CPU3 CPU2 CPU1 CPU0   Bin    Hex
================================
OFF  OFF  OFF  ON   = 0001 = 1
OFF  OFF  ON   OFF  = 0010 = 2
OFF  OFF  ON   ON   = 0011 = 3
OFF  ON   OFF  OFF  = 0100 = 4
OFF  ON   OFF  ON   = 0101 = 5
OFF  ON   ON   OFF  = 0110 = 6
OFF  ON   ON   ON   = 0111 = 7
ON   OFF  OFF  OFF  = 1000 = 8
ON   OFF  OFF  ON   = 1001 = 9
ON   OFF  ON   OFF  = 1010 = A
ON   OFF  ON   ON   = 1011 = B
ON   ON   OFF  OFF  = 1100 = C
ON   ON   OFF  ON   = 1101 = D
ON   ON   ON   OFF  = 1110 = E
ON   ON   ON   ON   = 1111 = F
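
The table encodes one rule: bit n of the mask selects logical CPU n (counting from 0). As a rough sketch, the same rule can be written in a few lines of Python. The function name affinity_mask is made up for this illustration and is not part of mfakto or Windows. Note that a Hyper-Threaded quad-core shows eight logical CPUs, so its masks can need two hex digits.

```python
def affinity_mask(cpus):
    """Return the hex affinity mask (without 0x) for zero-based logical CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("logical CPU numbers are zero-based and non-negative")
        mask |= 1 << cpu  # bit n selects logical CPU n
    return format(mask, "X")


# Rows from the 4-CPU table
print(affinity_mask([3]))      # 8
print(affinity_mask([0, 1]))   # 3

# The same rule extended to 8 logical CPUs: the last two logical CPUs
print(affinity_mask([6, 7]))   # C0
```

Whether logical CPUs 6 and 7 really belong to the same physical core depends on how the OS numbers them, so check that before relying on a particular mask.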
Old 2012-12-26, 05:56   #607
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

When you're running something else that's CPU intensive besides Prime95, it's generally best to assign cores manually. You can set the affinity of Prime95 through its GUI, as you know, and the easiest way to set the mfakto affinity is through the Task Manager. Right click on the process, and affinity or something like it should appear as one of the options on the menu. You would have to remember to do this every time you start mfakto. (If you restart it a lot or use multiple instances, it can be easier to use a batch file -- ask others, e.g. kladner, for assistance there.)

On Windows, threads 1 and 2 form one physical core, 3 and 4 form another, etc. (On Linux, it's most likely that 1 and 5 are a pair, 2 and 6, etc.)
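
To make the two numbering layouts concrete, here is a small Python sketch that maps a thread number (counting from 1, as above) to its physical core under each layout. The function names are invented for this example, and a 4-core, 2-threads-per-core CPU is assumed. The real numbering on a given machine depends on the OS and firmware, so treat this as an illustration of the two patterns only.

```python
def core_adjacent(thread, threads_per_core=2):
    """Sibling threads are numbered consecutively: 1,2 -> core 1; 3,4 -> core 2."""
    return (thread - 1) // threads_per_core + 1


def core_split(thread, cores=4):
    """Sibling threads are 'cores' apart: 1,5 -> core 1; 2,6 -> core 2."""
    return (thread - 1) % cores + 1


# Print thread number, core under the adjacent layout, core under the split layout
for t in range(1, 9):
    print(t, core_adjacent(t), core_split(t))
```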

Edit: ninja'd. flash explained how to set affinities from the command line/batch file.

Last fiddled with by Dubslow on 2012-12-26 at 05:57
Old 2012-12-26, 18:37   #608
Rodrigo
 
 
Jun 2010
Pennsylvania

2×467 Posts

Before trying the affinity settings for mfakto, I'm looking to find the best settings for Prime95.

Unfortunately, it looks like Prime95 isn't discovering the threads and cores correctly. When I set the three workers to operate off what Prime95 calls "CPUs" 1, 2, and 3, this is what I got after restarting (with ThreadsPerTest set to 2):

Quote:
Worker #1
Setting affinity to run worker on logical CPU #1
Setting affinity to run helper thread 1 on logical CPU #2

Worker #2
Setting affinity to run worker on logical CPU #2
Setting affinity to run helper thread 1 on logical CPU #3

Worker #3
Setting affinity to run worker on logical CPU #3
Setting affinity to run helper thread 1 on logical CPU #4
So the upshot is that Worker #2 is overlapping with Worker #1, and Worker #3 is overlapping Worker #2. Two of these threads are doing double duty. Meanwhile, according to Task Manager, the fifth through eighth threads (two full cores' worth) are sitting idle. For whatever reason, Prime95 seems to view (for example) the second thread of the second core as the "fourth" CPU, and so the workers get assigned that way, incorrectly. It doesn't offer to do work on any "CPUs" 5 to 8.

I had tried doing this with ThreadsPerTest set to 1, but the LL per-iteration times were worse.

FWIW, this is Prime95 version 27.7, build 2.

Next I can try setting the Prime95 affinity in Task Manager as Dubslow suggested, but if Prime95 isn't finding the correct CPUs in the first place, I'm not sure what good it'll do. If I set its affinity to "CPUs" 0-5, I have no confidence that it will find and actually use 4 and 5, since they're not being used now.

This all might sound OT because we're in the mfakto thread, but again the idea is that I'm trying to get mfakto and Prime95 to each keep to their assigned cores...

Rodrigo

Last fiddled with by Rodrigo on 2012-12-26 at 18:41 Reason: additional info
Old 2012-12-26, 19:53   #609
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

You shouldn't assign one worker to CPU 1 and the next to CPU 2, since you are then assigning two workers to one physical core yourself (1 & 2 are one core, 3 & 4 are one core, etc.). Instead, assign worker 1 to CPU 1, worker 2 to CPU 3, worker 3 to CPU 5, and worker 4 to CPU 7. That way each worker is on a separate physical core. (When you then add mfakto, your timings will get worse, because mfakto will have to share one of those cores.) How many instances of mfakto do you run?
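
A quick way to see the overlap is to model the assignment. The Python sketch below assumes, based only on the log output quoted earlier, that a worker set to logical CPU n puts its helper thread on logical CPU n+1. It is not Prime95's actual code, and the function names are made up.

```python
from collections import Counter


def assign(worker_cpus, threads_per_test=2):
    """Map each worker to the logical CPUs it would use (worker CPU plus helpers).

    Assumption taken from the quoted log: helper threads go on the next CPU numbers.
    """
    return {
        worker: [cpu + i for i in range(threads_per_test)]
        for worker, cpu in enumerate(worker_cpus, start=1)
    }


def overlaps(assignment):
    """Return the logical CPUs that more than one thread is pinned to."""
    counts = Counter(cpu for cpus in assignment.values() for cpu in cpus)
    return sorted(cpu for cpu, n in counts.items() if n > 1)


print(overlaps(assign([1, 2, 3])))  # [2, 3]  (consecutive CPUs collide)
print(overlaps(assign([1, 3, 5])))  # []      (every other CPU, no collisions)
```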
Old 2012-12-26, 20:44   #610
Rodrigo
 
 
Jun 2010
Pennsylvania

2×467 Posts

Well, that's exactly the problem: in Prime95 there seems to be no way to assign worker 3 to CPU 5, since the choices (in Test --> Worker Windows) end at CPU 4.

Right now I'm not running any instances of mfakto, while this issue gets sorted out. Ideally, I'd like to run Prime95 on three cores (doesn't matter which three), and then use the last core to support mfakto.

BTW, since the previous post I set all three current workers to Smart Assignment, and the per-iteration times dropped precipitously to 0.019-0.022 seconds in all three cases. Now Task Manager shows that CPUs 0, 2, and 4 (that is to say, the first, third, and fifth threads) are taking on the bulk of the work, although the remaining five do have substantial loads as well (three of them as helper threads). But I have yet to find a way to set this setting manually, so that when mfakto gets thrown into the mix it doesn't interfere with Prime95.

Rodrigo
Old 2012-12-26, 20:51   #611
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by Rodrigo
Well, that's exactly the problem: in Prime95 there seems to be no way to assign worker 3 to CPU 5, since the choices (in Test --> Worker Windows) end at CPU 4.
That is incredibly bizarre.
Old 2012-12-26, 20:53   #612
firejuggler
 
 
Apr 2010
Over the rainbow

A2E₁₆ Posts

If I remember correctly, there is an option, something like coretouse or scrambleCPU, where you specify which CPU to use, right?
Old 2012-12-26, 20:56   #613
Rodrigo
 
 
Jun 2010
Pennsylvania

2·467 Posts
How to report a factor?

An unrelated question:

This morning I manually submitted a bundle of 12 TF results completed by mfakto. One of them included a factor found, and that one doesn't seem to be getting through to the PrimeNet server.

When I submitted the results in a bunch, the server seemed to hang and eventually I got dumped to a mostly blank PrimeNet page that only had the sign-in blanks on the left, and nothing at all in the right panel. The same thing happened when I cut-and-pasted the two-line result:

Quote:
M77xxxxxx has a factor: 19xxxxxxxxxxxxxxxxxxxx [TF:70:71*:mfakto 0.12-Win barrett15_75]
found 1 factor for M77xxxxxx from 2^70 to 2^71 (partially tested) [mfakto 0.12-Win barrett15_75_2]
Next I tried submitting just the first of these two lines, and still PrimeNet doesn't seem to be aware of the report as it's not showing up in my Results Details page. (The other 11 are already there.)

What to do?

Rodrigo
Old 2012-12-26, 21:00   #614
Dubslow
Basketry That Evening!
 
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

7221₁₀ Posts

Quote:
Originally Posted by Rodrigo
An unrelated question:

This morning I manually submitted a bundle of 12 TF results completed by mfakto. One of them included a factor found, and that one doesn't seem to be getting through to the PrimeNet server.

When I submitted the results in a bunch, the server seemed to hang and eventually I got dumped to a mostly blank PrimeNet page that only had the sign-in blanks on the left, and nothing at all in the right panel. The same thing happened when I cut-and-pasted the two-line result:


Next I tried submitting just the first of these two lines, and still PrimeNet doesn't seem to be aware of the report as it's not showing up in my Results Details page. (The other 11 are already there.)

What to do?

Rodrigo
PrimeNet's having an "episode" right now.
Old 2012-12-26, 21:02   #615
chalsall
If I May
 
 
"Chris Halsall"
Sep 2002
Barbados

10011000100111₂ Posts

Quote:
Originally Posted by Rodrigo
What to do?
This is a known bug with the PrimeNet server -- memory starvation affecting the tool used to verify the factor.

Wait for George or Scott to reboot the server, or keep trying. Preferably not at the top of the hour -- wait until at least 15 minutes past.
Old 2012-12-26, 21:09   #616
Rodrigo
 
 
Jun 2010
Pennsylvania

2·467 Posts

Quote:
Originally Posted by firejuggler
If I remember correctly, there is an option, something like coretouse or scrambleCPU, where you specify which CPU to use, right?
Ah, your clue led me to this in UNDOC.TXT:

Quote:
The program makes its best guess at how the OS maps hyperthreaded logical CPU
numbers to physical CPUs. It also assigns workers and helper threads
to CPUs for optimal speed. However, bugs, new architectures, or situations we
haven't considered may make different affinity settings desirable. In
local.txt set
AffinityScramble2=string
Where the characters in "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"
represent 64 logical CPU numbers. For example, let's say you have a system
with 8 logical cores with 4 workers each using a helper thread. Also, assume
your system has logical CPUs 0 & 4 on the same physical CPU core, 1 & 5, etc.
If the program is properly determining which logical CPUs share the same physical
CPU, then the program internally generates an affinity scramble string of "04152637".
The program's default policy is to assign the worker and helper threads to the same
physical CPU. If the program is not properly determining which logical CPUs share the
same physical CPU, or you think a different affinity policy would result in better
performance, then set AffinityScramble2 accordingly. Let's say you think
running the helper threads on a different physical core would be better, then
you might set AffinityScramble2=02134657 to test out your theory.
So if Dubslow is right, then I should add an AffinityScramble2 line to LOCAL.TXT (does it matter where?), and make it 01234567. What do you think?
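
To sanity-check a candidate string before putting it in the file, one can decode it by hand or with a short Python sketch like the one below. It rests on one reading of the UNDOC.TXT passage: each character stands for a logical CPU number via the 64-character alphabet, and consecutive characters are handed out in order to a worker and then to its helper thread. That grouping is inferred from the "04152637" example and is an assumption, as is the function name.

```python
# The 64-character alphabet quoted from UNDOC.TXT
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz()"


def decode_scramble(scramble, threads_per_worker=2):
    """Turn a scramble string into per-worker lists of logical CPU numbers.

    Assumption: characters are consumed in order, threads_per_worker at a time.
    ALPHABET.index raises ValueError for a character outside the alphabet.
    """
    cpus = [ALPHABET.index(ch) for ch in scramble]
    return [
        cpus[i:i + threads_per_worker]
        for i in range(0, len(cpus), threads_per_worker)
    ]


# Example string from UNDOC.TXT (siblings are 0 & 4, 1 & 5, ...)
print(decode_scramble("04152637"))  # [[0, 4], [1, 5], [2, 6], [3, 7]]

# The string under consideration here (siblings assumed to be 0 & 1, 2 & 3, ...)
print(decode_scramble("01234567"))  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

If the decoded pairs match the logical CPUs that really share a physical core on the machine, each worker and its helper end up on one core, which is the default policy the quoted text describes.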

Rodrigo
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.