mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Software (https://www.mersenneforum.org/forumdisplay.php?f=10)
-   -   Worker Threads - use all on same test (https://www.mersenneforum.org/showthread.php?t=20446)

Uncwilly 2015-08-29 02:22

[QUOTE=Madpoo;409089]Weird. I wonder if that could be from Photoshop setting a low/idle priority on some of its threads. As if Adobe didn't consider that there could be something else running at idle priority and using lots of CPU.
...
If the Photoshop stuff is also running with priorities of 1, then "there's your problem". :smile:[/QUOTE]
I am pretty sure that is what is happening. I think it is a habit of Adobe's. I had an update for Acrobat Reader taking forever to download across a network (using the upgrade app); I turned off P95 and it took off.

kladner 2015-08-29 06:18

[QUOTE=retina;409090]Another test is to turn off HT and see if the problem persists.[/QUOTE]

Ain't no HT on AMD, where I experience it, but I think the problem was/is agnostic. There are long-ago threads about it, but it would take some time to search them out.

LaurV 2015-08-29 07:46

Back on topic

@OP: You have nothing to worry about. Your benchmark shows good times, suitable for your CPU and RAM, and it confirms what people here told you. Take for example the 4096K FFT (a power of 2), which is suitable for exponents of 70-75M. You have to do 75M iterations for such an exponent, so using one thread on one physical core you will need 20.3ms*75M/1000/3600/24=~17-18 days. For the examples that follow I will round that up to 20 days per test, because then your milliseconds per iteration read directly as days per test and I need no further calculation.
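The conversion from milliseconds per iteration to days per test can be sketched as below. This is only an illustration of the arithmetic in the post, assuming one iteration per unit of exponent; the function name `days_for_test` is made up for the example, and 20.307 ms is the one-thread average from the quoted benchmark.

```python
SECONDS_PER_DAY = 24 * 3600

def days_for_test(ms_per_iter: float, exponent: int) -> float:
    """Days needed for `exponent` iterations at `ms_per_iter` milliseconds each."""
    return ms_per_iter / 1000 * exponent / SECONDS_PER_DAY

# 1-thread average from the benchmark, exponent of about 75M (the post's example).
print(round(days_for_test(20.307, 75_000_000), 1))  # about 17.6 days
```

Rounding 17.6 up to 20 days simply makes the later comparisons easier to follow.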

If you use all 4 cores to test 4 exponents at once, one per core, you still finish all 4 in 20 days. If you add the HT logical core to a worker, the same test takes 21 days. If you put two physical cores on each test, one test finishes in 11 days and you can run two at a time, so the 4 tests take 22 days. And so on: with all 4 cores on one test you finish it in 7 days, but 4 consecutive tests then take 28 days (8 of those 28 days wasted, more than 25%!!). As long as your memory is good (and it seems to be, see the second part of the test), you should go for "4 workers, each using a single physical core, no HT, each worker testing its own exponent" (i.e. 4 assignments at the same time).
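The comparison above can be written out as a small calculation. This is a rough sketch using the post's rounded per-test figures (20, 11 and 7 days); it assumes per-test time does not degrade when several workers run side by side, which the benchmark's multi-worker timings show is optimistic. The helper name `days_for_batch` is hypothetical.

```python
import math

def days_for_batch(days_per_test: float, concurrent_tests: int, total_tests: int = 4) -> float:
    """Days to finish `total_tests` when `concurrent_tests` run side by side."""
    rounds = math.ceil(total_tests / concurrent_tests)
    return rounds * days_per_test

# (days per test, tests running at once), rounded figures from the post.
configs = {
    "4 workers x 1 core": (20, 4),
    "2 workers x 2 cores": (11, 2),
    "1 worker x 4 cores": (7, 1),
}
for name, (days, concurrent) in configs.items():
    print(f"{name}: {days_for_batch(days, concurrent)} days for 4 tests")
```

Under these assumptions the one-core-per-worker layout finishes the batch first, which is the point the post is making.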

You can see some anomalies for the very small FFTs, as those fit entirely in the cache and need little data exchange, but generally, adding cores to the same task (either physical cores or hyperthreads) makes things worse. The exception is when your memory cannot supply data fast enough for so many workers (you can see this in your benchmark for the larger FFTs); then it is better to reduce the number of workers and use more cores per worker.

But you are good enough with 4 workers, one core for each. Let it run. You lose about 12.5% of the time if you use 8 workers (i.e. HT).

For this particular FFT you still get a small memory penalty, as there is not much gain from going from 3 to 4 workers. This means that if your computer gets slow for some of the tasks in your daily work, you can "safely" reduce P95 to 3 workers and leave a core free for yourself without affecting the LL work too much. This effect is felt less for smaller FFTs (if you do DCs) and gets more bothersome for larger FFTs. This is normal. Nobody would dislike having faster memory, or more channels... :razz:
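To put a number on "not much gain", here is a small sketch that computes the relative throughput gain from each added worker. The throughput values are copied from the 4096K lines of the benchmark quoted in this post; the function name `marginal_gain` is made up for illustration.

```python
# Throughput in iter/sec by number of single-core workers (4096K FFT, from the benchmark).
throughput = {1: 50.41, 2: 92.80, 3: 124.16, 4: 132.06}

def marginal_gain(workers: int) -> float:
    """Percent throughput gained by going from workers-1 to workers."""
    prev, cur = throughput[workers - 1], throughput[workers]
    return (cur - prev) / prev * 100

for n in (2, 3, 4):
    print(f"{n - 1} -> {n} workers: +{marginal_gain(n):.1f}%")
```

The fourth worker adds only a few percent, which is why dropping to 3 workers costs little.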

[QUOTE=Birddylicious;409084]
Prime95 64-bit version 28.5, RdtscTiming=1
Best time for 4096K FFT length: 19.419 ms., avg: 20.307 ms.
<snip>
Timing FFTs using 2 threads on 1 physical CPU.
Best time for 4096K FFT length: 20.231 ms., avg: 21.748 ms.
<snip>
Timing FFTs using 2 threads on 2 physical CPUs.
Best time for 4096K FFT length: 10.325 ms., avg: 11.399 ms.
<snip>
Timing FFTs using 3 threads on 3 physical CPUs.
Best time for 4096K FFT length: 7.861 ms., avg: 7.996 ms.
<snip>
Timing FFTs using 4 threads on 4 physical CPUs.
Best time for 4096K FFT length: 7.399 ms., avg: 7.526 ms.
<snip>
Timing FFTs using 8 threads on 4 physical CPUs.
Best time for 4096K FFT length: 7.724 ms., avg: 8.725 ms.
<snip>
Timings for 4096K FFT length (1 cpu, 1 worker): 19.84 ms. Throughput: 50.41 iter/sec.
Timings for 4096K FFT length (2 cpus, 2 workers): 21.47, 21.63 ms. Throughput: 92.80 iter/sec.
Timings for 4096K FFT length (3 cpus, 3 workers): 24.84, 24.18, 23.50 ms. Throughput: 124.16 iter/sec.
[COLOR=Red][B]Timings for 4096K FFT length (4 cpus, 4 workers): 30.70, 30.25, 29.78, 30.44 ms. Throughput: [SIZE=4]132.06[/SIZE] iter/sec.
[/B][/COLOR]Timings for 4096K FFT length (1 cpu hyperthreaded, 1 worker): 20.77 ms. Throughput: 48.15 iter/sec.
Timings for 4096K FFT length (2 cpus hyperthreaded, 2 workers): 22.04, 21.97 ms. Throughput: 90.88 iter/sec.
<snip>
Timings for 4096K FFT length (3 cpus hyperthreaded, 3 workers): 36.50, 27.63, 38.09 ms. Throughput: 89.84 iter/sec.
Timings for 4096K FFT length (4 cpus hyperthreaded, 4 workers): 41.66, 42.85, 39.97, 31.30 ms. Throughput: 104.30 iter/sec.
<snip>
[/QUOTE]
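The "Throughput" figures in the benchmark appear to be the sum of each worker's iterations per second. A minimal sketch to check this against the quoted per-worker timings follows; the small differences in the last digit are presumably because the printed timings are already rounded.

```python
from typing import Sequence

def throughput(ms_per_iter: Sequence[float]) -> float:
    """Combined iterations per second for workers with the given per-iteration times."""
    return sum(1000 / ms for ms in ms_per_iter)

four_workers = [30.70, 30.25, 29.78, 30.44]     # 4 cpus, 4 workers
four_workers_ht = [41.66, 42.85, 39.97, 31.30]  # 4 cpus hyperthreaded, 4 workers

print(f"{throughput(four_workers):.2f} iter/sec (benchmark reports 132.06)")
print(f"{throughput(four_workers_ht):.2f} iter/sec (benchmark reports 104.30)")
```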

LaurV 2015-08-29 08:23

[QUOTE=Birddylicious;409086]
Set to run 27/7, with 29000mb of ram
8 Worker Windows
Recommendations?
[/QUOTE]
You don't need so much RAM for LL tests; the RAM is only needed if you do ECM or P-1 second-stage work. LL tests work well in a few hundred megs of RAM (starting with 8-16 megs for DC tests).
You meant to say 24/7, but we will be delighted if you work for the project 27 hours each day anyhow.
Recommendation? Make it 4 workers. As said above, you lose about 12-13% of the time because of those 8 workers.
After you set it to 4 workers, you may need to stop P95 (choose "exit" from the file menu; clicking the red X won't stop it) and manually edit the worktodo.txt file to avoid the "too many sections.." warning: move the exponents from workers 5-8 to workers 1-4 and delete the empty [worker x] sections. Then restart P95.
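If you would rather not move the lines by hand, the same edit can be sketched in a few lines of Python. This is only an illustration under assumptions: it assumes every worker section starts with a bracketed header line such as `[Worker #5]` (check the exact spelling in your own worktodo.txt), the function name `merge_workers` is made up, and the `placeholder-N` lines stand in for real assignment lines. Back up the file and run it only while P95 is not running.

```python
import re

# Assumption: each worker section starts with a bracketed header line such as
# "[Worker #5]"; verify the spelling in your own worktodo.txt first.
HEADER = re.compile(r"^\[worker[^\]]*\]\s*$", re.IGNORECASE)

def merge_workers(text: str, keep: int) -> str:
    """Fold the assignments of all worker sections into the first `keep` sections."""
    preamble, sections = [], []
    for line in text.splitlines():
        if HEADER.match(line):
            sections.append([])          # start a new worker section
        elif not line.strip():
            continue                     # drop blank lines
        elif sections:
            sections[-1].append(line)    # assignment line of the current worker
        else:
            preamble.append(line)        # anything before the first header
    merged = [[] for _ in range(keep)]
    for i, lines in enumerate(sections):
        merged[i % keep].extend(lines)   # worker 5 -> 1, 6 -> 2, and so on
    out = list(preamble)
    for n, lines in enumerate(merged, start=1):
        out.append(f"[Worker #{n}]")
        out.extend(lines)
    return "\n".join(out) + "\n"

# Eight sections with one placeholder line each (not real assignment syntax).
sample = "".join(f"[Worker #{n}]\nplaceholder-{n}\n" for n in range(1, 9))
print(merge_workers(sample, keep=4))
```

Each of the four remaining sections ends up with its own work first, followed by the work from the worker that was removed.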

