mersenneforum.org


halahup 2016-04-17 23:45

i7 6700K 50% load with 4 workers
 
My i7-6700K shows only 50% load when running 4 workers. If I change the setting to 8 workers it shows 100% load, but each physical core is then split into 2 logical ones. Is there a setting or some other way to run 4 workers at 100% load on just the 4 physical cores? Please advise, thanks in advance!

ewmayer 2016-04-18 00:25

"50%" of what? If the OS calibrates things such that 100% represents all 8 logical cores fully busy, then 4-threaded will indeed appear as 50%. Above 50% the percentage will likely have very little to do with the total throughput in that case, though -- 100% means there are as many threads wanting to use a core as there are logical cores.
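
As a rough illustration of that calibration, here is a minimal sketch that assumes the OS simply reports busy threads divided by logical cores. The function name is hypothetical and this is not how any particular task manager is implemented.

```python
def reported_load_percent(busy_threads: int, logical_cores: int) -> float:
    """Load figure under the ASSUMPTION that 100% means every logical core is busy."""
    if logical_cores <= 0:
        raise ValueError("logical_cores must be positive")
    # No more threads can run at once than there are logical cores.
    running = min(max(busy_threads, 0), logical_cores)
    return 100.0 * running / logical_cores


# An i7-6700K exposes 8 logical cores (4 physical cores, 2 threads each).
print(reported_load_percent(4, 8))  # 4 single-threaded workers
print(reported_load_percent(8, 8))  # 8 single-threaded workers
```

Under that assumption the figure only counts occupied logical cores, so it says nothing about how much work each physical core actually gets done.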

If you go to 8 workers, do you see 100% (or close to it)? And how do the runtimes compare to the 4-worker case? If going to 8 workers causes the per-iteration time to roughly double, that means "50%" does indeed mean "all 4 physical cores maxed out".

Assuming the above is the case, you should probably also try some 1-worker/4-threads timings - if the resulting (timing*4) is greater than the average per-worker timing with 4 single-threaded workers, that means the latter run mode maximizes your total throughput.
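
The comparison above can be sketched as a throughput calculation: total iterations per second is the number of workers divided by the per-iteration time. The function names and the timings below are placeholders for illustration only, not measurements from this thread.

```python
def total_throughput(ms_per_iteration: float, workers: int) -> float:
    """Iterations per second summed over all workers."""
    if ms_per_iteration <= 0 or workers <= 0:
        raise ValueError("timing and worker count must be positive")
    return workers * 1000.0 / ms_per_iteration


def better_mode(one_worker_4_threads_ms: float, four_workers_avg_ms: float) -> str:
    """Pick the run mode with the higher total throughput.

    Equivalent to the rule of thumb: if (timing * 4) for the 1-worker/4-thread
    run exceeds the average per-worker timing of 4 single-threaded workers,
    the 4-worker mode wins.
    """
    multi = total_throughput(one_worker_4_threads_ms, 1)
    single = total_throughput(four_workers_avg_ms, 4)
    return "4 workers x 1 thread" if single > multi else "1 worker x 4 threads"


# Placeholder timings in ms per iteration; substitute your own benchmark numbers.
print(better_mode(one_worker_4_threads_ms=5.0, four_workers_avg_ms=18.0))
```

The same function also covers the 8-worker check: if 8 workers each take about twice as long per iteration as 4 workers did, the total throughput is unchanged.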

Mark Rose 2016-04-18 01:05

I'm not familiar with the Windows version, but what you probably want is processor affinity. I bet Windows is randomly running the four threads across the eight hyperthreads, resulting in 50% usage across the board.
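
To make the affinity idea concrete, here is a minimal sketch that picks one logical CPU per physical core. It assumes hyperthread siblings are numbered consecutively (0 and 1 share a core, 2 and 3 the next, and so on), which is not guaranteed on every OS, so check your own topology first. Both function names are hypothetical, and `os.sched_setaffinity` only exists on some platforms (Linux, for example, but not Windows).

```python
import os
from typing import Iterable, List


def one_logical_cpu_per_core(logical_cpus: int, threads_per_core: int = 2) -> List[int]:
    """Return the first logical CPU of each physical core.

    ASSUMPTION: siblings are numbered consecutively (0,1 / 2,3 / ...).
    """
    if logical_cpus <= 0 or threads_per_core <= 0:
        raise ValueError("counts must be positive")
    return list(range(0, logical_cpus, threads_per_core))


def pin_current_process(cpus: Iterable[int]) -> bool:
    """Try to restrict this process to the given CPUs; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform without this call, e.g. Windows
    allowed = set(cpus) & os.sched_getaffinity(0)
    if not allowed:
        return False
    os.sched_setaffinity(0, allowed)
    return True


# For an 8-thread CPU this selects logical CPUs 0, 2, 4 and 6.
print(one_logical_cpu_per_core(8))
# To apply it: pin_current_process(one_logical_cpu_per_core(8))
```

Pinning like this changes where the threads run, not how much floating point hardware exists, so it would not be expected to lift throughput beyond what 4 busy physical cores already deliver.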

bgbeuning 2016-04-18 01:30

Short answer: prime95 runs best with 4 workers on this i7 CPU.

Long answer: The i7-6700K has 4 cores but 8 threads. prime95 uses the floating point unit (FPU) a lot, and the i7 has 1 FPU per core. If you configure prime95 to use 8 workers on this CPU, you will see the time per iteration double because 2 workers are sharing one FPU.

halahup 2016-04-18 03:01

[QUOTE=ewmayer;431824]"50%" of what? If the OS calibrates things such that 100% represents all 8 logical cores fully busy, then 4-threaded will indeed appear as 50%. Above 50% the percentage will likely have very little to do with the total throughput in that case, though -- 100% means there are as many threads wanting to use a core as there are logical cores.

If you go to 8 workers, do you see 100% (or close to it)? And how do the runtimes compare to the 4-worker case? If going to 8 workers causes the per-iteration time to roughly double, that means "50%" does indeed mean "all 4 physical cores maxed out".

Assuming the above is the case, you should probably also try some 1-worker/4-threads timings - if the resulting (timing*4) is greater than the average per-worker timing with 4 single-threaded workers, that means the latter run mode maximizes your total throughput.[/QUOTE]

Yeah, if I use 8 workers then Task Manager shows 100% load, but the timing increases drastically (the ETA is almost the same as on my FX8370 processor). If I use 4 workers, the timing is faster but the load is 50%. Also, the status field says prime95 is checking 16 exponents with 8 workers, versus 8 exponents with 4 workers.

halahup 2016-04-18 03:03

So would you suggest using just 4 workers even if it only loads 50% of CPU?

halahup 2016-04-18 03:09

I attached a screenshot of Task Manager.

axn 2016-04-18 03:21

[QUOTE=halahup;431831]So would you suggest using just 4 workers even if it only loads 50% of CPU?[/QUOTE]

Yes. The OS _thinks_ it is only using 50% of CPU, but it is in fact using 100% of the floating point execution units (which is what is relevant for LL testing). So you're getting the maximum out of your CPU with 4 workers.

halahup 2016-04-18 03:23

[QUOTE=axn;431834]Yes. The OS _thinks_ it is only using 50% of CPU, but it is in fact using 100% of the floating point execution units (which is what is relevant for LL testing). So you're getting the maximum out of your CPU with 4 workers.[/QUOTE]
Would you suggest using 2 threads per worker and 4 workers? Because then the system shows 100% work load.

axn 2016-04-18 03:24

[QUOTE=halahup;431835]Would you suggest using 2 threads per worker and 4 workers? Because then the system shows 100% work load.[/QUOTE]

It would show 100% load. It might even make it a _tiny_ bit faster (or not). But it will increase heat/power consumption, and will make the computer slightly less responsive. It isn't worth it.

halahup 2016-04-18 03:40

[QUOTE=axn;431836]It would show 100% load. It might even make it a _tiny_ bit faster (or not). But it will increase heat/power consumption, and will make the computer slightly less responsive. It isn't worth it.[/QUOTE]
OK, yeah, it runs even faster with 4 workers and only 1 thread per worker (50% load) than with 2 threads per worker (100% load). I guess the architecture is not as trivial as I expected. Thank you everyone!


All times are UTC.
