mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Cunningham Tables (https://www.mersenneforum.org/forumdisplay.php?f=51)
-   -   Planning & Coordination for 2,2330L (https://www.mersenneforum.org/showthread.php?t=24292)

pinhodecarlos 2019-06-29 09:26

[QUOTE=swellman;520139]I too cried Havoc!, and let slip the dogs of war. Or something. Despite its name, my machine DESKTOP-C5KKONV is also a laptop. Just chewing up WUs and spitting out relations!

It’s fun to contribute, even if it’s only a very small part. The cloudygo site keeps the effort fresh - thanks SethTro. I hope Vebis and buster return soon.[/QUOTE]

I’m running for the 1 CPU years badge.

SethTro 2019-07-02 04:56

[QUOTE=swellman;520139]I too cried Havoc!, and let slip the dogs of war. Or something. Despite its name, my machine DESKTOP-C5KKONV is also a laptop. Just chewing up WUs and spitting out relations!

It’s fun to contribute, even if it’s only a very small part. The cloudygo site keeps the effort fresh - thanks SethTro. I hope Vebis and buster return soon.[/QUOTE]

I'm glad you like it! Let me know how it can be improved; I'd love to add some more things.

Maybe a 'Primes' badge for turning in a WU with an X-prime number of relations?

SethTro 2019-07-02 05:25

[QUOTE=fivemack;519874]Median runtimes in various configurations, on the same hardware (fortunately I have three identical computers)

[code]
One job -t32 1090s = 2180s for two
Two jobs -t8 2132s/2
Two jobs -t16 taskset 0-15; 16-31 1742s/2
Two jobs -t16 taskset 0-7,16-23; 8-15,24-31 1915s/2
[/code]

So, on these dual-socket eight-core machines, the right answer is to run two jobs, one across both sockets and the other on the other hyperthread across both sockets; I'd expected two jobs to be better than one, but am a bit surprised that having both jobs use both sockets is significantly better.[/QUOTE]

I'm curious what would be the best way for me to test this with only one machine.
I managed to get timestamps by adding [CODE]|& awk '{ print strftime("%Y-%m-%d-%H:%M:%S || ", systime()), $0 }'
[/CODE] to the end of my commands

Would I just run each configuration for a couple of hours and see how long between WUs?
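One rough metric for comparing configurations is the mean time between completed WUs in the timestamped log. A minimal sketch, assuming bash plus GNU date, and log lines that begin with the %Y-%m-%d-%H:%M:%S prefix produced by the awk command above (`mean_wu_gap` is a made-up helper name):

```shell
# mean_wu_gap LOGFILE
# Prints the mean gap, in whole seconds, between consecutive
# timestamped lines of the form "YYYY-MM-DD-HH:MM:SS || ...".
mean_wu_gap() {
  local prev="" t sum=0 n=0
  while IFS= read -r line; do
    ts=${line:0:19}                          # "YYYY-MM-DD-HH:MM:SS"
    t=$(date -d "${ts:0:10} ${ts:11}" +%s)   # epoch via GNU date
    if [ -n "$prev" ]; then sum=$((sum + t - prev)); n=$((n + 1)); fi
    prev=$t
  done < "$1"
  [ "$n" -gt 0 ] && echo $((sum / n))
}
```

Running each configuration for a couple of hours and comparing the means should work, with the caveat that each WU finds a variable number of relations, so relations per unit time is the fairer yardstick.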

---

2nd question: Is there a way to gracefully end a task? Other programs I've used allow you to press Ctrl+X to "quit after the current WU is finished". I turn my server off occasionally, and I don't like killing 8 tasks (or ~4 amortized WUs) each time I do that.

3rd question: If a WU completed but the server upload failed, is there an easy way to upload those results? I have about 10 WUs in this state, which isn't a lot but it annoys me.

VBCurtis 2019-07-02 06:12

I know of no way to submit WUs outside the python client; but you might have a look at the python code itself to see how submissions are handled and try to manually submit one. Be warned that WUs are reissued after a few hours, so a WU that's more than, say, 8 hours stale has already been reissued and run by someone else. It may not be worth your time to learn how to submit a stale one.

I've been playing with # of threads and taskset myself, using your "relations in last 24 hr" as my metric. Seems easier than a timestamp, especially since each WU finds a variable number of relations so you'd have to divide relations by time every time anyway. I'm getting more production with 9 threads on a 6-core i7-5820k than I did with 6 threads, by about 10%; this blows a hole in my idea that exceeding 8 threads on one client has diminishing returns. Haven't tried 10-12 threads yet; I resumed the other tasks that also run on this machine instead.
Edit: note that relations get marginally more difficult to find as Q increases, so your production rate will drift lower over the course of weeks. If you do such tests, do them on consecutive days.

remdups update: Q from 8-100M 545M unique, 234M duplicate, 779M total. Still zero bad relations. Duplicate rate now 30% overall, not great but yield (relations divided by Q-range) is still better than expected. We may have to bump the relations target up a bit if the duplicate rate continues to worsen (it will).

DukeBG 2019-07-02 07:45

[QUOTE=fivemack;519874]Median runtimes in various configurations, on the same hardware (fortunately I have three identical computers)

[code]
One job -t32 1090s = 2180s for two
Two jobs -t8 2132s/2
Two jobs -t16 taskset 0-15; 16-31 1742s/2
Two jobs -t16 taskset 0-7,16-23; 8-15,24-31 1915s/2
[/code]

So, on these dual-socket eight-core machines, the right answer is to run two jobs, one across both sockets and the other on the other hyperthread across both sockets; I'd expected two jobs to be better than one, but am a bit surprised that having both jobs use both sockets is significantly better.[/QUOTE]

Are you sure you're correctly specifying which cpu cores are physical and which HT? I'm more used to HT being the odd numbered and real being even numbered.

lukerichards 2019-07-04 05:19

[code]
z600 lucky 26693 weeks 6 CPU-years 1 2457 37659341 (4.2% total) 388.5 1.122 829670 2019-07-03 21:48:07,117
lukerichards-<COMP> unlucky 4411 weeks 5 2719 37466196 (4.2% total) 252.1 1.72 1026538 2019-07-03 21:45:26,759
[/code]


Note the race for 7th place is hotting up.

fivemack 2019-07-04 21:45

[QUOTE=DukeBG;520519]Are you sure you're correctly specifying which cpu cores are physical and which HT? I'm more used to HT being the odd numbered and real being even numbered.[/QUOTE]

Yes I am; this is a Linux machine (two sockets, 10 cores per socket)

scole 2019-07-04 22:48

I'm not familiar with the taskset command and its arguments; I've only set CPU affinity on Windows systems. Assuming you want to avoid hyperthreading, you need to set the affinity to use only physical CPUs (which I also thought were the even-numbered ones). On a 2-CPU system with 10 cores/20 threads per CPU (20 physical cores, 40 logical CPUs to Linux), wouldn't it be best to run 8 threads per task, each on physical CPUs, i.e. two 8-thread tasks, one on each CPU socket? Wouldn't you use this?

taskset -c 0,2,4,6,8,10,12,14 (8 physical cores for task 1 on CPU 0)
taskset -c 20,22,24,26,28,30,32,34 (8 physical cores for task 2 on CPU 1)

(Note the list form needs -c; a bare argument is parsed as a hex affinity mask.) That way memory use is isolated to the DIMM banks wired to each CPU and doesn't have to go across the bridge?

VBCurtis 2019-07-05 03:34

Every Linux install I've used (admittedly, 90% ubuntu) has cores numbered the way fivemack explained.

Also, CADO responds well to HT use; using 20 threads for CADO (on a couple of clients) is 20-25% faster than 10 threads on a 10-core machine. My dual 10-core is using 4 5-threaded clients, with the other socket solving a large matrix.

Mumps 2019-07-06 13:55

[QUOTE=scole;520755]I'm not familiar with the taskset command and its arguments; I've only set CPU affinity on Windows systems. Assuming you want to avoid hyperthreading, you need to set the affinity to use only physical CPUs (which I also thought were the even-numbered ones). On a 2-CPU system with 10 cores/20 threads per CPU (20 physical cores, 40 logical CPUs to Linux), wouldn't it be best to run 8 threads per task, each on physical CPUs, i.e. two 8-thread tasks, one on each CPU socket? Wouldn't you use this?

taskset -c 0,2,4,6,8,10,12,14 (8 physical cores for task 1 on CPU 0)
taskset -c 20,22,24,26,28,30,32,34 (8 physical cores for task 2 on CPU 1)

That way memory use is isolated to the DIMM banks wired to each CPU and doesn't have to go across the bridge?[/QUOTE]
On Ubuntu/Mint, you can use /sys/devices/system/cpu to verify your system topology. Each thread has a folder in there, and within that a folder named topology.


[code]
/sys/devices/system/cpu/cpu0/topology $ grep "^" *
core_id:0
core_siblings:0003ff,f0003fff
core_siblings_list:0-13,28-41
physical_package_id:0
thread_siblings:000000,10000001
thread_siblings_list:0,28
[/code]

So my dual E5-2690 v4, with 56 threads, reports cpu0 as a thread sibling of cpu28, whereas:

[code]
/sys/devices/system/cpu/cpu27/topology $ grep "^" *
core_id:14
core_siblings:fffc00,0fffc000
core_siblings_list:14-27,42-55
physical_package_id:1
thread_siblings:8000000,08000000
thread_siblings_list:27,55
[/code]
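For a one-shot view of the same sysfs data across all logical CPUs (a sketch, assuming a standard Linux sysfs layout), you can loop over the topology folders:

```shell
# List each logical CPU with its hyperthread sibling set, from sysfs.
for t in /sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list; do
  cpu=${t#/sys/devices/system/cpu/}   # strip the path prefix...
  cpu=${cpu%%/*}                      # ...leaving just "cpuN"
  printf '%s: siblings=%s\n' "$cpu" "$(cat "$t")"
done
```

util-linux's `lscpu --extended` prints a similar CPU/core/socket table in one command.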

VBCurtis 2019-07-06 22:43

[QUOTE=VBCurtis;520514]remdups update: Q from 8-100M 545M unique, 234M duplicate, 779M total. Still zero bad relations.[/QUOTE]

Q from 8-120M: 629M unique. I had a few (maybe ten) workunits in the 104M range that gzip puked on with "unexpected end of file". Those are excluded from the count, pending further investigation / repair of the files. If there's a simple repair command, please suggest it.
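gzip has no true repair mode, but you can at least identify truncated files and salvage the relations that still decompress cleanly. A sketch (`salvage_gz` is a made-up helper name; "unexpected end of file" usually means only the tail of the stream is lost):

```shell
# salvage_gz FILE.gz
# If the file fails gzip's integrity test, decompress whatever prefix
# is readable, drop the (possibly partial) last line, and write the
# result to FILE.salvaged. Intact files are left alone.
salvage_gz() {
  if ! gzip -t "$1" 2>/dev/null; then
    gzip -dc "$1" 2>/dev/null | sed '$d' > "${1%.gz}.salvaged"
  fi
}
```

The salvaged relations could then be fed back into the unique/duplicate counts rather than excluded wholesale.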

