[QUOTE=swellman;521536]Yoyo kindly killed the rest of the CN queue. I’ll update it tomorrow night with 2,2210M as the focus.[/QUOTE]
It seems that the CN and HC queues have not progressed at all in days. |
The server controls the inflow of numbers. Sometimes it can be like watching water boil.
|
I don't see many results on taskset with respect to CADO-NFS ([URL="https://www.mersenneforum.org/showthread.php?p=519874#post519874"]other than fivemack's[/URL]) so I'm going to try and duplicate fivemack's testing with a couple of extra dimensions.
My plan is to test about 6 WU with each of the configs below. (I have 2x 2650v2, which is 16 cores / 32 threads, with 64 GB of RAM, which supports 5 WU at a time but not quite 6.) Right now I'm confused about how fivemack tested, because the taskset gets reset after each WU with cado-nfs-client.py. Looking at
[CODE]cat /proc/cpuinfo | grep -v 'flags\|bugs\|bogomips\|wp\|fpu\|family\|model\|MHz\|apicid\|cpuid\|cache_align\|address\|clflush\|vendor_id\|stepping\|microcode'
# and lstopo[/CODE]
[CODE]CPU 0-7 are hyperthreaded with CPU 16-23 and CPU 8-15 are hyperthreaded with CPU 24-31[/CODE]
[CODE]1x 4 threads, no CPU affinity
1x 4 threads, CPU 0-4
---
1x 8 threads, CPU 0-7
---
1x 16 threads, CPU 0-15
1x 16 threads, CPU 0-7,16-23
---
4x 4 threads, no CPU affinity
4x 4 threads, CPU 0-3,4-7,16-19,20-23 (16 threads on 8 cores)
4x 4 threads, CPU 0-3,4-7,8-11,12-15 (16 threads on 16 cores)
---
4x 8 threads, CPU 0-7,8-15,16-23,24-31
---
2x 16 threads, CPU 0-15,16-31 (both on all cores)
2x 16 threads, CPU 0-7+16-23,8-15+24-31 (one job on one CPU, one job on the other CPU)[/CODE]
Results will be updated in place over the next couple of days. |
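For what it's worth, the hyperthread sibling mapping above can also be read programmatically from sysfs instead of eyeballing /proc/cpuinfo. A minimal sketch, assuming Linux with a mounted /sys (the helper name is mine, not part of any tool):

```python
# Sketch: recover the hyperthread sibling mapping (e.g. "0,16") from
# Linux sysfs rather than grepping /proc/cpuinfo by hand.
from pathlib import Path

def thread_siblings():
    """Map each CPU number to its thread_siblings_list string."""
    out = {}
    base = Path("/sys/devices/system/cpu")
    for topo in base.glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(topo.parent.parent.name[3:])  # "cpu17" -> 17
        out[cpu] = topo.read_text().strip()
    return out

print(thread_siblings())
```

On the box described above this should print pairs like {0: '0,16', 8: '8,24', ...}, matching the lstopo output.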
Tasksets are one of the things a process inherits from the process that started it, so if you do
[CODE]taskset -c 0-3 python cado-nfs-client.py {parameters}[/CODE]
all the las jobs that it starts will use that taskset. |
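This inheritance can be checked without CADO at all. A minimal Python sketch (Linux-only; os.sched_setaffinity stands in for the taskset prefix):

```python
# Demonstration that a CPU affinity mask is inherited by child processes,
# which is why a taskset prefix on cado-nfs-client.py constrains the
# processes it launches (unless they reset the mask themselves).
import os
import subprocess
import sys

os.sched_setaffinity(0, {0})  # pin this process to CPU 0, like taskset -c 0

# A child started afterwards inherits the same mask.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(sorted(os.sched_getaffinity(0)))"],
    capture_output=True, text=True,
)
print("parent:", sorted(os.sched_getaffinity(0)))
print("child: ", child.stdout.strip())
```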
[QUOTE=fivemack;521716]Tasksets are one of the things a process inherits from the process that started it, so if you do
taskset -c 0-3 python cado-nfs-client.py {parameters} All the las jobs that it starts will use that taskset[/QUOTE] Hmm, this didn't seem to work before; maybe I wasn't setting -c correctly. Thanks for the answer. [CODE]time taskset -c 0-4 python3 -c "import gmpy2; import multiprocessing as mp; p = mp.Pool(4); print (sum(p.map(gmpy2.is_prime, range(10**7))))"
664579[/CODE] |
[QUOTE=fivemack;521716]Tasksets are one of the things a process inherits from the process that started it, so if you do
taskset -c 0-3 python cado-nfs-client.py {parameters} All the las jobs that it starts will use that taskset[/QUOTE] Actually I tried again on my main server and it didn't work again. [CODE]taskset -c 0-7 python cado-nfs-client.py --bindir=build/seven --server="<SERVER>"[/CODE]
When I look at the process tree, the first child inherits the CPU affinity, but build/seven/sieve/las was reset to all-CPU affinity: [CODE]htop
3265 | 'python cado-nfs-client...'
3266 |-- /bin/sh -c 'build/seven/sieve/las' ...
3267 |-- build/seven/sieve/las -I 16 ...

pid 3265's current affinity mask: ff
pid 3266's current affinity mask: ff
pid 3267's current affinity mask: ffffffff[/CODE]
A couple of places in the code (las-parallel.cpp, bind_threads.sh, cpubinding.cpp) might all affect this. I'll have to dig into that :/ |
[QUOTE=R.D. Silverman;521685]It seems that the CN and HC queues have not progressed at all in days.[/QUOTE]Happens all the time to everyone. Sometimes the GCW queue doesn't move for a week or more and then there is a flurry of activity.
No, I don't know why. |
[QUOTE=swellman;521687]The server controls the inflow of numbers. Sometimes it can be like watching water boil.[/QUOTE]
These are numbers already running. 2,2210M has been stuck at 8960 curves for days; it is not progressing at all. |
[QUOTE=R.D. Silverman;521731]These are numbers already running. 2,2210M has been stuck at 8960 curves for days; It is not progressing at all.[/QUOTE]
My guess: the server only allows a limited number of outstanding tasks for any one number (or queue?) and won't allocate more until the pipeline drains. If so, there must be an expiry limit on any allocated task, beyond which the machine to which it has been allocated is declared MIA and its allocation is given to someone else. What we've been seeing is a blockage in the pipeline, and the server hasn't yet started using [URL="https://en.wikipedia.org/wiki/Dyno-Rod"]Dyno-Rod[/URL]. |
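If that guess is right, the bookkeeping would be simple. A toy sketch of the guessed scheme (all names and limits invented; this is not the server's actual code):

```python
# Toy model of the guessed allocation scheme: a per-number cap on
# outstanding tasks, plus an expiry after which a task is declared MIA
# and can be reissued. MAX_OUTSTANDING and EXPIRY are invented values.
MAX_OUTSTANDING = 40
EXPIRY = 5 * 24 * 3600  # seconds (5 days)

def reclaim_and_count(outstanding, now):
    """outstanding: {task_id: issue_time}. Drop expired (MIA) tasks and
    return (expired_ids, free_slots) for this number."""
    expired = [t for t, issued in outstanding.items() if now - issued > EXPIRY]
    for t in expired:
        del outstanding[t]  # allocation goes back to the pool
    return expired, MAX_OUTSTANDING - len(outstanding)

tasks = {101: 0.0, 102: 0.0, 103: 400_000.0}
expired, slots = reclaim_and_count(tasks, now=500_000.0)
print(expired, slots)  # tasks 101 and 102 are past the 5-day expiry
```

Until the expiry fires, the number appears completely stuck, which matches the observed behaviour.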
[QUOTE=xilman;521736]My guess: the server only allows for a limited number of outstanding tasks for any one number (or queue?) and won't allocate more until the pipeline drains. If this is the case, there must be a expiry limit on any allocated task beyond which the machine to which has been allocated is declared MIA and its allocation given to someone else. What we've been seeing is a blockage in the pipeline and the server hasn't yet started using [URL="https://en.wikipedia.org/wiki/Dyno-Rod"]Dyno-Rod[/URL].[/QUOTE]
Certainly plausible, but not the best allocation scheme IMO. |
The workunit processing is not fully under the server's control; there are also volunteers in the game. For CN, HC and GCW, 5 curves are put into one BOINC workunit and sent to a volunteer with a deadline of 5 days. If the volunteer doesn't return any result, it takes 5 days until the server recognises that. After 5 days the workunit is sent to someone else, who also might not return it. So some curves just need time.
Such resends also happen if a workunit comes back with an error; in that case it is likewise resent to someone else. These resends don't lead to any progress. In the meantime the server sends out other ECM workunits from its "unsent tasks" list, always oldest first. If the "unsent tasks" list drains below 1000, it generates new tasks from one of the ECM input queues, choosing the input queue of the project which got the least computing power in the last 5 days. |
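That refill rule can be sketched in a few lines. A toy model (data, names and the per-WU re-selection are my own simplifications, not the actual server code), assuming 5 curves per workunit and the least-served project drawn from first:

```python
# Toy model of the refill rule described above: when the unsent-task list
# drops below a threshold, generate new 5-curve workunits from the input
# queue of the project that received the least computing power recently.
REFILL_THRESHOLD = 1000
CURVES_PER_WU = 5

def refill(unsent, queues, credit_5d):
    """unsent: list of (project, curves) WUs; queues: {project: curves
    waiting}; credit_5d: {project: computing power over the last 5 days}."""
    while len(unsent) < REFILL_THRESHOLD:
        candidates = [p for p, c in queues.items() if c > 0]
        if not candidates:
            break  # every input queue is empty
        project = min(candidates, key=lambda p: credit_5d[p])
        take = min(CURVES_PER_WU, queues[project])
        queues[project] -= take
        unsent.append((project, take))
    return unsent

wus = refill([], {"CN": 12, "HC": 7, "GCW": 0},
             {"CN": 50.0, "HC": 10.0, "GCW": 90.0})
print(wus)  # HC (least credit) is drained first, then CN; GCW is empty
```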