#89
Sep 2002
Database er0rr
5·937 Posts
Quote:
Last fiddled with by paulunderwood on 2019-06-21 at 20:39

#90
∂²ω=0
Sep 2002
República de California
2²·2,939 Posts
Hmm ... 470% sounds great, but what do the actual run timings show? In my experience, running one job across both CPUs is worse than just using the a73 CPU.

#91
Sep 2002
Database er0rr
5·937 Posts
136.4894ms, but that was not idle -- I reverted it back to -cpu 2:5 (with the correct configuration file). The CPUs dropped 3C each during the 6 core run. I am hoping for a great run-to-run value.

#92
Jan 2008
France
3·199 Posts
I don't know where else to post this, so here we go.
I ran mlucas_v19 (posted on this thread IIRC) on a board with a Qualcomm SD845 (Cortex-A75): https://www.96boards.org/product/rb3-platform/
There's no heatsink and no fan, so obviously the CPU throttled during the run.
setaffinity was failing: it works for 0:3, but these are the little CPUs; 4:7 fails for some reason. I'll take a look at that.
Code:
./mlucas -s m -iters 100 -cpu 4:7
18.0
2048 msec/iter = 44.03 ROE[avg,max] = [0.255691964, 0.312500000] radices = 128 16 16 32 0 0 0 0 0 0
2304 msec/iter = 54.09 ROE[avg,max] = [0.247767857, 0.312500000] radices = 288 16 16 16 0 0 0 0 0 0
2560 msec/iter = 55.49 ROE[avg,max] = [0.236635045, 0.281250000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 65.30 ROE[avg,max] = [0.223967634, 0.250000000] radices = 44 32 32 32 0 0 0 0 0 0
3072 msec/iter = 67.83 ROE[avg,max] = [0.270591518, 0.312500000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 74.63 ROE[avg,max] = [0.224553571, 0.281250000] radices = 208 8 8 8 16 0 0 0 0 0
3584 msec/iter = 80.38 ROE[avg,max] = [0.273772321, 0.312500000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 83.62 ROE[avg,max] = [0.249135045, 0.312500000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 91.08 ROE[avg,max] = [0.252901786, 0.281250000] radices = 128 16 32 32 0 0 0 0 0 0
4608 msec/iter = 101.29 ROE[avg,max] = [0.248046875, 0.312500000] radices = 288 32 16 16 0 0 0 0 0 0
5120 msec/iter = 106.82 ROE[avg,max] = [0.235030692, 0.281250000] radices = 160 8 8 16 16 0 0 0 0 0
5632 msec/iter = 129.85 ROE[avg,max] = [0.223102679, 0.250000000] radices = 352 16 16 32 0 0 0 0 0 0
6144 msec/iter = 138.96 ROE[avg,max] = [0.222753906, 0.281250000] radices = 768 16 16 16 0 0 0 0 0 0
6656 msec/iter = 149.21 ROE[avg,max] = [0.271651786, 0.312500000] radices = 208 8 8 16 16 0 0 0 0 0
7168 msec/iter = 146.54 ROE[avg,max] = [0.242801339, 0.312500000] radices = 224 16 32 32 0 0 0 0 0 0
7680 msec/iter = 150.23 ROE[avg,max] = [0.243743025, 0.312500000] radices = 240 16 32 32 0 0 0 0 0 0

#93
∂²ω=0
Sep 2002
República de California
2²·2,939 Posts
Quote:
If you have any way to get some decent airflow over the CPU during your tests, even if it's not a practical one for long-term running, that might be useful, as well as monitoring the temperature data in /sys/class/thermal/thermal_zone*/temp during the run, if those files exist on your system.
Oh, what does the cfg-file for -cpu 0:3 look like?
Last fiddled with by ewmayer on 2019-07-03 at 19:12
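For the temperature monitoring suggested above, here is a minimal sketch of a helper that reports the hottest thermal zone. The function name `max_temp_c` is made up for illustration, and it assumes the `temp` files hold integer millidegrees Celsius, which is the usual sysfs convention but may not hold on every vendor kernel.

```shell
#!/bin/sh
# Print the highest thermal-zone reading found under a sysfs-style directory,
# converted to degrees Celsius. Prints "n/a" if no temp files are readable.
# Assumption: each thermal_zone*/temp file holds an integer in millidegrees C.
max_temp_c() {
    base="${1:-/sys/class/thermal}"
    cat "$base"/thermal_zone*/temp 2>/dev/null |
        awk 'NR == 1 || $1 + 0 > max { max = $1 + 0 }
             END { if (NR) printf "%.1f\n", max / 1000; else print "n/a" }'
}

# Example (not run here): log a timestamped reading every 5 seconds.
#   while sleep 5; do printf '%s %s\n' "$(date +%T)" "$(max_temp_c)"; done
```

Taking the directory as an optional argument keeps the helper easy to try against a scratch directory before pointing it at the real /sys/class/thermal.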

#94
Jan 2008
France
597₁₀ Posts
I get messages like this during the run:
Code:
sched_setaffinity: Invalid argument
I will see if I can fix this issue.
Quote:
Quote:
Code:
2048 msec/iter = 71.01 ROE[avg,max] = [0.238755580, 0.312500000] radices = 256 16 16 16 0 0 0 0 0 0
2304 msec/iter = 79.91 ROE[avg,max] = [0.247767857, 0.312500000] radices = 288 16 16 16 0 0 0 0 0 0
2560 msec/iter = 87.87 ROE[avg,max] = [0.236635045, 0.281250000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 98.75 ROE[avg,max] = [0.270312500, 0.375000000] radices = 176 16 16 32 0 0 0 0 0 0
3072 msec/iter = 106.91 ROE[avg,max] = [0.270591518, 0.312500000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 116.82 ROE[avg,max] = [0.252232143, 0.312500000] radices = 208 16 16 32 0 0 0 0 0 0
3584 msec/iter = 123.03 ROE[avg,max] = [0.273772321, 0.312500000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 133.95 ROE[avg,max] = [0.249135045, 0.312500000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 139.86 ROE[avg,max] = [0.227650670, 0.250000000] radices = 256 16 16 32 0 0 0 0 0 0
4608 msec/iter = 160.17 ROE[avg,max] = [0.250837054, 0.343750000] radices = 288 16 16 32 0 0 0 0 0 0
5120 msec/iter = 180.82 ROE[avg,max] = [0.296875000, 0.343750000] radices = 320 16 16 32 0 0 0 0 0 0
5632 msec/iter = 196.83 ROE[avg,max] = [0.223102679, 0.250000000] radices = 352 16 16 32 0 0 0 0 0 0
6144 msec/iter = 223.74 ROE[avg,max] = [0.253571429, 0.281250000] radices = 192 16 32 32 0 0 0 0 0 0
6656 msec/iter = 243.88 ROE[avg,max] = [0.232924107, 0.250000000] radices = 208 16 32 32 0 0 0 0 0 0
7168 msec/iter = 259.09 ROE[avg,max] = [0.242801339, 0.312500000] radices = 224 16 32 32 0 0 0 0 0 0
7680 msec/iter = 280.20 ROE[avg,max] = [0.243743025, 0.312500000] radices = 240 16 32 32 0 0 0 0 0 0
By the way results.txt contains these 3 lines: Code:
FATAL: iter = 14; nonzero exit carry in radix384_ditN_cy_dif1 - input wordsize may be too small.
FATAL: iter = 14; nonzero exit carry in radix384_ditN_cy_dif1 - input wordsize may be too small.
FATAL: iter = 12; nonzero exit carry in radix384_ditN_cy_dif1 - input wordsize may be too small.
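On the `sched_setaffinity: Invalid argument` message above: the cause is not established in this thread, but the syscall returns EINVAL when the requested mask contains no CPU that is both present and permitted to the process, so one thing worth ruling out is that some of cores 4:7 were offline or excluded by a cpuset at the time. The sketch below checks a requested lo:hi range against a kernel-style CPU list; the function names `expand_cpulist` and `missing_cpus` are made up for illustration.

```shell
#!/bin/sh
# Expand a kernel CPU list such as "0-3,6" into one CPU number per line.
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' |
        awk -F- 'NF == 1 { print $1 + 0 }
                 NF == 2 { for (i = $1 + 0; i <= $2 + 0; i++) print i }'
}

# Print every CPU in the range lo..hi ($1..$2) that is absent from the
# CPU list given as $3. Prints nothing when the whole range is available.
missing_cpus() {
    expand_cpulist "$3" |
        awk -v lo="$1" -v hi="$2" '
            { present[$1 + 0] = 1 }
            END { for (c = lo + 0; c <= hi + 0; c++) if (!(c in present)) print c }'
}

# Example (not run here): check -cpu 4:7 against the online CPUs.
#   missing_cpus 4 7 "$(cat /sys/devices/system/cpu/online)"
# The list the process is actually allowed to use can be read with:
#   grep Cpus_allowed_list /proc/self/status
```

If this prints nothing for 4:7 in both cases, the failure has some other cause and this check does not explain it.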

#95
"Composite as Heck"
Oct 2017
2×5²×19 Posts
Quote:
Last fiddled with by M344587487 on 2019-07-04 at 10:49 Reason: Arrays start at 0

#96
Jan 2008
France
1125₈ Posts
Quote:
Quote:
I set the governor to performance and got this: Code:
2048 msec/iter = 39.31 ROE[avg,max] = [0.255691964, 0.312500000] radices = 128 16 16 32 0 0 0 0 0 0
2304 msec/iter = 47.31 ROE[avg,max] = [0.228906250, 0.265625000] radices = 36 32 32 32 0 0 0 0 0 0
2560 msec/iter = 48.45 ROE[avg,max] = [0.236635045, 0.281250000] radices = 160 16 16 32 0 0 0 0 0 0
2816 msec/iter = 62.14 ROE[avg,max] = [0.243805804, 0.312500000] radices = 352 16 16 16 0 0 0 0 0 0
3072 msec/iter = 63.29 ROE[avg,max] = [0.217623465, 0.250000000] radices = 48 32 32 32 0 0 0 0 0 0
3328 msec/iter = 70.39 ROE[avg,max] = [0.219866071, 0.250000000] radices = 52 32 32 32 0 0 0 0 0 0
3584 msec/iter = 73.67 ROE[avg,max] = [0.213588170, 0.265625000] radices = 56 32 32 32 0 0 0 0 0 0
3840 msec/iter = 78.74 ROE[avg,max] = [0.249135045, 0.312500000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 81.66 ROE[avg,max] = [0.252901786, 0.281250000] radices = 128 16 32 32 0 0 0 0 0 0
4608 msec/iter = 92.61 ROE[avg,max] = [0.299107143, 0.375000000] radices = 144 16 32 32 0 0 0 0 0 0
5120 msec/iter = 100.73 ROE[avg,max] = [0.234685407, 0.281250000] radices = 160 16 32 32 0 0 0 0 0 0
5632 msec/iter = 118.50 ROE[avg,max] = [0.246205357, 0.312500000] radices = 176 8 8 16 16 0 0 0 0 0
6144 msec/iter = 127.28 ROE[avg,max] = [0.253571429, 0.281250000] radices = 192 16 32 32 0 0 0 0 0 0
6656 msec/iter = 139.66 ROE[avg,max] = [0.271651786, 0.312500000] radices = 208 8 8 16 16 0 0 0 0 0
7168 msec/iter = 144.24 ROE[avg,max] = [0.242801339, 0.312500000] radices = 224 16 32 32 0 0 0 0 0 0
7680 msec/iter = 154.26 ROE[avg,max] = [0.243743025, 0.312500000] radices = 240 16 32 32 0 0 0 0 0 0
Code:
FFT    perfor  before  ratio
2048    39.31   44.03   1.12
2304    47.31   54.09   1.14
2560    48.45   55.49   1.15
2816    62.14   65.3    1.05
3072    63.29   67.83   1.07
3328    70.39   74.63   1.06
3584    73.67   80.38   1.09
3840    78.74   83.62   1.06
4096    81.66   91.08   1.12
4608    92.61  101.29   1.09
5120   100.73  106.82   1.06
5632   118.5   129.85   1.10
6144   127.28  138.96   1.09
6656   139.66  149.21   1.07
7168   144.24  146.54   1.02
7680   154.26  150.23   0.97
I checked frequency a few times, and it always was 2.8 GHz on the fast chips and 1.8 on the slowest. Given the ratio above it's possible that part of the last two sizes were on the slower CPU.
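For anyone wanting to reproduce a comparison like the one above from two saved cfg files, here is a rough sketch. The function name `cfg_ratio` is made up, and it assumes the timing lines have the same layout as the ones posted in this thread, with the FFT length in field 1 and the msec/iter value in field 4.

```shell
#!/bin/sh
# Join two Mlucas cfg files on FFT length and print, per matching length:
#   FFT  new-ms/iter  old-ms/iter  old/new
# Assumption: timing lines look like
#   "<FFT> msec/iter = <time> ROE[avg,max] = ..." so the time is field 4.
# Lines without "msec/iter" in field 2 (e.g. the version line) are skipped.
# Note: the FNR == NR idiom assumes the first file is not empty.
cfg_ratio() {
    new_cfg="$1"; old_cfg="$2"
    awk '$2 == "msec/iter" {
             if (FNR == NR) new[$1] = $4          # first file: remember timings
             else if ($1 in new)                   # second file: emit joined row
                 printf "%s %.2f %.2f %.2f\n", $1, new[$1], $4, $4 / new[$1]
         }' "$new_cfg" "$old_cfg"
}

# Example (hypothetical file names):
#   cfg_ratio mlucas.cfg.performance mlucas.cfg.before
```

A ratio above 1 means the first file's run was faster at that FFT length.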

#97
Jan 2008
France
3·199 Posts
Another run:
Code:
2048 msec/iter = 35.88 ROE[avg,max] = [0.250446429, 0.281250000] radices = 1024 32 32 0 0 0 0 0 0 0
2304 msec/iter = 42.26 ROE[avg,max] = [0.228906250, 0.265625000] radices = 36 32 32 32 0 0 0 0 0 0
2560 msec/iter = 45.14 ROE[avg,max] = [0.241992188, 0.281250000] radices = 40 32 32 32 0 0 0 0 0 0
2816 msec/iter = 54.73 ROE[avg,max] = [0.223967634, 0.250000000] radices = 44 32 32 32 0 0 0 0 0 0
3072 msec/iter = 55.49 ROE[avg,max] = [0.270591518, 0.312500000] radices = 192 16 16 32 0 0 0 0 0 0
3328 msec/iter = 66.12 ROE[avg,max] = [0.252232143, 0.312500000] radices = 208 16 16 32 0 0 0 0 0 0
3584 msec/iter = 70.29 ROE[avg,max] = [0.273772321, 0.312500000] radices = 224 16 16 32 0 0 0 0 0 0
3840 msec/iter = 74.62 ROE[avg,max] = [0.249135045, 0.312500000] radices = 240 16 16 32 0 0 0 0 0 0
4096 msec/iter = 80.18 ROE[avg,max] = [0.252901786, 0.281250000] radices = 128 16 32 32 0 0 0 0 0 0
4608 msec/iter = 92.21 ROE[avg,max] = [0.299107143, 0.375000000] radices = 144 16 32 32 0 0 0 0 0 0
5120 msec/iter = 100.66 ROE[avg,max] = [0.234685407, 0.281250000] radices = 160 16 32 32 0 0 0 0 0 0
5632 msec/iter = 115.37 ROE[avg,max] = [0.223102679, 0.250000000] radices = 352 16 16 32 0 0 0 0 0 0
6144 msec/iter = 127.71 ROE[avg,max] = [0.253571429, 0.281250000] radices = 192 16 32 32 0 0 0 0 0 0
6656 msec/iter = 140.97 ROE[avg,max] = [0.232924107, 0.250000000] radices = 208 16 32 32 0 0 0 0 0 0
7168 msec/iter = 146.96 ROE[avg,max] = [0.242801339, 0.312500000] radices = 224 16 32 32 0 0 0 0 0 0
7680 msec/iter = 156.33 ROE[avg,max] = [0.243743025, 0.312500000] radices = 240 16 32 32 0 0 0 0 0 0
I monitored frequency and temps every second and frequency didn't change. Code:
while true; do sleep 1; cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq | tr '\012' ' '; cat /sys/class/thermal/thermal_zone*/temp 2> /dev/null | sort -n | uniq | tail -1 | tr '\012' ' '; date "+%H:%M:%S"; done

#98
"Sam Laur"
Dec 2018
Turku, Finland
475₈ Posts
Quote:

#99
∂²ω=0
Sep 2002
República de California
26754₈ Posts
Laurent, thanks for the data. Based on your timings, it seems the OS is doing a decent job of load-balancing even in the affinity-fail cases, though some of the large run-to-run timing variability you observe may be due to that.
The radix-384 errors you got are expected: that is a newly added front-end radix in v19, and there is a bug in multithreaded runs of it that I've so far been unable to find. Its failing the self-tests leaves you no worse off at FFT lengths of the form 3*2^k than you would be using v18.
Oh, would you be so kind as to post a copy of your /proc/cpuinfo file? Thanks.