#23 |
|
Tribal Bullet
Oct 2004
3,541 Posts |
Don't ask for 32 threads out of Msieve. I would limit it to 2-3 threads per physical package on the machine you are using.
|
|
|
|
|
|
#24 |
|
"Nathan"
Jul 2008
Maryland, USA
1115₁₀ Posts |
The LL in question just finished. It has an error code of 08004400, which is 68 of those huge roundoff errors, and mprime catching its breath eight times long enough to say "hey, this isn't a hardware problem". Should I submit this result as a "Suspect LL" or should I send results.txt to you, so the error code can be adjusted?
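The two counts mentioned above can be recovered by splitting the eight hex digits into four bytes. This is a minimal sketch of that arithmetic only: `split_error_code` is a hypothetical helper, and what each byte means is taken from the post's own reading, not from mprime's documentation.

```python
def split_error_code(code):
    """Split an 8-hex-digit error code into four byte values, left to right."""
    if len(code) != 8:
        raise ValueError("expected 8 hex digits")
    return [int(code[i:i + 2], 16) for i in range(0, 8, 2)]


fields = split_error_code("08004400")
print(fields)  # [8, 0, 68, 0]
```

Hex 44 is 68 decimal and hex 08 is 8, which matches the "68 errors" and "eight times" described in the post.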
|
|
|
|
|
|
#25 | |
|
"Nathan"
Jul 2008
Maryland, USA
5×223 Posts |
Quote:
I definitely understand that this is usually the case. But for some reason, and it may be related to my still not fully understanding how Linux pairs physical cores and logical "helper" hyperthreads, I get weird timings, e.g. for 50M exponents:
[Image of iteration timings not preserved]
I am curious as to what might be happening to cause the differential in iteration times between one set of eight cores and the other set of eight cores, though. As I said above, I agree 100% in all of my other experience, that maximal throughput is achieved by running one number on each core. But not so much in this case...
Would it perhaps be better to run two copies of mprime, one on each CPU? Does mprime play nicely with multi-socket (as opposed to multi-core) systems?
Last fiddled with by NBtarheel_33 on 2013-04-29 at 07:28 Reason: Remove redundant quote block |
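One way to see how Linux pairs logical CPUs with physical cores and sockets is to group the entries of `/proc/cpuinfo` by their `physical id` and `core id` fields. The sketch below assumes an x86 Linux system that reports those fields; `group_logical_cpus` and the sample text are made up for illustration and are not part of mprime.

```python
from collections import defaultdict


def group_logical_cpus(cpuinfo_text):
    """Map (physical id, core id) -> sorted list of logical processor numbers."""
    groups = defaultdict(list)
    # /proc/cpuinfo separates logical processors with a blank line.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if {"processor", "physical id", "core id"} <= fields.keys():
            key = (int(fields["physical id"]), int(fields["core id"]))
            groups[key].append(int(fields["processor"]))
    return {key: sorted(cpus) for key, cpus in groups.items()}


# Hypothetical sample: one socket, two cores, two hyperthreads per core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

for (socket, core), cpus in sorted(group_logical_cpus(SAMPLE).items()):
    print(f"socket {socket}, core {core}: logical CPUs {cpus}")
```

On a real machine the input would be the contents of `/proc/cpuinfo`. Logical CPUs that share a (socket, core) pair are hyperthread siblings, so an affinity list that contains both of them puts two threads on one physical core.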
|
|
|
|
|
|
#26 | |
|
Oct 2011
7·97 Posts |
Quote:
George's program is so efficient that H/T can actually slow it down. Case in point: I have a 4 physical core laptop with H/T. These are the timings I get running ECM:
1 worker, 1 core (affinity logical CPU 1) = 774 sec avg
1 worker, 2 cores (affinity logical CPUs 1,2) = 785 sec avg
1 worker, 4 cores (affinity logical CPUs 1,2,3,4) = 540 sec avg
If I 'trick' the program by setting affinity to CPU 2, I get:
1 worker, 2 cores (affinity logical CPUs 2,3 [physical cores 1,2]) = 502 sec avg
As you can see, using both logical cores on a physical core is slower than using 1 in both instances.
To truly see how your system works, I would start by setting it to run 16 workers on 1 core each. Then start 1 core and record timings, start a second core and record timings, and continue until you have all 16 running OR you see a significant slowdown in timings. My 2500 (it's not H/T) cannot run all 4 cores as efficiently as 3 due to a bottleneck somewhere, likely in memory bandwidth. 1 worker running = ~16.2 ms/iter, 2 = ~16.4 avg, 3 = 17.2 avg, 4 = 21.3 avg.
Time to complete the equiv of 1 exponent:
~9.9 days on 1 core
~5.0 days on 2 cores
~3.4 days on 3 cores
~3.2 days on 4 cores |
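The "days per exponent" figures above follow from dividing the per-iteration time by the number of workers running in parallel. A rough sketch of that arithmetic is below. The iteration count of 52.8 million is an assumption, back-calculated from the quoted 9.9 days at 16.2 ms/iter; it is not stated in the post, and `days_per_exponent` is a hypothetical helper.

```python
MS_PER_DAY = 86_400_000

# Assumption: iterations per test, inferred from 9.9 days at 16.2 ms/iter.
ITERATIONS = 52_800_000

# Average ms/iter quoted in the post, keyed by number of workers running.
MS_PER_ITER = {1: 16.2, 2: 16.4, 3: 17.2, 4: 21.3}


def days_per_exponent(ms_per_iter, workers, iterations=ITERATIONS):
    """Wall-clock days per finished exponent when `workers` tests run in parallel."""
    return ms_per_iter * iterations / workers / MS_PER_DAY


for workers, ms in MS_PER_ITER.items():
    print(f"{workers} worker(s): {days_per_exponent(ms, workers):.2f} days per exponent")
```

With these inputs the 3-worker case comes out slightly above the ~3.4 days quoted, which is within the rounding of the averaged timings. The step from 3 to 4 workers gains very little, which is the diminishing return the post attributes to a likely memory bandwidth bottleneck.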
|
|
|
|
|
|
#27 | |
|
P90 years forever!
Aug 2002
Yeehaw, FL
2·53·71 Posts |
Quote:
|
|
|
|
|
|
|
#28 |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
Another nice P-1, but no factor...
M93111047 completed P-1, B1=1080000, B2=33210000, E=12
Takes about 4 hours for Stage 1 and 8 hours for Stage 2, running on 8 cores/16 threads and 30GB RAM. Anyone ever seen a higher E? This is the third time I've had E=12...
Last fiddled with by NBtarheel_33 on 2013-04-30 at 09:16 Reason: Add timing for Stage 1 and Stage 2 |
|
|
|
|
|
#29 | |
|
"Nathan"
Jul 2008
Maryland, USA
5·223 Posts |
Quote:
Done. The exponent is 50098369, if anyone's interested... Never mind, it's been assigned already! It's only been factored to 72, actually, so perhaps the GPUto7x folk will have a go at it. Incidentally, George, you factored it to 64 way back in 2008 (so let me say that it's an honor to have collaborated with you, sir).
In other happenings, this post is my nth, where n is the Number of the Beast... (say, wasn't there a devilish smiley at one time?)
Last fiddled with by NBtarheel_33 on 2013-04-30 at 09:39 |
|
|
|
|
|
|
#30 |
|
Bamboozled!
"πΊππ·π·π"
May 2003
Down not across
2A00₁₆ Posts |
|
|
|
|
|
|
#31 | |
|
"Mike"
Aug 2002
3×2,741 Posts |
Quote:
Code:
UID: Xyzzy/i7, M61192819 completed P-1, B1=580000, B2=11890000, E=12, We4: ΓΓΓΓΓΓΓΓ
UID: Xyzzy/i7, M61478429 completed P-1, B1=585000, B2=11992500, E=6, We4: ΓΓΓΓΓΓΓΓ
UID: Xyzzy/i7, M60505889 completed P-1, B1=570000, B2=11685000, E=12, We4: ΓΓΓΓΓΓΓΓ |
|
|
|
|
|
|
#32 |
|
Aug 2010
Kansas
547 Posts |
Xyzzy- could that be because of the size of the exponent? I noticed that only the largest one had a reduced E-value.
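To check that observation against the numbers posted in this thread, the result lines can be parsed and sorted by exponent. This is only a sketch: the regular expression is written to match the lines as they appear here rather than any documented results.txt grammar, and `parse_p1_results` is a hypothetical helper.

```python
import re

# Matches the P-1 result lines as they appear in this thread.
RESULT_RE = re.compile(r"M(\d+) completed P-1, B1=(\d+), B2=(\d+), E=(\d+)")


def parse_p1_results(text):
    """Return a list of (exponent, B1, B2, E) tuples found in `text`."""
    return [tuple(int(g) for g in m.groups()) for m in RESULT_RE.finditer(text)]


RESULTS = """\
UID: Xyzzy/i7, M61192819 completed P-1, B1=580000, B2=11890000, E=12
UID: Xyzzy/i7, M61478429 completed P-1, B1=585000, B2=11992500, E=6
UID: Xyzzy/i7, M60505889 completed P-1, B1=570000, B2=11685000, E=12
M93111047 completed P-1, B1=1080000, B2=33210000, E=12
"""

for exponent, b1, b2, e in sorted(parse_p1_results(RESULTS)):
    print(f"M{exponent}: B1={b1}, B2={b2}, E={e}")
```

Among the three Xyzzy/i7 results, the largest exponent is indeed the only one with E=6. The M93111047 line from earlier in the thread (run with 30GB RAM) still shows E=12, so exponent size alone may not be the whole story.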
|
|
|
|
|
|
#33 |
|
"Nathan"
Jul 2008
Maryland, USA
2133₈ Posts |
Yeah, perhaps that's a breakpoint in the 61M range, between 8GB being strong enough to support E=12 vs. E=6. Maybe try another P-1 in the high 61M range with 12-16GB of RAM and see what happens.
|
|
|
|