#1 · fivemack (Feb 2006, Cambridge, England)

This sounds as if it's time to redo my MPI+threads runs.
What I got with SVN 900 was the rather uninspiring table below:

Code:
1x8 grid, -t3             50757s
2x4 grid, -t2             49289s
2x4 grid, -t3             crash
4x2 grid, -t2             crash
4x2 grid, -t3             crash
8x1 grid, -t3             crash
3x8 grid, no threading    57624s

(haswell -t4, svn 886     90103s)
(haswell -t4, svn 900    ~50000s)
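One detail worth noticing in the table above: every grid shape tried (1x8, 2x4, 4x2, 8x1) factors into the same 8 MPI ranks, so the grid shape only changes the communication pattern, while the `-t` value alone sets how many cores are busy. A small sketch of that core accounting (the 24-core machine is inferred from post #11 in this thread):

```python
# Core accounting for the SVN 900 benchmark runs: one MPI process per
# grid cell, times -t threads per process.
configs = [("1x8", 3), ("2x4", 2), ("2x4", 3),
           ("4x2", 2), ("4x2", 3), ("8x1", 3)]

def cores_used(grid, threads):
    rows, cols = (int(x) for x in grid.split("x"))
    ranks = rows * cols              # MPI processes in the grid
    return ranks, ranks * threads    # (ranks, total cores busy)

for grid, t in configs:
    ranks, cores = cores_used(grid, t)
    print(f"{grid} grid, -t{t}: {ranks} MPI ranks x {t} threads = {cores} cores")
```

So the two runs that completed, 2x4 -t2 (16 cores, 49289 s) and 1x8 -t3 (24 cores, 50757 s), suggest the grid shape mattered as much as the core count here.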

#2 · frmky (Jul 2003, So Cal)

Jason fixed the crash yesterday. When doing benchmarks, I don't let the run finish. About 10 minutes of running gives a stable ETA.
BTW, what's the size of this matrix?

Last fiddled with by frmky on 2013-07-18 at 20:28

#3 · "Serge" (Mar 2008)

Tom with his SI units for time reminds me of this (patently silly) anecdote:
Quote:

#4 · frmky (Jul 2003, So Cal)

BTW, I was just trying out the Stampede cluster yesterday on the 42.1M matrix from 3,745+. Each node of this cluster has two 8-core Sandy Bridge CPUs. Using 1024 cores in an 8x16 grid with 8 threads, which puts one MPI process on each CPU and uses threads to distribute the calculation within the CPU, msieve SVN 923 can complete the linear algebra in 22.5 hours.
A 42.1M matrix in under a day!
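A quick sanity check on the decomposition described above, as a sketch: an 8x16 MPI grid with 8 threads per process, on nodes with two 8-core sockets, works out to one rank per socket across 64 nodes.

```python
# Stampede run layout: 8x16 MPI grid, 8 threads per rank,
# nodes with two 8-core Sandy Bridge CPUs (16 cores/node).
grid_rows, grid_cols = 8, 16
threads_per_rank = 8
cores_per_node = 2 * 8

ranks = grid_rows * grid_cols               # 128 MPI processes
total_cores = ranks * threads_per_rank      # 1024 cores, as stated
nodes = total_cores // cores_per_node       # 64 nodes
ranks_per_node = ranks // nodes             # 2 -> one rank per socket

print(ranks, total_cores, nodes, ranks_per_node)
```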

#5 · "Serge" (Mar 2008)

Awesome!
gnfs-216-220 is in order then? 3,776+? 11,649L?

#6 · frmky (Jul 2003, So Cal)

#7 · jasonp (Oct 2004)

The Lomonosov cluster needed 63 hours to solve a 40M matrix with 900 MPI processes. Two years later the solve time is 3x better!
Last fiddled with by jasonp on 2013-07-18 at 21:59
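The "3x better" figure checks out on raw wall-clock time alone, and looks even slightly better once matrix size is accounted for. The adjustment below assumes block Lanczos time grows roughly with the square of the matrix dimension at similar density, which is an assumption of this sketch, not a claim from the thread:

```python
# Comparing the two solves mentioned in this thread.
lomonosov_s = 63 * 3600       # 40M matrix, 900 MPI processes (2011)
stampede_s  = 22.5 * 3600     # 42.1M matrix, 1024 cores (2013)

raw_ratio = lomonosov_s / stampede_s            # plain wall-clock ratio
size_adj  = raw_ratio * (42.1 / 40.0) ** 2      # assuming time ~ dimension^2

print(f"raw {raw_ratio:.2f}x, size-adjusted ~{size_adj:.2f}x")
```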

#8 · "Serge" (Mar 2008)

Quote:
I suspect that 11,323+ is a borderline SNFS, too; let's put some polynomials on the table to see... 3,766+ is a solid rung on the ladder... 3,718+ c222 seems to be a more convincing GNFS than 11,323+.

#9 · fivemack (Feb 2006, Cambridge, England)

The matrix I'm testing on is from 5+4.353, just because that's what I'd most recently sieved at the time the Haswell turned up; it's
Code:
Thu Jul 4 08:11:15 2013  matrix is 5382199 x 5382378 (1644.8 MB) with weight 515425593 (95.76/col)

#10 · fivemack (Feb 2006, Cambridge, England)

This is 'how far did it get in twenty minutes' on a 1197160x1197390 matrix for a C127; each iteration advances on average 63.22 dimensions.

Code:
                 1 / 2 / 3 / 4 threads
svn 886:      2584 / 3401 /  4720 /  5954
svn 923:      4700 / 8433 / 11535 / 14138

scaling to 1 thread
svn 886:      1.00 / 1.32 / 1.83 / 2.30
svn 923:      1.00 / 1.79 / 2.45 / 3.01

scaling 923:886 at each thread count:
              1.82 / 2.48 / 2.44 / 2.37

I have arranged an account on a friend's i7/3930 and will see what the times are like on that machine's faster memory subsystem tonight.

Last fiddled with by fivemack on 2013-07-19 at 15:04
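The scaling rows above follow directly from the raw twenty-minute counts; a short sketch reproducing them:

```python
# Raw 'iterations in twenty minutes' counts at 1..4 threads, from the post.
svn886 = [2584, 3401, 4720, 5954]
svn923 = [4700, 8433, 11535, 14138]

# Speedup relative to the 1-thread run of the same revision.
scale886 = [round(x / svn886[0], 2) for x in svn886]
scale923 = [round(x / svn923[0], 2) for x in svn923]

# Revision-to-revision speedup at each thread count.
ratio = [round(b / a, 2) for a, b in zip(svn886, svn923)]

print(scale886, scale923, ratio)
```

The interesting point the numbers make on their own: SVN 923 is not just faster per thread, it also scales better, so the gap widens as threads are added.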

#11 · fivemack (Feb 2006, Cambridge, England)

This is the 5382199 x 5382378 matrix mentioned earlier; figures are the number of dimensions (not iterations) done in 3000 seconds, then expected kiloseconds for the whole job. Machine is otherwise idle.

Code:
1x8 -t2   299206   51.1
1x8 -t3   368336   41.6
2x4 -t2   349544   44.6
2x4 -t3   429151   36.0
4x2 -t2   344009   44.3
4x2 -t3   436101   35.8
8x1 -t2   292563   52.0
8x1 -t3   363180   42.9

So at least 24 cores of Opteron manage to beat 4 cores of Haswell...

Trying -t[456] tonight.

Last fiddled with by fivemack on 2013-07-19 at 23:38
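The "expected kiloseconds" column can be roughly reconstructed by extrapolating the 3000-second window over all 5382378 dimensions. This naive version (a sketch, not necessarily fivemack's exact method) lands a few percent above his figures, which would be consistent with the measured window excluding some startup cost:

```python
# Naive extrapolation: dimensions done in a 3000 s window, scaled up
# to the full matrix, expressed in kiloseconds.
total_dims = 5382378   # columns of the matrix from post #9
runs = {"1x8 -t3": 368336, "2x4 -t3": 429151, "4x2 -t3": 436101}

ests = {cfg: total_dims / dims * 3.0 for cfg, dims in runs.items()}
for cfg, ks in ests.items():
    print(f"{cfg}: ~{ks:.1f} ks")
```

Either way, the ranking matches the table: 4x2 -t3 comes out fastest.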