mersenneforum.org  

Old 2020-03-10, 18:17   #12
pinhodecarlos
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
4778₁₀ Posts

Hyperthreading (HT) helps a lot on the linear algebra (LA) stage, at least for me.
Old 2020-03-12, 23:36   #13
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
1000110010011₂ Posts

Quote:
Originally Posted by VBCurtis View Post
-nc1 was run with target-density 134. After remdups and adding freerels in, msieve states 99.3M unique relations. Matrix came out 4.57M dimensions. TD=140 did not complete filtering.
Machine: Xeon 2680v3 (Haswell generation), 12 × 2.5 GHz, 48GB memory on 4 channels of DDR4 (4x4GB + 4x8GB).

VBITS=128 on an otherwise idle machine. ETA after 1% of the job:
6 threads: 14 hr 34 min
12 threads: 8 hr 26 min
18 threads: 9 hr 15 min
24 threads: 8 hr 27 min
These times look rather slow; I just installed the extra 32GB memory today, so perhaps filling all 8 slots slows memory access a bunch. Some time I'll remove the original 16GB and see if 4 sticks is faster than 8.
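
To put the ETAs above in perspective, here is a rough sketch that converts them into speedup and scaling efficiency relative to the 6-thread run. The variable and function names are illustrative only (nothing here is produced by msieve), and the inputs are simply the four ETAs listed above.

```python
# ETAs from the list above, converted to minutes (hours * 60 + minutes).
eta_minutes = {
    6: 14 * 60 + 34,
    12: 8 * 60 + 26,
    18: 9 * 60 + 15,
    24: 8 * 60 + 27,
}

baseline_threads = 6
baseline = eta_minutes[baseline_threads]


def speedup(threads):
    """Speedup of a run relative to the 6-thread ETA."""
    return baseline / eta_minutes[threads]


def efficiency(threads):
    """Speedup divided by the growth in thread count (1.0 = perfect scaling)."""
    return speedup(threads) / (threads / baseline_threads)


for threads in sorted(eta_minutes):
    print(f"{threads:2d} threads: speedup {speedup(threads):.2f}x, "
          f"efficiency {efficiency(threads):.0%}")
```

On these numbers, going from 6 to 12 threads gives roughly a 1.7x speedup, and the 18- and 24-thread runs add nothing on top of that.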
Old 2020-04-27, 12:16   #14
Xyzzy
"Mike"
Aug 2002
5×11²×13 Posts

Quote:
Originally Posted by VBCurtis View Post
While unlikely, it is possible that 20 or 24 threads yields a bit of improvement. Hyperthreads don't always help on matrix solving, but since this is a benchmark thread it might be nice to demonstrate that.

I suggest 20 as alternative because using every possible HT might be impacted by any background process, but that effect should be reduced if we leave a few HTs 'open'. I've found situations where using N-1 cores runs faster than N cores, for what I presume are similar reasons.
We ran a 24-thread test last night. It was 1.09% faster than the 12-thread job. During the run, the CPU reported roughly 1700% utilization, so there must be a lot of overhead and/or bottlenecks. We are currently running a 20-thread test that we will post later.

Note that we only count the LA phase in our calculations.

Attached Files
File Type: log 24.log (11.5 KB, 77 views)
Old 2020-04-27, 22:12   #15
Xyzzy
"Mike"
Aug 2002
17271₈ Posts

Quote:
Originally Posted by Xyzzy View Post
We are currently running a 20 thread test that we will post later.
The 20 thread run somehow ended up slower than the 12 thread run.

12 = 8h04m50s
20 = 8h32m31s
24 = 7h59m33s
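
For anyone who wants to check the percentages, a minimal sketch that parses the three times above and compares each to the 12-thread run. The helper names are made up for this example; only the three durations come from the runs above.

```python
import re


def to_seconds(text):
    """Parse a duration written like '8h04m50s' into seconds."""
    match = re.fullmatch(r"(\d+)h(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# LA times reported above, keyed by thread count.
la_times = {12: "8h04m50s", 20: "8h32m31s", 24: "7h59m33s"}
reference = to_seconds(la_times[12])


def percent_faster(threads):
    """Percent of time saved relative to the 12-thread run (negative = slower)."""
    return 100.0 * (reference - to_seconds(la_times[threads])) / reference


for threads in sorted(la_times):
    print(f"{threads} threads: {percent_faster(threads):+.2f}% vs 12 threads")
```

This reproduces the 1.09% figure for 24 threads and puts the 20-thread run about 5.7% slower than 12 threads.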

Attached Files
File Type: log 20.log (10.5 KB, 84 views)
Old 2020-08-07, 16:31   #16
Xyzzy
"Mike"
Aug 2002
7865₁₀ Posts

CPU = i7-8565U
RAM = 2×16GB DDR4-2400
CMD = ./msieve -v -nc -t 8
LA = 47884s


Attached Files
File Type: gz msieve.log.gz (2.7 KB, 57 views)
Old 2020-08-08, 17:00   #17
Xyzzy
"Mike"
Aug 2002
1EB9₁₆ Posts

CPU = i7-8565U
RAM = 2×16GB DDR4-2400
CMD = ./msieve -v -nc -t 4
LA = 51662s


Attached Files
File Type: gz msieve.log.gz (2.8 KB, 56 views)
Old 2020-08-12, 17:59   #18
Xyzzy
"Mike"
Aug 2002
5×11²×13 Posts

CPU = 3950X
RAM = 2×8GB DDR4-3666
CMD = ./msieve -v -nc -t 16
LA = 27180s
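
Since the last three results are given in raw seconds, here is a small sketch that converts them to hours/minutes/seconds and compares them to the fastest run. The labels and the `hms` helper are just for illustration; the three LA times are the ones reported in these posts.

```python
# LA times reported in the last three posts, in seconds.
la_seconds = {
    "i7-8565U, -t 4": 51662,
    "i7-8565U, -t 8": 47884,
    "3950X, -t 16": 27180,
}


def hms(total_seconds):
    """Format a whole number of seconds as e.g. '7h33m00s'."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}h{minutes:02d}m{seconds:02d}s"


fastest = min(la_seconds.values())
for label, secs in la_seconds.items():
    print(f"{label}: {hms(secs)} ({secs / fastest:.2f}x the fastest run)")
```

On the i7-8565U, the 8-thread run comes out about 7% quicker than the 4-thread run.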


Attached Files
File Type: gz msieve.log.gz (2.8 KB, 48 views)
Old 2020-08-14, 07:53   #19
kurtb
"Beschorner Kurt"
Jul 2016
Germany
23₈ Posts

In my experience, the throughput depends on whether another program (e.g. gmp-ecm) is running at the same time.

machine: i7-7820X - 8 cores + HT
matrix: 49M * 49M
memory: 64 GB

msieve .... -t16 solo: ~55% (CPU load according to Task Manager)
msieve .... -t16 plus gmp-ecm (priority: low): ~78% (same measure)

With msieve + mprime/Prime95 the effectiveness is a little lower.
Kurt
Old 2020-08-14, 18:06   #20
frmky
Jul 2003
So Cal
2²·3³·19 Posts

Quote:
Originally Posted by bsquared View Post
Machine: 2 sockets of 20-core Cascade-Lake Xeon
Just used the default density.
matrix is 5149968 x 5150142 (1913.9 MB) with weight 597210677 (115.96/col)

Here is a basic 40 threaded job across both sockets (actually, I guess it is thread-limited to 32 threads):
4 hrs 58 min: /msieve -v -nc2 -t 40

Using MPI helps a lot. Here are various configurations using different VBITS settings (timings after 1% elapsed):
I know this is late, but if you still have this data set up, try
mpirun -np 2 msieve -nc2 1,2 -v -t 20
Old 2020-08-15, 07:23   #21
frmky
Jul 2003
So Cal
2²·3³·19 Posts

Here's a benchmark using compute nodes, each with one Xeon E5-2650 v4 (Broadwell) CPU with 12 cores, 24 threads.

1 node 7h 40m
2 nodes 2h 45m
4 nodes 1h 35m
8 nodes 1h 10m

Not sure why the time for one node is so high compared to the others. Perhaps the smaller matrices on each node fit into the cache?
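
To quantify how unusual the one-node time is, a quick sketch of speedup and per-node efficiency from the four timings above. The names are illustrative and the calculation assumes the four runs used the same matrix.

```python
# Wall-clock times from the benchmark above, in minutes, keyed by node count.
minutes_by_nodes = {
    1: 7 * 60 + 40,
    2: 2 * 60 + 45,
    4: 1 * 60 + 35,
    8: 1 * 60 + 10,
}

single_node = minutes_by_nodes[1]


def speedup(nodes):
    """Speedup relative to the single-node run."""
    return single_node / minutes_by_nodes[nodes]


def efficiency(nodes):
    """Speedup per node; values above 1.0 mean superlinear scaling."""
    return speedup(nodes) / nodes


for nodes in sorted(minutes_by_nodes):
    print(f"{nodes} node(s): speedup {speedup(nodes):.2f}x, "
          f"efficiency {efficiency(nodes):.2f}")
```

Two and four nodes come out superlinear (about 2.8x and 4.8x), while eight nodes drops back below linear (about 6.6x). That pattern is consistent with a cache effect, though it does not prove one.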
Old 2020-08-15, 15:58   #22
VBCurtis
"Curtis"
Feb 2005
Riverside, CA
1000110010011₂ Posts

Quote:
Originally Posted by frmky View Post
Here's a bench using compute nodes with one Xeon E5-2650 v4 Broadwell cpu with 12-cores, 24 threads.

1 node 7h 40m
2 nodes 2h 45m
4 nodes 1h 35m
8 nodes 1h 10m

Not sure why the time for one node is so high compared to the others? Perhaps something fitting into the cache with the smaller matrices on each node?
It never occurred to me that MPI on 2 nodes would be more than twice as fast, under any test. Neat! Now, if only Ubuntu would fix MPI....
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.