mersenneforum.org  

Old 2017-10-09, 03:00   #1
jacky
 
Sep 2017

9₁₀ Posts
Weird results of Msieve

I tried factoring a C100 on two machines, each with a GPU:

USE_CUDA = True

Command: "..\factMsieve.py example.n"

1. CPU: i7 3.60GHz, RAM: 8G, GPU: GTX 745, elapsed time: 7 hours

2. CPU: E5-2670 2.60GHz, RAM: 64G, GPU: Tesla P100-PCIE-16GB, elapsed time: 16 hours


The result confused me. It seems that the GPU is not being used. Why is the Tesla so much slower?

Also, if I test only the first step, the command "./msieve -np1 -nps "stage1norm=xeyy stage2norm=uevv x,y" -t 4 -s gpu0polyfile" gives the same result as "./msieve -np1 -nps "stage1norm=xeyy stage2norm=uevv x,y" -g 0 -t 8 -s gpu0polyfile". It seems that neither the GPU nor multithreading has any effect. How should I choose the number of threads (-t number)?

Last fiddled with by jacky on 2017-10-09 at 03:01
Old 2017-10-09, 03:04   #2
wombatman
I moo ablest echo power!
 
May 2013

29×61 Posts
Leave off "-nps". Only the "-np1" step is GPU-enabled. The "-nps" and "-npr" steps are CPU-only and single-threaded.
Old 2017-10-09, 03:21   #3
jacky
 
Sep 2017

3² Posts
Thanks.
For the first result, why is the Tesla slower when using the Python script to factor? Is it for the same reason?
Old 2017-10-09, 04:07   #4
wombatman
I moo ablest echo power!
 
May 2013

29·61 Posts
You won't be able to tell if the Tesla is actually slower until you run the "-np1" step by itself. If a given CPU is slower, then running the CPU-bound -nps and -npr steps would make the overall process slower.
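To make that point concrete, here is a minimal sketch of how the total run time behaves when only one stage benefits from the GPU. The function name and all of the hour figures are hypothetical, chosen purely for illustration; they are not measurements from either machine in this thread.

```python
def total_time(np1_hours, cpu_hours, gpu_speedup=1.0, cpu_speedup=1.0):
    """Estimate wall time when only the -np1 stage is accelerated by the GPU.

    np1_hours: time spent in the GPU-enabled -np1 stage on the baseline machine
    cpu_hours: time spent in all CPU-bound work on the baseline machine
    """
    return np1_hours / gpu_speedup + cpu_hours / cpu_speedup


# Hypothetical split: 0.1 h in -np1, 6.9 h in CPU-bound work.
baseline = total_time(0.1, 6.9)
# A GPU that is 4x faster barely changes the total.
fast_gpu = total_time(0.1, 6.9, gpu_speedup=4.0)
# The same fast GPU paired with a CPU half as fast ends up much slower overall.
slow_cpu = total_time(0.1, 6.9, gpu_speedup=4.0, cpu_speedup=0.5)

print(f"baseline: {baseline:.3f} h")
print(f"fast GPU: {fast_gpu:.3f} h")
print(f"fast GPU, slow CPU: {slow_cpu:.3f} h")
```

Under these made-up numbers the GPU stage is such a small share of the job that the CPU speed decides the outcome, which is why the -np1 step has to be timed on its own.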
Old 2017-10-09, 04:18   #5
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

4,861 Posts

I don't think a GPU is any faster than CPU for a 100-digit input. That job is so small that invoking the GPU at all is a waste of effort; you may not be able to tell which is faster when the GPU works for something like 2 minutes.

GPU-enhanced poly select doesn't become notably more powerful than CPU until 140 digits or so.
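As a rough illustration of that rule of thumb, a wrapper script could check the size of the input before bothering with the GPU. This is only a sketch: the function name and the constant are hypothetical, and 140 is the approximate figure quoted above, not a hard limit.

```python
# Approximate size at which GPU poly select starts to pay off (from the post above).
GPU_DIGIT_THRESHOLD = 140


def gpu_worthwhile(n, threshold=GPU_DIGIT_THRESHOLD):
    """Return True if n has at least `threshold` decimal digits."""
    return len(str(abs(n))) >= threshold


print(gpu_worthwhile(10**99))   # a 100-digit input
print(gpu_worthwhile(10**154))  # a 155-digit input
```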
Old 2017-10-09, 06:57   #6
jacky
 
Sep 2017

3² Posts
Thanks. Based on your answer, I think I have found the problem.
Now I am testing the np1 step on a 155-digit number on both machines with the same command, "msieve.exe -np1 -g 0 -t 4 -s gpu0polyfile". I want to know whether the GPU is working.
Is this the right approach? How many hours does np1 usually take?
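One way to compare the two machines is to wrap the same command in a small timing script. The sketch below is only an assumption about how such a harness might look; `time_command` is a hypothetical helper, and the demo times a trivial Python command so that it runs anywhere.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    return result.returncode, elapsed


# Demo with a harmless command; substitute the real command list on each machine.
code, seconds = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, elapsed {seconds:.2f} s")
```

For the run described above, the list would be ["msieve.exe", "-np1", "-g", "0", "-t", "4", "-s", "gpu0polyfile"], taken unchanged from the command in this post.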
Old 2017-10-09, 07:27   #7
jacky
 
Sep 2017

3² Posts

Now the Tesla P100 is 4 times faster than the GTX 745. That is still less than I expected; I think the Tesla should be much faster. Should I give it more threads?
Old 2017-10-09, 16:08   #8
chris2be8
 
Sep 2009

100000011110₂ Posts

Here is the log from my run to generate a poly for a C167, which took 7:19:19. Note I only ran -npr on the best 200 lines output by -np1 -nps. Limiting the range of leading coefficients to search would speed it up.

I've also attached the perl script I ran to generate it. It's designed to work on Linux and requires the UNIX utilities sort and tail, but it might work in a UNIX-like environment under Windows. You would need to update the paths to resources in it, particularly GPUSORT and GPUPTX.

I hope you find it useful.

Chris
Attached Files
File Type: log b94+122.log (6.2 KB, 99 views)
File Type: txt mkpolys1.pl.txt (98.5 KB, 908 views)
Old 2017-10-10, 01:41   #9
jacky
 
Sep 2017

9₁₀ Posts
Thank you very much. It's very useful to me. But I have a question: if I limit the range of leading coefficients to search, could polynomial selection fail?
Old 2017-10-10, 05:47   #10
VBCurtis
 
"Curtis"
Feb 2005
Riverside, CA

4861₁₀ Posts
Nope.
Old 2017-10-10, 16:09   #11
chris2be8
 
Sep 2009

2×1,039 Posts

That run searched for leading coefficients between 1 and 8000. Look at the LCmin and LCmax variables in the script. Or put them into the .n file as follows:
Code:
lcmin: 1
lcmax: 8000
LCstep will be ignored in a GPU run; it's only useful if you have several systems searching for a poly by CPU.

The time taken to search varies with the leading coefficient. Searching from 8000 to 16000 should take less time than 1 to 8000. So you can search a wider range if the coefficients are larger.
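If the search is shared between several systems, one simple approach is to hand each system its own contiguous slice of the coefficient range. The following is a minimal sketch of that idea, not part of the attached script: `split_range` is a hypothetical helper, and it only prints lcmin/lcmax pairs in the format shown above.

```python
def split_range(lcmin, lcmax, parts):
    """Split the inclusive range [lcmin, lcmax] into contiguous inclusive chunks."""
    if parts < 1 or lcmin > lcmax:
        raise ValueError("need parts >= 1 and lcmin <= lcmax")
    total = lcmax - lcmin + 1
    parts = min(parts, total)          # never create empty chunks
    base, extra = divmod(total, parts)
    chunks = []
    start = lcmin
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append((start, start + size - 1))
        start += size
    return chunks


# Example: two systems covering leading coefficients 1 to 16000.
for low, high in split_range(1, 16000, 2):
    print(f"lcmin: {low}")
    print(f"lcmax: {high}")
    print()
```

Equal-width slices will not take equal time, since the higher ranges search faster, so a real split would probably give the later systems wider slices.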

What range of leading coefficients to search for a given size of number is another issue. I've not had enough experience to offer advice.

Chris
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.