#1 |
|
Sep 2017
1001₂ Posts |
I tested factoring a C100 on two machines with a GPU:
USE_CUDA = True
Command: "..\factMsieve.py example.n"
1. CPU: i7 3.60GHz, RAM: 8G, GTX 745, elapsed time 7 hours
2. CPU: E5-2670 2.60GHz, RAM: 64G, Tesla P100-PCIE-16GB, elapsed time 16 hours
The result confused me. It seems the GPU card isn't being used. Why is the Tesla so much slower?
Also, if I test only the first step, the command
./msieve -np1 -nps "stage1norm=xeyy stage2norm=uevv x,y" -t 4 -s gpu0polyfile
gives the same result as
./msieve -np1 -nps "stage1norm=xeyy stage2norm=uevv x,y" -g 0 -t 8 -s gpu0polyfile
It seems that neither the GPU nor multiple threads have any effect. How should I choose the number of threads (-t number)?
Last fiddled with by jacky on 2017-10-09 at 03:01 |
|
|
|
|
|
#2 | |
|
I moo ablest echo power!
May 2013
29·61 Posts |
Quote:
|
|
|
|
|
|
|
#3 |
|
Sep 2017
3² Posts |
|
|
|
|
|
|
#4 |
|
I moo ablest echo power!
May 2013
1769₁₀ Posts |
You won't be able to tell if the Tesla is actually slower until you run the "-np1" step by itself. If a given CPU is slower, then running the CPU-bound -nps and -npr steps would make the overall process slower.
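To compare the two machines on the "-np1" step alone, a small timing wrapper is one way to do it. This is only a sketch: the helper names are made up for illustration, and the msieve flags are copied from the commands quoted in this thread rather than checked against the msieve documentation.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    return time.perf_counter() - start, completed.returncode


def np1_command(msieve="./msieve", gpu=0, threads=4, save_file="gpu0polyfile"):
    """Build the -np1 command line as quoted in this thread (assumed flags)."""
    return [msieve, "-np1", "-g", str(gpu), "-t", str(threads), "-s", save_file]


# Harmless self-check that needs no msieve binary: time an empty Python process.
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"empty Python process: {elapsed:.2f}s, exit code {code}")

# On a machine with msieve installed, one would instead run something like:
#   elapsed, code = time_command(np1_command(threads=4))
```

Running the same call on both machines, with only the -np1 stage, keeps the CPU-bound -nps and -npr stages out of the comparison.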
|
|
|
|
|
|
#5 |
|
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts |
I don't think a GPU is any faster than CPU for a 100-digit input. That job is so small that invoking the GPU at all is a waste of effort; you may not be able to tell which is faster when the GPU works for something like 2 minutes.
GPU-enhanced poly select doesn't become notably more powerful than CPU until 140 digits or so. |
|
|
|
|
|
#6 | |
|
Sep 2017
3² Posts |
Quote:
Now I am testing a 155-digit number for np1 on two machines with the same command, "msieve.exe -np1 -g 0 -t 4 -s gpu0polyfile". I want to know whether the GPU is working. Is this right? How many hours does np1 usually take? |
|
|
|
|
|
|
#7 |
|
Sep 2017
3² Posts |
Now the Tesla P100 is 4 times faster than the GTX 745. That still falls short of what I expected; I think the Tesla should be much faster. Should I give it more threads?
|
|
|
|
|
|
#8 |
|
Sep 2009
2×1,039 Posts |
Here is the log from my run to generate a poly for a C167, which took 7:19:19. Note I only ran -npr on the best 200 lines output by -np1 -nps. Limiting the range of leading coefficients to search would speed it up.
I've also attached the Perl script I ran to generate it. It's designed to work on Linux and requires the UNIX utilities sort and tail, but it might work in a Unix-like environment under Windows. You would need to update the paths to resources in it, particularly GPUSORT and GPUPTX. I hope you find it useful.
Chris |
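For readers without sort and tail, the "only run -npr on the best 200 lines" step could be sketched in Python roughly as follows. The attached script is not reproduced here, so this is not its method: the function name is hypothetical, and the assumptions that each line carries a numeric score in its last whitespace-separated field and that smaller is better should be checked against the actual -nps output before relying on it.

```python
def best_lines(lines, keep, column=-1, smaller_is_better=True):
    """Return the `keep` best lines, ranked by the numeric field in `column`.

    Assumption: every non-blank line has a number in that field.
    """
    scored = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        scored.append((float(fields[column]), line))
    # Stable sort on the score only; reverse it when larger scores are better.
    scored.sort(key=lambda pair: pair[0], reverse=not smaller_is_better)
    return [line for _, line in scored[:keep]]


# Toy input, not real msieve output.
print(best_lines(["a 3.0", "b 1.0", "c 2.0"], keep=2))
```

With real data, the lines would be read from the file written by the -np1 -nps run, and keep=200 would match the figure mentioned above.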
|
|
|
|
|
#9 | |
|
Sep 2017
3² Posts |
Quote:
|
|
|
|
|
|
|
#10 |
|
"Curtis"
Feb 2005
Riverside, CA
4,861 Posts |
|
|
|
|
|
|
#11 |
|
Sep 2009
81E₁₆ Posts |
That run searched for leading coefficients between 1 and 8000. Look at the LCmin and LCmax variables in the script. Or put them into the .n file as follows:
Code:
lcmin: 1
lcmax: 8000
The time taken to search varies with the leading coefficient. Searching from 8000 to 16000 should take less time than 1 to 8000, so you can search a wider range if the coefficients are larger. What range of leading coefficients to search for a given size of number is another issue; I've not had enough experience to offer advice.
Chris |
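If the search is to be split into several leading-coefficient ranges (for example one per machine), the lcmin/lcmax pairs could be generated with a few lines of Python. This is an illustrative sketch: the helper names are invented, and treating both bounds as inclusive with non-overlapping ranges is an assumption, not something confirmed in this thread.

```python
def lc_ranges(lc_min, lc_max, width):
    """Yield consecutive (lcmin, lcmax) pairs covering lc_min..lc_max.

    Assumption: both bounds are inclusive, so ranges do not overlap.
    """
    start = lc_min
    while start <= lc_max:
        end = min(start + width - 1, lc_max)
        yield start, end
        start = end + 1


def n_file_lines(lc_min, lc_max):
    """Format one range the way it is written in the .n file above."""
    return [f"lcmin: {lc_min}", f"lcmax: {lc_max}"]


for low, high in lc_ranges(1, 16000, 8000):
    print("\n".join(n_file_lines(low, high)))
    print()
```

Since larger leading coefficients are searched faster, equal-width ranges will not take equal time; the width could be increased for the later ranges.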
|
|
|