![]() |
![]() |
#1 |
Dec 2015
10012 Posts |
![]()
Hello all,
Long time reader of the forum, first time poster. Is it possible to use Msieve with Cuda and a cluster? I.E. only the head node will have Cuda and the workers not have Cuda? It is to my understanding that Cuda is only used in one phase, therefor it would not be needed in the worker nodes. Also, does anyone have any first hand experience with Msieve and MPI? I understand how to set it up (I think), just curious of the speedups that have been seen with large factors. How large of a cluster would one need to factor big values (1024 and above)? Thanks in advance for all replies and moreover thanks for the tools to make factoring happen! -Tim |
![]() |
![]() |
![]() |
#2 | |
"Ed Hall"
Dec 2009
Adirondack Mtns
2×5×521 Posts |
![]() Quote:
|
|
![]() |
![]() |
![]() |
#3 |
Dec 2015
32 Posts |
![]()
EdH, Thanks for the link. I did learn some things from your test that you posted. My cluster is a bit larger than yours, so I hope to see some decent results. I can run 768 threads and the avg. cpu speed is 4.3gHz. I just hope I don't spend a lot of time to see that the time to factor is years as that would break me just in a power bill.
I hope JasonP or others will chime in. -Tim |
![]() |
![]() |
![]() |
#4 |
"Curtis"
Feb 2005
Riverside, CA
562210 Posts |
![]()
If you mean 1024-bit general numbers (GNFS), you'll need roughly 10,000 of your 768-thread machines for a period of 5 (give or take an order of magnitude) years. A year of your machine might be enough to do the sieving step of a record or near-record GNFS factorization, but the matrix might be a problem.
If you mean ~1024-bit special numbers (say, 2^991+1), using SNFS, your machine will do nicely. The sieving step is trivially parallel- you can use a simple script to create n jobs for n different ranges, and then just collect the resulting data files in one location. The matrix step has substantial exchange of data among threads; this is where MPI is needed/sensitive to setup. fivemack uses a 48-core box with MPI to solve matrices as large as kilobit SNFS, and the CADO folks used a 12x12 core cluster to solve the matrix for RSA768 (a GNFS factorization). The number of cores that can be effective for the matrix-solving step is limited by how fast the interconnect is, and the size of problem you can do the matrix for is limited by the amount of memory each socket has. CUDA is used only for GNFS polynomial selection; if you run SNFS candidates, there is no poly select step and thus no need for CUDA. The CUDA step uses just one core of CPU per GPU, and is customarily run for 3-5% of the time the whole project is expected to take. There isn't any point in engaging your cluster for this step; every machine with a GPU would run its own copy of msieve to independently look for a poly, with the best overall result used for the sieving step. |
![]() |
![]() |
![]() |
#5 |
"Curtis"
Feb 2005
Riverside, CA
2×3×937 Posts |
![]()
Have a look at the work distribution and discussion of matrix for a recent kilobit SNFS solved by this forum at http://mersenneforum.org/showthread.php?t=19821
And a discussion of how big a job kilobit GNFS is at http://mersenneforum.org/showthread.php?t=10382 |
![]() |
![]() |
![]() |
#6 | |
Dec 2015
910 Posts |
![]() Quote:
On another note, I can calculate the upper 511 bits of a private key from the public key. Is this of any use? Probably not but worth asking. Trying all combinations of the lower 513 bits is 1000's of years. But then again, maybe there is a relation between the two that I am not seeing and it could be of use. -Tim |
|
![]() |
![]() |
![]() |
#7 | |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
160658 Posts |
![]() Quote:
Last fiddled with by Dubslow on 2015-12-05 at 18:11 |
|
![]() |
![]() |
![]() |
#8 | |
Dec 2015
32 Posts |
![]() Quote:
The P-1 was an idea some time back but I never explored it much. And I only have one private key that I am after at the moment. I have been thinking for a few years about partial factorization. I.E. a way to factor the upper half of the two primes using the public key (part or all of it) and the upper bits of the private key, but I have not came up with much other than brute forcing it (not a viable option). -Tim |
|
![]() |
![]() |
![]() |
#9 | |
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
3×29×83 Posts |
![]() Quote:
Now about this "upper half" business, you'll need to me more specific. You mean you want the most significant bits of the private key or its primes? Even if this was possible, why would you ever want this? |
|
![]() |
![]() |
![]() |
#10 | |||
Dec 2015
32 Posts |
![]()
To be able to digitally sign the hash of data files.
Quote:
Quote:
Quote:
-Tim |
|||
![]() |
![]() |
![]() |
#11 |
"Curtis"
Feb 2005
Riverside, CA
2×3×937 Posts |
![]()
You are correct in your belief that calculating the upper-half of a private key is utterly useless in determining the lower half. It does reduce the search space for a brute-force "guess the number" attack, but not enough to make brute force even remotely worthwhile.
The general concept of P-1 factoring: If the private-key prime P is such that the number P-1 is factorable into small numbers, then the P-1 factoring method will discover the prime P when run against the 1024-bit input. "small" is defined by the bounds used for the P-1 run: P-1 must break down into all-but-one factors less than B1, with the but-one less than B2. B1 and B2 are user-selectable bounds set when P-1 is run, but the time for the run scales (~linearly) with B1 choice while memory and time scale (sublinearly) with B2 choice. P-1 method is part of the GMP-ECM package, invoked with the -pm1 flag. Dubslow's observation that it's easy to check a created key against P-1 means that good RSA implementations will check the factorization of P-1 to make sure it's not smooth enough for the key to be cracked this way. However, if a key-creating package doesn't have that check implemented, it's possible to crack the 1024-bit input with P-1. It shouldn't work, but it could if the key was poorly created. Last fiddled with by VBCurtis on 2015-12-05 at 20:52 Reason: disambiguation P-1 number vs P-1 method |
![]() |
![]() |
![]() |
Thread Tools | |
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Cluster software | fivemack | Software | 5 | 2016-09-27 22:13 |
how to run msieve or cado-nfs on mpi-cluster? | ravlyuchenko | Msieve | 1 | 2011-08-16 12:12 |
GGNFS under SuSe cluster | VolMike | Factoring | 7 | 2008-01-23 01:23 |
Prime95 on a Cluster??? | georgekh | Software | 22 | 2004-11-09 14:39 |
Cluster @ MSRC | smh | NFSNET Discussion | 1 | 2003-08-12 08:52 |