mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Msieve

Reply
 
Thread Tools
Old 2015-12-05, 14:13   #1
efiGeek
 
Dec 2015

32 Posts
Default Cuda and a cluster

Hello all,
Long time reader of the forum, first time poster.

Is it possible to use Msieve with Cuda and a cluster? I.E. only the head node will have Cuda and the workers not have Cuda? It is to my understanding that Cuda is only used in one phase, therefor it would not be needed in the worker nodes.

Also, does anyone have any first hand experience with Msieve and MPI? I understand how to set it up (I think), just curious of the speedups that have been seen with large factors. How large of a cluster would one need to factor big values (1024 and above)?

Thanks in advance for all replies and moreover thanks for the tools to make factoring happen!
-Tim
efiGeek is offline   Reply With Quote
Old 2015-12-05, 15:21   #2
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

106278 Posts
Default

Quote:
Originally Posted by efiGeek View Post
Hello all,
Long time reader of the forum, first time poster.

Is it possible to use Msieve with Cuda and a cluster? I.E. only the head node will have Cuda and the workers not have Cuda? It is to my understanding that Cuda is only used in one phase, therefor it would not be needed in the worker nodes.

Also, does anyone have any first hand experience with Msieve and MPI? I understand how to set it up (I think), just curious of the speedups that have been seen with large factors. How large of a cluster would one need to factor big values (1024 and above)?

Thanks in advance for all replies and moreover thanks for the tools to make factoring happen!
-Tim
I have a thread here: http://www.mersenneforum.org/showthr...hlight=Openmpi which describes my novice attempt to run msieve with openMpi. I never really got anywhere, probably due to lack of bandwidth, but possibly due to my short attention span and moving to other interests.
EdH is offline   Reply With Quote
Old 2015-12-05, 15:43   #3
efiGeek
 
Dec 2015

118 Posts
Default

EdH, Thanks for the link. I did learn some things from your test that you posted. My cluster is a bit larger than yours, so I hope to see some decent results. I can run 768 threads and the avg. cpu speed is 4.3gHz. I just hope I don't spend a lot of time to see that the time to factor is years as that would break me just in a power bill.

I hope JasonP or others will chime in.

-Tim
efiGeek is offline   Reply With Quote
Old 2015-12-05, 17:01   #4
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

3·1,759 Posts
Default

If you mean 1024-bit general numbers (GNFS), you'll need roughly 10,000 of your 768-thread machines for a period of 5 (give or take an order of magnitude) years. A year of your machine might be enough to do the sieving step of a record or near-record GNFS factorization, but the matrix might be a problem.

If you mean ~1024-bit special numbers (say, 2^991+1), using SNFS, your machine will do nicely. The sieving step is trivially parallel- you can use a simple script to create n jobs for n different ranges, and then just collect the resulting data files in one location.

The matrix step has substantial exchange of data among threads; this is where MPI is needed/sensitive to setup. fivemack uses a 48-core box with MPI to solve matrices as large as kilobit SNFS, and the CADO folks used a 12x12 core cluster to solve the matrix for RSA768 (a GNFS factorization). The number of cores that can be effective for the matrix-solving step is limited by how fast the interconnect is, and the size of problem you can do the matrix for is limited by the amount of memory each socket has.

CUDA is used only for GNFS polynomial selection; if you run SNFS candidates, there is no poly select step and thus no need for CUDA. The CUDA step uses just one core of CPU per GPU, and is customarily run for 3-5% of the time the whole project is expected to take. There isn't any point in engaging your cluster for this step; every machine with a GPU would run its own copy of msieve to independently look for a poly, with the best overall result used for the sieving step.
VBCurtis is offline   Reply With Quote
Old 2015-12-05, 17:05   #5
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

3×1,759 Posts
Default

Have a look at the work distribution and discussion of matrix for a recent kilobit SNFS solved by this forum at http://mersenneforum.org/showthread.php?t=19821

And a discussion of how big a job kilobit GNFS is at http://mersenneforum.org/showthread.php?t=10382
VBCurtis is offline   Reply With Quote
Old 2015-12-05, 17:53   #6
efiGeek
 
Dec 2015

916 Posts
Default

Quote:
Originally Posted by VBCurtis View Post
Have a look at the work distribution and discussion of matrix for a recent kilobit SNFS solved by this forum at http://mersenneforum.org/showthread.php?t=19821

And a discussion of how big a job kilobit GNFS is at http://mersenneforum.org/showthread.php?t=10382
Thanks for the links and the previous post. I am after RSA1024 factoring, but it seems that even with 10,000 times the computing power that I already have a factor is still years away. Guess I will have to explore some more uncommon methods to attempt to factor it.

On another note, I can calculate the upper 511 bits of a private key from the public key. Is this of any use? Probably not but worth asking. Trying all combinations of the lower 513 bits is 1000's of years. But then again, maybe there is a relation between the two that I am not seeing and it could be of use.

-Tim
efiGeek is offline   Reply With Quote
Old 2015-12-05, 18:09   #7
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts
Default

Quote:
Originally Posted by efiGeek View Post
Thanks for the links and the previous post. I am after RSA1024 factoring, but it seems that even with 10,000 times the computing power that I already have a factor is still years away. Guess I will have to explore some more uncommon methods to attempt to factor it.

On another note, I can calculate the upper 511 bits of a private key from the public key. Is this of any use? Probably not but worth asking. Trying all combinations of the lower 513 bits is 1000's of years. But then again, maybe there is a relation between the two that I am not seeing and it could be of use.

-Tim
If anything could be done to attack otherwise secure 1024 bit RSA, it would have already been done in the ~40 years that RSA has been around. Unless the private key was generated in an insecure manner (try P-1 factoring, or taking gcds with a whole bunch of other private keys), there's no way to factor it, for doing so would be a major mathematical accomplishment worthy of mainstream press recognition. Research mathematicians have been considering integer factorization for hundreds of years -- anyone here would be highly sceptical of the odds of success if you tried some novel way to do it yourself. Hell, even doing it with GNFS would be extremely impressive and press-worthy.

Last fiddled with by Dubslow on 2015-12-05 at 18:11
Dubslow is offline   Reply With Quote
Old 2015-12-05, 18:20   #8
efiGeek
 
Dec 2015

32 Posts
Default

Quote:
Originally Posted by Dubslow View Post
If anything could be done to attack otherwise secure 1024 bit RSA, it would have already been done in the ~40 years that RSA has been around. Unless the private key was generated in an insecure manner (try P-1 factoring, or taking gcds with a whole bunch of other private keys), there's no way to factor it, for doing so would be a major mathematical accomplishment worthy of mainstream press recognition.
For two years Bleichenbacher's e=3 RSA Attack worked for my needs. Then they started checking the padding bytes so I had to resort to some other methods. Yes, any RSA digital signature that fails to comply to the extent that Bleichenbacher's attack worked was not a well implemented method but I was able to use that exploit. I am not trying to buck the system, but how do others come up with faster and "out of the box" methods? By never stop trying.

The P-1 was an idea some time back but I never explored it much. And I only have one private key that I am after at the moment. I have been thinking for a few years about partial factorization. I.E. a way to factor the upper half of the two primes using the public key (part or all of it) and the upper bits of the private key, but I have not came up with much other than brute forcing it (not a viable option).

-Tim
efiGeek is offline   Reply With Quote
Old 2015-12-05, 18:43   #9
Dubslow
Basketry That Evening!
 
Dubslow's Avatar
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C3516 Posts
Default

Quote:
Originally Posted by efiGeek View Post
For two years Bleichenbacher's e=3 RSA Attack worked for my needs. Then they started checking the padding bytes so I had to resort to some other methods. Yes, any RSA digital signature that fails to comply to the extent that Bleichenbacher's attack worked was not a well implemented method but I was able to use that exploit. I am not trying to buck the system, but how do others come up with faster and "out of the box" methods? By never stop trying.

The P-1 was an idea some time back but I never explored it much. And I only have one private key that I am after at the moment. I have been thinking for a few years about partial factorization. I.E. a way to factor the upper half of the two primes using the public key (part or all of it) and the upper bits of the private key, but I have not came up with much other than brute forcing it (not a viable option).

-Tim
I am quite intrigued. What are "your needs"? At any rate, the e=3 attack as you surely understand wasn't an attack on integer factorization. Again, if P-1 works, then the key was poorly produced (it's easy to check a prime's weakness to P-1/P+1, so any decent implementation will). As for the gcd, yes you may be targeting just one key but it's possible that the key shares one of its primes with other keys, which is especially likely with bad implementations (again). If target key's source is known to not be top quality, it may be worth acquiring other private keys from the same source to run gcds against your target.

Now about this "upper half" business, you'll need to me more specific. You mean you want the most significant bits of the private key or its primes? Even if this was possible, why would you ever want this?
Dubslow is offline   Reply With Quote
Old 2015-12-05, 18:53   #10
efiGeek
 
Dec 2015

32 Posts
Default

Quote:
Originally Posted by Dubslow View Post
I am quite intrigued. What are "your needs"?
To be able to digitally sign the hash of data files.
Quote:
Originally Posted by Dubslow View Post
Again, if P-1 works, then the key was poorly produced (it's easy to check a prime's weakness to P-1/P+1, so any decent implementation will)
I shall read up on this more, not sure I have found the correct source for quality information on it in the past.
Quote:
Originally Posted by Dubslow View Post
As for the gcd, yes you may be targeting just one key but it's possible that the key shares one of its primes with other keys, which is especially likely with bad implementations (again). If target key's source is known to not be top quality, it may be worth acquiring other private keys from the same source to run gcds against your target.
I can't even get 1 private key, no less than getting many. All I have to date is public keys.
Quote:
Originally Posted by Dubslow View Post
Now about this "upper half" business, you'll need to me more specific. You mean you want the most significant bits of the private key or its primes? Even if this was possible, why would you ever want this?
I can with a few simple calculations produce the upper half of a private RSA key that uses an e=3,7,11. I am sure that I am not the only one that has found this relation between the public and private key with a low e. In the embedded world a low e is chosen because the micro's do not have a lot of computing power so that they can decrypt and verify a digital signature in a timely fashion.

-Tim
efiGeek is offline   Reply With Quote
Old 2015-12-05, 20:48   #11
VBCurtis
 
VBCurtis's Avatar
 
"Curtis"
Feb 2005
Riverside, CA

149D16 Posts
Default

You are correct in your belief that calculating the upper-half of a private key is utterly useless in determining the lower half. It does reduce the search space for a brute-force "guess the number" attack, but not enough to make brute force even remotely worthwhile.

The general concept of P-1 factoring: If the private-key prime P is such that the number P-1 is factorable into small numbers, then the P-1 factoring method will discover the prime P when run against the 1024-bit input. "small" is defined by the bounds used for the P-1 run: P-1 must break down into all-but-one factors less than B1, with the but-one less than B2. B1 and B2 are user-selectable bounds set when P-1 is run, but the time for the run scales (~linearly) with B1 choice while memory and time scale (sublinearly) with B2 choice. P-1 method is part of the GMP-ECM package, invoked with the -pm1 flag.

Dubslow's observation that it's easy to check a created key against P-1 means that good RSA implementations will check the factorization of P-1 to make sure it's not smooth enough for the key to be cracked this way. However, if a key-creating package doesn't have that check implemented, it's possible to crack the 1024-bit input with P-1. It shouldn't work, but it could if the key was poorly created.

Last fiddled with by VBCurtis on 2015-12-05 at 20:52 Reason: disambiguation P-1 number vs P-1 method
VBCurtis is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Cluster software fivemack Software 5 2016-09-27 22:13
how to run msieve or cado-nfs on mpi-cluster? ravlyuchenko Msieve 1 2011-08-16 12:12
GGNFS under SuSe cluster VolMike Factoring 7 2008-01-23 01:23
Prime95 on a Cluster??? georgekh Software 22 2004-11-09 14:39
Cluster @ MSRC smh NFSNET Discussion 1 2003-08-12 08:52

All times are UTC. The time now is 15:21.


Wed May 18 15:21:19 UTC 2022 up 34 days, 13:22, 1 user, load averages: 2.21, 1.76, 1.75

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔