mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > Factoring

Reply
 
Thread Tools
Old 2022-01-19, 14:56   #1
EdH
 
EdH's Avatar
 
"Ed Hall"
Dec 2009
Adirondack Mtns

22×1,151 Posts
Default Lattice Sieving with GPU

Please redirect me if the following has been discussed elsewhere.

Quote:
Originally Posted by xilman View Post
. . .
Most GPUs do not have enough memory to be effective with the GNFS.
This partial post from a different thread has prompted the following:

When I run CADO-NFS on my clients, the memory usage is primarily based on the size of the candidate and the options involved, but not so much the number of threads. I'm able to gain a tiny bit of throughput by maximizing the number of clients to use most of the available memory and then adjusting the threads such that all available are in use for those clients. As an example, I can get a little more from two clients running 4 threads each than one client running 8 threads. Each client in all cases uses about the same amount of memory, dependent on the candidate and the options. For the current slate of numbers I'm factoring, the clients are using less than 1G, independent of the number of threads (although my thread count is nowhere near the available threads for a GPU).

For the memory limited machines, I run a single client with maximum threads.

Would a GPU work for lattice sieving for smaller candidates with a single client, perhaps using less than the full set of GPU cores? Or is the parallelization too much different between GPU and CPU multi-threading? (I only did a small amount of CUDA programming back in the CUDA 5 era, so I don't remember the differences.)
EdH is offline   Reply With Quote
Old 2022-01-19, 18:00   #2
henryzz
Just call me Henry
 
henryzz's Avatar
 
"David"
Sep 2007
Liverpool (GMT/BST)

135468 Posts
Default

You may find https://eprint.iacr.org/2014/397.pdf interesting
henryzz is offline   Reply With Quote
Old 2022-01-19, 18:56   #3
charybdis
 
charybdis's Avatar
 
Apr 2020

7×107 Posts
Default

As I understand it, a major issue with running lattice sieving on GPUs is that they have poor memory latency compared to CPUs. Every time the sieve hits a lattice entry, that entry must be modified. For largish jobs this will need to be done ~10^9 times per special-q. This is nothing like traditional sieving for prime-searching projects, where (a) the number of candidates is much smaller, and (b) candidates are eliminated from the sieve when a single factor is found.

Cofactorization doesn't have this problem - see the paper that David linked. The amount of time spent in cofactorization is much higher with 3 large primes than 2, so the potential benefit from GPUs will be greater on bigger jobs.
charybdis is offline   Reply With Quote
Old 2022-01-19, 21:21   #4
R. Gerbicz
 
R. Gerbicz's Avatar
 
"Robert Gerbicz"
Oct 2005
Hungary

112·13 Posts
Default

Quote:
Originally Posted by charybdis View Post
As I understand it, a major issue with running lattice sieving on GPUs is that they have poor memory latency compared to CPUs. Every time the sieve hits a lattice entry, that entry must be modified. For largish jobs this will need to be done ~10^9 times per special-q. This is nothing like traditional sieving for prime-searching projects, where (a) the number of candidates is much smaller, and (b) candidates are eliminated from the sieve when a single factor is found.
What about using a radix sort on gpu (to my surprising it has a quite large literature), notice that here:
1. You don't need a full sort, it is enough to reach that a bucket will contain all numbers hits from [x,x+2^13) or so, what will fit in the fast shared memory of gpu.
2. In one round you sort by 2-4 bits of the input in the fast (small) memory, read and write from the slow gpu global memory but in almost a coalesced way. Of course you need multiple rounds.
3. We can assume that the input is close to random.
R. Gerbicz is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Lattice Sieving Parameters paul0 Factoring 6 2015-11-20 21:12
Lattice Sieving - where do I start? paul0 Factoring 3 2015-03-09 13:54
Line sieving vs. lattice sieving JHansen NFSNET Discussion 9 2010-06-09 19:25
A question on lattice sieving joral Factoring 5 2008-04-03 08:01
Initialization for lattice sieving jasonp Factoring 16 2006-01-12 22:53

All times are UTC. The time now is 14:36.


Sat Jul 2 14:36:45 UTC 2022 up 79 days, 12:38, 1 user, load averages: 1.07, 1.17, 1.16

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.

≠ ± ∓ ÷ × · − √ ‰ ⊗ ⊕ ⊖ ⊘ ⊙ ≤ ≥ ≦ ≧ ≨ ≩ ≺ ≻ ≼ ≽ ⊏ ⊐ ⊑ ⊒ ² ³ °
∠ ∟ ° ≅ ~ ‖ ⟂ ⫛
≡ ≜ ≈ ∝ ∞ ≪ ≫ ⌊⌋ ⌈⌉ ∘ ∏ ∐ ∑ ∧ ∨ ∩ ∪ ⨀ ⊕ ⊗ 𝖕 𝖖 𝖗 ⊲ ⊳
∅ ∖ ∁ ↦ ↣ ∩ ∪ ⊆ ⊂ ⊄ ⊊ ⊇ ⊃ ⊅ ⊋ ⊖ ∈ ∉ ∋ ∌ ℕ ℤ ℚ ℝ ℂ ℵ ℶ ℷ ℸ 𝓟
¬ ∨ ∧ ⊕ → ← ⇒ ⇐ ⇔ ∀ ∃ ∄ ∴ ∵ ⊤ ⊥ ⊢ ⊨ ⫤ ⊣ … ⋯ ⋮ ⋰ ⋱
∫ ∬ ∭ ∮ ∯ ∰ ∇ ∆ δ ∂ ℱ ℒ ℓ
𝛢𝛼 𝛣𝛽 𝛤𝛾 𝛥𝛿 𝛦𝜀𝜖 𝛧𝜁 𝛨𝜂 𝛩𝜃𝜗 𝛪𝜄 𝛫𝜅 𝛬𝜆 𝛭𝜇 𝛮𝜈 𝛯𝜉 𝛰𝜊 𝛱𝜋 𝛲𝜌 𝛴𝜎𝜍 𝛵𝜏 𝛶𝜐 𝛷𝜙𝜑 𝛸𝜒 𝛹𝜓 𝛺𝜔