mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > YAFU

Old 2020-03-31, 15:42   #23
Till ("Tilman Neumann", Jan 2016, Germany)

Quote:
Originally Posted by bsquared View Post
I'm trying to weigh whether or not to try out the ideas in the paper.

The paper has timings for the main computational steps of the class group computations for imaginary quadratic fields with d-digit discriminants. They mention that the sieving step of this computation is the same as that used during SIQS integer factorization, but I don't know much more than that about class group computations. Anyway, the table reports a time of 2.23 core years spent in the sieving step for a C130 input. Their core is a 1.9GHz Opteron core.

I used a 7210 KNL processor that has 256 virtual cores (64 physical cores, each with 4 hyperthreads) that run at 1.5 GHz.

The C135 sieving took 74 wall clock hours * 256 cores = 18,944 core-hours = 2.16 core years. Although I'm not sure how to handle the hyperthreads... if you only count physical cores (performance certainly does not scale linearly past 64 cores on this processor) then it spent 0.54 core years. Call it double that to 1.08 core-years to account for the diminishing return of the hyperthreads.

Scaling the ratio of core-years by the ratio of clock frequencies and inversely by the ratio of effort I get:
2.23 / 1.08 * 1.9 / 1.5 * exp(sqrt(ln C135 ln ln C135)) / exp(sqrt(ln C130 ln ln C130))
which if I did that right means 4.88 times the effort for equivalent job sizes.
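For anyone who wants to redo that arithmetic, here is a small Python sketch (my own, not from the paper) that evaluates the expression as written, assuming n ≈ 10^d for a d-digit input. Depending on how the frequency and effort ratios are oriented, the result moves around, so treat the exact 4.88 loosely:

```python
import math

def L(digits):
    """L(n) = exp(sqrt(ln n * ln ln n)) with n ~ 10^digits (QS effort heuristic)."""
    ln_n = digits * math.log(10)
    return math.exp(math.sqrt(ln_n * math.log(ln_n)))

core_years_paper = 2.23   # paper's C130 sieving time (1.9 GHz Opteron cores)
core_years_knl   = 1.08   # KNL C135 estimate, counting hyperthreads at half weight

l_ratio = L(135) / L(130)   # extra asymptotic effort going from C130 to C135
effort = core_years_paper / core_years_knl * (1.9 / 1.5) * l_ratio
print(f"L-ratio C135/C130: {l_ratio:.2f}, scaled effort ratio: {effort:.2f}")
```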

This still isn't right because the sievers are probably memory bound, not cpu bound, and the two implementations use way different instruction sets (and AVX512 could maybe also accelerate their ideas). But it's enough to tell me that it might not be worth the effort to try to implement it.

Even so, I do need to read more and try to understand it better.

Sorry, didn't see this before...


True, the paper tells us that they use the same sieve as for factoring, and that sieving is the most expensive step in the class group computations. I also wouldn't doubt your estimated performance comparison.


Nonetheless, I think that Kleinjung's SIQS is pretty interesting because it takes such a different approach, resulting in a very different parametrization (very small sieve array, etc.). And a factor of 4.88 does not mean much. Knowing that they have a pretty good GNFS, maybe it was just a kind of toy and the implementation not that highly optimized.
Old 2020-03-31, 19:42   #24
bsquared ("Ben", Feb 2007)

Quote:
Originally Posted by Till View Post
Sorry, didn't see this before...


True, the paper tells us that they use the same sieve as for factoring, and that sieving is the most expensive step in the class group computations. I also wouldn't doubt your estimated performance comparison.


Nonetheless, I think that Kleinjung's SIQS is pretty interesting because it takes such a different approach, resulting in a very different parametrization (very small sieve array, etc.). And a factor of 4.88 does not mean much. Knowing that they have a pretty good GNFS, maybe it was just a kind of toy and the implementation not that highly optimized.
I don't think anyone approaches 130 digit problems with any kind of toy implementation. Even so, maybe optimization beyond what they have is possible. In my experience that usually means hand-tuned SIMD assembly. And that kind of work on this scale of algorithm is a months-long intensive effort, at least. That is pretty much a non-starter for me at this point.

If you or anyone else is motivated enough to implement the algorithm I would love to see it. Or maybe someone can request the code from Thorsten and then make the changes necessary for integer factorization problems?
Old 2020-04-01, 12:58   #25
Till ("Tilman Neumann", Jan 2016, Germany)

Quote:
Originally Posted by bsquared View Post
I don't think anyone approaches 130 digit problems with any kind of toy implementation.

Agreed. I meant that in the context of his usual standards, but "not fully optimized" would have been better wording.



Quote:
Originally Posted by bsquared View Post
If you or anyone else is motivated enough to implement the algorithm I would love to see it. Or maybe someone can request the code from Thorsten and then make the changes necessary for integer factorization problems?

Not me at the moment... Asking for the code wouldn't help me either, because I'd rather be interested in a Java implementation (as strange as that might seem to most of you ;-)).
Old 2020-05-07, 13:52   #26
bsquared ("Ben", Feb 2007)

I found this thesis comparing Kleinjung's algorithm with traditional SIQS:
https://prism.ucalgary.ca/bitstream/...=2&isAllowed=y

The re-explanation of the algorithm is very helpful. Lots of comparisons are presented on up to 480-bit discriminants. One thing it appears to be lacking, however, is a comparison of a well-tuned traditional SIQS with a well-tuned Kleinjung approach. Of course, traditional SIQS will be very slow compared with Kleinjung when the parameters are the same (e.g., block size 2048), and vice versa (Kleinjung does not do well with a large sieve area). I am most curious to see how they compare when each is tuned to its respective strengths.

Still, it is a very nice thesis report. I can see better now that Kleinjung's algorithm is compatible with the rest of siqs factorization (e.g., polynomial generation, filtering and matrix steps), which might ease integration of the new approach with existing implementations.
Old 2020-05-07, 16:04   #27
Till ("Tilman Neumann", Jan 2016, Germany)

Quote:
Originally Posted by bsquared View Post
I found this thesis comparing Kleinjung's algorithm with traditional SIQS:
https://prism.ucalgary.ca/bitstream/...=2&isAllowed=y

The re-explanation of the algorithm is very helpful. Lots of comparisons are presented on up to 480-bit discriminants. One thing it appears to be lacking, however, is a comparison of a well-tuned traditional SIQS with a well-tuned Kleinjung approach. Of course, traditional SIQS will be very slow compared with Kleinjung when the parameters are the same (e.g., block size 2048), and vice versa (Kleinjung does not do well with a large sieve area). I am most curious to see how they compare when each is tuned to its respective strengths.

Still, it is a very nice thesis report. I can see better now that Kleinjung's algorithm is compatible with the rest of siqs factorization (e.g., polynomial generation, filtering and matrix steps), which might ease integration of the new approach with existing implementations.

Quite a lengthy read, but since I'm currently recovering from an abscess operation, that's just the right thing for me and I will certainly enjoy it.
Thanks for posting!
Old 2020-05-20, 17:00   #28
jasonp (Tribal Bullet, Oct 2004)

Colin Percival, in a blog post around 2005, called this method the subset-sum self-initializing quadratic sieve (SSSIQS). I actually implemented most of it, at least enough to test the performance; the good news is that for moderately large QS problems one can indeed get a 2x speedup in the time taken to sieve through a given number of sieve locations. Put another way: take Msieve's default choice of sieve size times number of SIQS polynomials, then reparametrize to use much larger batches of primes and far fewer sieve locations per polynomial, so that the total number of sieve locations stays constant; completing the sieve problem then takes half the time.

Unfortunately, the actual relation discovery rate dropped by a factor of 4. Maybe that's because I messed up something, or maybe it's because of the asymptotic smoothness properties of the sieve locations, but it was really surprising that it wasn't a net win.
Old 2020-05-21, 15:42   #29
LaurV (Romulan Interpreter, Jun 2011, Thailand)

If I read it right, to put it in terms we can all understand: this would just move the crossover between SIQS and GNFS by 5 or 6 digits?
(since the effort/factorization time roughly doubles with every 5-6 digits)

Old 2020-05-21, 16:49   #30
jasonp (Tribal Bullet, Oct 2004)

Percival predicted a factor of two speedup from the method; maybe the factor is larger if one uses a much larger factor base to improve the relation yield and applies the subset-sum technique to keep the sieving time from increasing as a result. Maybe the factor would also be larger on problems for which SIQS is ill-suited in the first place. The choice of parameters has a huge effect on the total sieving time; it's easy to see a factor of two change either way.

But yes, this only buys a few digits in the crossover between SIQS and GNFS. 2005 was a different time: there were several good implementations of the quadratic sieve, but very little publicly available for nontrivial-size NFS.
Old 2020-05-21, 17:20   #31
LaurV (Romulan Interpreter, Jun 2011, Thailand)

Well, then, ok, 7 digits...
So, I still can't beat Ben's record...
Loops and screws, remember? hihi.

It seems I won't be able to check that box this lifetime.
Old 2020-05-21, 18:26   #32
bsquared ("Ben", Feb 2007)

Quote:
Originally Posted by LaurV View Post
Loops and screws, remember? hihi.

It seems I won't be able to check that box this lifetime.
You must not have seen this from 2 years ago (I was kinda wondering why I never got a comment from you at the time): https://mersenneforum.org/showthread.php?t=23362

It is hugely inefficient and clumsy but you can loop and if! Even nested loops/ifs, if you are patient enough to craft such a thing:

Code:
./yafu "for(x=3;x<100;x=x+1;for(y=2;y<x;y=y+1;if(isprime(x^y+y^x);x^y+y^x)))"

17
593
32993
2097593
59604644783353249
43143988327398957279342419750374600193
8589935681
5052785737795758503064406447721934417290878968063369478337
4318114567396436564035293097707729426477458833
523347633027360537213687137
814539297859635326656252304265822609649892589675472598580095801187688932052096060144958129
205688069665150755269371147819668813122841983204711281293004769
7259701736680389461922586102342375953169154793471358981661239413987142371528493467259545421437269088935158394128249
3329896365316142756322307042065269797678257903507506764421250291562312417
14612087592038378751152858509524512533536096028044190178822935218486730194880516808459166772134240378240755073828170296740373082348622309614668344831750401
123696767198741648186287940563721003128015158572134981161748692560225922426827257262789498753729852662122870454448694253249972402126255218031127222474177
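The same search is easy to replicate outside yafu. Here is a Python sketch (my own, truncated to x < 16 so that the 12-witness Miller-Rabin test below is provably deterministic for every candidate in range):

```python
# Replicate the yafu one-liner: primes of the form x^y + y^x, x ascending, 2 <= y < x.

def is_prime(n):
    """Miller-Rabin; deterministic for n < 3.3e24 with these witnesses."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for a in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

hits = [x**y + y**x for x in range(3, 16) for y in range(2, x)
        if is_prime(x**y + y**x)]
print(hits)   # [17, 593, 32993]
```

The first three values match the start of the yafu output above (17 = 3^2 + 2^3, 593 = 9^2 + 2^9, 32993 = 15^2 + 2^15).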
Code:
>> sum=0;
>> forprime(p=2;100;sum=sum+p*p)
4
13
38
87
208
377
666
1027
1556
2397
3358
4727
6408
8257
10466
13275
16756
20477
24966
30007
35336
41577
48466
56387
65796
>>
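The forprime example maps directly onto an ordinary loop; a Python sketch (my own) of the same running sum of squared primes:

```python
def primes_upto(n):
    """Simple sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, flag in enumerate(sieve) if flag]

total = 0
for p in primes_upto(100):
    total += p * p
    print(total)   # 4, 13, 38, ..., 65796
```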
Code:
>> input=randb(200);
>> sum_distinct=0;
>> forfactors(factor(input);sum_distinct=sum_distinct+_f;)
fac: factoring 784170029379726309723236630974985183483957194272594097782676
fac: using pretesting plan: normal
fac: using specified qs/gnfs crossover of 93 digits
fac: using specified qs/snfs crossover of 75 digits
div: primes less than 10000
fmt: 1000000 iterations
rho: x^2 + 3, starting 200 iterations on C59
Total factoring time = 0.0406 seconds


***factors found***

P1 = 2
P1 = 2
P2 = 13
P8 = 46630757
P51 = 323395840918624684084681503147649779523761535543109
>> print(sum_distinct);
323395840918624684084681503147649779523761582173881
>>


I don't think I ever even published a help document on how to use it.
Here is what I remember off the top of my head:

for(start_expression;stop_expression;statement)
Not sure if "statement" can be more than one line. But it can have an assignment in it or another "for".

forprime(start_expression; end_condition; statement)
end_condition checks the value of the variable defined in start_expression

forfactors(factor_statement; statement)
factor_statement must be of the form: factor(something)
within your "statement", the variable "_f" refers to each factor in turn and "_fpow" its corresponding power.

if(condition_statement; statement; else_statement)
if condition_statement evaluates to non-zero, execute statement; otherwise execute else_statement (you can omit else_statement if desired)

I think that post had a script with these examples in it. Not sure if anyone has used it since.
Old 2020-05-21, 18:29   #33
bsquared ("Ben", Feb 2007)

Quote:
Originally Posted by bsquared View Post

I think that post had a script with these examples in it. Not sure if anyone has used it since
Aliquot sequences entirely within yafu

Code:
./yafu "for(input=840; input>1; print(input); forfactors(sum=1, factor(input); t=1;, for(i=1; lte(i,_fpow); i=i+1; t=t+_f^i;), sum=sum*t;), input=sum-input;)" -silent
I guess, after looking at that, you can have more than one line in for-loop statements, separated by commas.
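The same aliquot step, s(n) = sigma(n) − n, in a Python sketch (my own, using trial-division factorization and mirroring the per-prime sum 1 + f + ... + f^pow built up by the yafu script):

```python
def factor(n):
    """Trial division; returns {prime: exponent}."""
    f, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            f[d] = f.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def aliquot_step(n):
    """s(n) = sigma(n) - n, with sigma as a product over prime powers."""
    sigma = 1
    for p, e in factor(n).items():
        sigma *= sum(p ** i for i in range(e + 1))   # 1 + p + ... + p^e
    return sigma - n

seq, n = [], 840
while n > 1 and len(seq) < 5:   # first few terms only; 840's sequence runs long
    seq.append(n)
    n = aliquot_step(n)
print(seq)   # [840, 2040, 4440, 9240, 25320]
```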
