mersenneforum.org  

2022-10-15, 03:34   #45
bsquared
"Ben"
Feb 2007
2²·941 Posts

Quote:
Originally Posted by wreck
1. Validity check.
For safety, I wrote a small program and verified that, for the small test relation set (R1340L, q=316M, -c 1000), every relation produced by the avx512 ggnfs version also appears in the output of the official ggnfs version.

Thank you! I have also done similar tests.
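
A containment check of the kind described in the quote could be sketched as follows. This is not wreck's program. It assumes each relation is one line of the form `a,b:...`, that comment lines start with `#`, and that the `a,b` prefix identifies the relation; the function names are made up for illustration.

```python
def relation_keys(lines):
    """Collect the (a, b) pair of every relation line.

    Assumption: a relation line looks like 'a,b:<primes>:<primes>' and
    comment lines start with '#'. Only the 'a,b' prefix is used as the key,
    so differences in how the factor lists are printed are ignored.
    """
    keys = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ab = line.split(":", 1)[0]
        a, b = ab.split(",")
        keys.add((int(a), int(b)))
    return keys


def missing_relations(candidate_lines, reference_lines):
    """Return the (a, b) pairs found in candidate but absent from reference."""
    return relation_keys(candidate_lines) - relation_keys(reference_lines)
```

Fed the lines of two output files, an empty result from `missing_relations` would mean every relation in the first file also appears in the second.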

Quote:
Originally Posted by wreck
2. Other run results.
Two other people ran the 533 binary under Linux, and both runs failed.
Linux on an AMD computer: AVX512 error.
Linux on an Intel computer: illegal command.
I am curious whether ggnfs is still faster with ecm-tiny when the avx512 instruction set is not used.
Also, if there is a way to detect that the CPU lacks avx512, it would be better to fall back to native code automatically.

Putting some usability checks in there is something I've been meaning to do, to make sure the CPU has the required instructions before proceeding. Looks like sooner rather than later would be good.
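
Until such a check exists in the siever itself, a wrapper script could do a rough version of it on Linux by reading the feature flags from /proc/cpuinfo. The sketch below is only that: the thread does not say which AVX-512 subsets the new code needs, so the `REQUIRED_FLAGS` tuple is an assumption, and all function names are hypothetical.

```python
# Assumption: which AVX-512 subsets the sievers actually need is not stated
# in the thread; this tuple is illustrative only.
REQUIRED_FLAGS = ("avx512f", "avx512bw", "avx512dq", "avx512vl")


def cpu_flags(cpuinfo_text):
    """Return the feature flags on the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=REQUIRED_FLAGS):
    """List the required flags that the CPU does not report, in order."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in required if flag not in present]


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the Linux cpuinfo file (Linux-only)."""
    with open(path) as handle:
        return handle.read()
```

A launcher could call `missing_flags(read_cpuinfo())` and run the original siever binary instead whenever the returned list is non-empty.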

Quote:
Originally Posted by wreck
3. About the local build.
When I built gnfs-lasieve4I16e, it produced some errors. I changed the Makefile under lasieve_64/asm to add the line "CC=icc" and built again; gnfs-lasieve4I16e then built successfully and also ran smoothly.
During the build, a warning said that a library has no static version and that the dynamic version would be used instead.

Yes, you can also add CC=icc on the make line. Here is how I build:

cd asm/
make liblasieve.a CC=icc AVX512_TD=1
make liblasieveI11.a CC=icc AVX512_TD=1
make liblasieveI12.a CC=icc AVX512_TD=1
make liblasieveI13.a CC=icc AVX512_TD=1
make liblasieveI14.a CC=icc AVX512_TD=1
make liblasieveI15.a CC=icc AVX512_TD=1
make liblasieveI16.a CC=icc AVX512_TD=1
cd ..
cp asm/liblasieve*.a .
make all CC=icc AVX512_ALL=1 LASTATS=1

The last option, LASTATS=1, is optional; it provides timing for lasched and more accurate timings for the other categories when you run with -v.
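
For anyone who prefers to script the sequence, the same steps can be generated in a loop. This is just a convenience sketch that mirrors the commands listed in this post; it assumes it is run from the directory that contains asm/, and the function names are made up.

```python
import subprocess

# Library suffixes exactly as they appear in the manual build steps.
ASM_SUFFIXES = ["", "I11", "I12", "I13", "I14", "I15", "I16"]


def build_steps(cc="icc"):
    """Return (working_directory, argv) pairs mirroring the manual build."""
    steps = [
        ("asm", ["make", f"liblasieve{suffix}.a", f"CC={cc}", "AVX512_TD=1"])
        for suffix in ASM_SUFFIXES
    ]
    # Copy the freshly built static libraries up one level, then build all.
    steps.append((".", ["sh", "-c", "cp asm/liblasieve*.a ."]))
    steps.append((".", ["make", "all", f"CC={cc}", "AVX512_ALL=1", "LASTATS=1"]))
    return steps


def run_build(cc="icc"):
    """Run each step in order, stopping at the first failure."""
    for cwd, argv in build_steps(cc):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_build()` executes the nine steps in order; dropping `LASTATS=1` from the last step gives the build without the extra timing.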

Quote:
Originally Posted by charybdis
This limitation was already removed in lasieve5. Have you been modifying lasieve4 or lasieve5?

4

2022-10-15, 12:16   #46
charybdis
Apr 2020
2·3³·19 Posts

Quote:
Originally Posted by bsquared
4

Would it be possible to make similar changes to lasieve5, as that's what NFS@Home 16e uses? It's already slightly faster than lasieve4.

2022-10-15, 15:04   #47
bsquared
"Ben"
Feb 2007
2²·941 Posts

Quote:
Originally Posted by charybdis
Would it be possible to make similar changes to lasieve5, as that's what NFS@Home 16e uses? It's already slightly faster than lasieve4.

I assume it's similar enough that the changes would also apply, but I've never seen it so I can't say for sure.

Is this it? Or is there somewhere else?

2022-10-16, 17:06   #48
bsquared
"Ben"
Feb 2007
111010110100₂ Posts

I discovered that the missing factors in tinyecm processing of lpbr/a > 32 jobs are all a fairly specific class of inputs... namely 2LP's that are composed of two factors >= 32 bits, such that the input large factor is greater than 64 bits but <= lpbr/a*2 in size.

Fortunately, these are easy to identify and split using either mpqs or more effort in tinyecm. Now we find almost all the factors that pure mpqs does, still at a small fraction of the effort. Very large 3LP's may still be missed here and there, but I expect this factor finding rate should largely hold.
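
The size test that picks out this class of inputs can be written down in a few lines. This is a sketch of the criterion as described in this post, not the checked-in code; `lpb` stands for the large prime bound in bits (lpbr or lpba) of the side being processed, and the function name is hypothetical.

```python
def in_missed_2lp_class(cofactor, lpb):
    """True if the cofactor is wider than 64 bits but at most 2*lpb bits.

    Such a cofactor can still be a product of two large primes that each
    fit under the large prime bound, so it is worth the extra effort.
    """
    bits = cofactor.bit_length()
    return 64 < bits <= 2 * lpb
```

For example, with lpb = 33 a 65-bit cofactor passes the test, while a 64-bit one does not.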

Code:
time ./gnfs-lasieve4I16e -v -f 316000000 -c 1000 -a R1340L_poly.txt -o R1340L_16e_a_316000000_316001000.out.12
gnfs-lasieve4I16e (with asm64,avx-512 mmx-td,avx-512 lasetup,avx-512 lasched,avx-512 sieve1,avx-512 ecm): L1_BITS=15
Warning:  lowering FB_bound to 315999999.
FBsize 26351441+0 (deg 8), 26355865+0 (deg 1)
total yield: 1242, q=316001009 (0.77841 sec/rel) ETA 0h00m)
48 Special q, 369 reduction iterations
reports: 239715573->22542070->20471524->18368663->7200755->2605199
Number of relations with k rational and l algebraic primes for (k,l)=:

Total yield: 1242
0/0 mpqs failures, 1108/20196 vain mpqs
milliseconds total: Sieve 210330 Sched 416710 medsched 840
TD 161120 (Init 4220, MPQS 30740) Sieve-Change 30, lasieve_setup 177760
TD side 0: init/small/medium/large/search: 2420 32510 900 22630 12730
sieve: init/small/medium/large/search: 3370 50940 1320 34640 11040
TD side 1: init/small/medium/large/search: 3110 22690 1120 21560 5470
sieve: init/small/medium/large/search: 3810 68060 1130 33790 2230
953.632u 15.924s 16:10.01 99.9% 0+0k 2104+312io 1pf+0w
New code has been checked in.
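
As a cross-check of the log in this post, the six top-level "milliseconds total" categories can be summed and divided by the yield. The result matches the reported 0.77841 sec/rel, which suggests (an inference, not something stated in the thread) that the figure is derived this way. The parenthesised Init and MPQS values are treated as parts of TD and are not added again.

```python
# Top-level categories from the "milliseconds total" lines of the log.
timings_ms = {
    "Sieve": 210330,
    "Sched": 416710,
    "medsched": 840,
    "TD": 161120,
    "Sieve-Change": 30,
    "lasieve_setup": 177760,
}
total_yield = 1242

total_ms = sum(timings_ms.values())          # 966790 ms
sec_per_rel = total_ms / 1000 / total_yield  # about 0.77841

# Share of the total time spent in each category, in percent.
shares = {name: 100 * ms / total_ms for name, ms in timings_ms.items()}

for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:14s} {pct:5.1f}%")
print(f"{sec_per_rel:.5f} sec/rel")
```

Sched is the largest category in this run, at roughly 43% of the total.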

2022-10-16, 18:11   #49
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE
7×223 Posts

Would you mind trying to build it as C99, so that your compiler complains about implicit declarations (maybe with -Werror)? I can then give it a try with ICX, since that will eliminate a lot of guesswork. Thanks.

2022-10-16, 18:27   #50
bsquared
"Ben"
Feb 2007
3764₁₀ Posts

Quote:
Originally Posted by kruoli
Would you mind trying to build it as C99, so that your compiler complains about implicit declarations (maybe with -Werror)? I can then give it a try with ICX, since that will eliminate a lot of guesswork. Thanks.

I am in the middle of doing that now with a newly installed icx from here. Lots of implicit function declarations here, oof.

2022-10-16, 18:45   #51
kruoli
"Oliver"
Sep 2017
Porta Westfalica, DE
3031₈ Posts

At least for the YAFU code, I should have resolved the vast majority of them in my other thread.

Great to hear you're trying this!

2022-10-16, 23:07   #52
bsquared
"Ben"
Feb 2007
2²×941 Posts

The sievers should now build with CC=icx with all of the new AVX512 code. To others that may not know, icx can be downloaded for free from Intel. I was not aware of this until a few days ago.

If you wouldn't mind doing some sanity checking by comparing small runs with the new versions against the old/original sievers I would appreciate it.

2022-10-17, 14:25   #53
bsquared
"Ben"
Feb 2007
2²·941 Posts

Quote:
Originally Posted by VBCurtis
I recognise that you're not offering to become the official ggnfs dev, but your tinyecm speed enhancements make ggnfs massively more interesting than it was a few months ago for cutting-edge work.

The cutting edge would benefit from the 16e siever working properly with -J 16 flag, which would make it effectively 16.5e. This flag works on 15e -J 15, and sometimes works as 16e -J 16 but sometimes crashes. There's a small chance those crashes can be fixed, and even a chance your new code happens to remedy the code path that caused the intermittent crashing.

If -J 16 can be used, we in principle could factor SNFS-350 with ggnfs, or GNFS-235ish.

Of course, we can just use CADO for extra-large sieve regions... but a new ggnfs revision holds out hope to be BOINCified to extend the life of the big nfs@home queue.

I tried to run wreck's poly using -J 16 and it immediately errors out:

Code:
./gnfs-lasieve4I16e -v -f 316000000 -c 1000 -a R1340L_poly.txt -J 16 -o R1340L_16e_a_316000000_316001000.out
gnfs-lasieve4I16e (with asm64,avx-512 mmx-td,avx-512 lasetup,avx-512 lasched,avx-512 sieve1,avx-512 ecm): L1_BITS=15
Warning:  lowering FB_bound to 315999999.
FBsize 26351441+0 (deg 8), 26355865+0 (deg 1)
Recurrence init: ub=32768 exceeds 16384
Maybe this is one of the things lasieve5 resolves?

2022-10-17, 14:52   #54
wreck
"Bo Chen"
Oct 2005
Wuhan, China
2·3·31 Posts

Quote:
Originally Posted by bsquared
I assume it's similar enough that the changes would also apply, but I've never seen it so I can't say for sure.

Is this it? Or is there somewhere else?

I think it is. This source needs the cweb tools, which probably means the ctangle command has to work properly.

I could build that source 5 years ago, but strangely I can't compile it now.

There is also Greg's repository on GitHub (search for lasieve5 on GitHub and you will find it), which is perhaps newer. Greg's GitHub version still does not support degree 8, but I think he has already finished that code, since NFS@Home can handle degree 8 normally.

2022-10-17, 16:05   #55
charybdis
Apr 2020
2×3³×19 Posts
Quote:
Originally Posted by bsquared View Post
I assume it's similar enough that the changes would also apply, but I've never seen it so I can't say for sure.

Is this it? Or is there somewhere else?
I can't remember where I got it from. I do recall having trouble compiling it; I have a feeling that I couldn't get the code in that post to build, even with the changes in that thread. I think I found some corrected code somewhere else on the forum. Sorry I can't be more helpful.