I'm gonna hand-wave here, since only a few people have bothered taking data:
When a relation set is right at the cusp of building a matrix, a few more hours of sieving will save more than a few hours of matrix solving on that same machine (meaning CPU in both cases). But at the relation counts most e-small and 15e jobs are actually processed at, 20 more core-hours of sieving might save only 5 or 10 core-hours of matrix work (again, both measured on a CPU). I've done a few experiments at home, and I have yet to find a job where the sieving required to build a matrix at TD=120 (target density 120) saved more CPU time than it cost; I believe it could/would on really big jobs, say those with matrices 50M+ in size. We have historically sieved more than needed because BOINC computation is cheap, while matrix-solving time was in short supply. Now that GPU matrix solving means matrices are no longer the scarce resource, we should sieve less: something like 5-10% fewer relations, which means 5-10% more jobs done per calendar month.
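To make the back-of-envelope explicit, here's a tiny sketch of the break-even question (my own rough framing, using only the illustrative numbers from the post above, not measured data):
[CODE]# Rough break-even check for "sieve a bit more vs. solve a bigger matrix".
# Numbers are the illustrative ones from the post, not measurements.
extra_sieving = 20          # extra CPU core-hours of sieving past the usual cutoff
la_saved = (5, 10)          # CPU core-hours of matrix work that sieving might save

for saved in la_saved:
    net = extra_sieving - saved
    print(f"spend {extra_sieving} core-h sieving, save {saved} core-h in LA "
          f"-> net cost {net} core-h")
[/CODE]
On CPU-only LA this is already a net loss; with GPU LA the saved matrix time is worth even less, which is the argument for sieving 5-10% fewer relations.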
[QUOTE=Xyzzy;588059]Is the bottleneck server storage space?[/QUOTE]
No. The server is currently using 467G of 3.6T.
For 2,2174L, 1355M relations yielded 734M uniques. With nearly 50% duplicates, we have clearly reached the limit for 16e. Anyway, filtering yielded
[CODE]matrix is 102063424 x 102063602 (51045.3 MB) with weight 14484270868 (141.91/col)[/CODE] Normally I'd try to bring this down, but testing on a quad V100 system with NVLink gives [CODE]linear algebra completed 2200905 of 102060161 dimensions (2.2%, ETA 129h 5m)[/CODE] So more sieving would only save a day or so in LA. I have the cluster time, so I'll let it run.
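For anyone checking the arithmetic, a quick sketch of where those figures lead (my own extrapolation from the two log lines above, assuming roughly linear LA progress):
[CODE]# Duplicate rate and a rough LA wall-clock estimate for 2,2174L,
# extrapolated from the figures quoted above (assumes roughly linear progress).
relations, uniques = 1355e6, 734e6
dup_rate = 1 - uniques / relations
print(f"duplicate rate ~ {dup_rate:.1%}")             # about 46%, i.e. "nearly 50%"

done, total = 2_200_905, 102_060_161                  # dimensions completed / total
eta_hours = 129 + 5 / 60                              # remaining time reported by msieve
total_hours = eta_hours / (1 - done / total)          # elapsed + remaining
print(f"estimated total LA wall time ~ {total_hours:.0f} h (~{total_hours / 24:.1f} days)")
[/CODE]
That works out to roughly 5.5 days of LA, consistent with "more sieving would only save a day or so".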
[QUOTE=VBCurtis;588061]
We have historically sieved more than needed because BOINC computation is cheap, while matrix-solving time was in short supply. Now that GPU matrix solving means matrices are no longer the scarce resource, we should sieve less: something like 5-10% fewer relations, which means 5-10% more jobs done per calendar month.[/QUOTE] Totally agree with you now. And more: when a number is under LA, I would recommend (I know, Greg!... lol) cancelling all of its queued WUs; this will also speed up sieving of the next number. Sievers waste a few days (in my experience) processing unnecessary work (I just manually abort those tasks so they go to someone else). Just be careful not to do this during any challenges, since it would interfere with strategic bunkering.
[QUOTE=Xyzzy;584798]If you are using RHEL 8 (8.4) you can install the proprietary Nvidia driver easily via these directions:
[URL]https://developer.nvidia.com/blog/streamlining-nvidia-driver-deployment-on-rhel-8-with-modularity-streams/[/URL] Then you will need these packages installed: [C]gcc make cuda-nvcc-10-2 cuda-cudart-dev-10-2-10.2.89-1[/C] And possibly: [C]gmp-devel zlib-devel[/C] You also have to manually adjust your path variable in [C]~/.bashrc[/C]: [C]export PATH="/usr/local/cuda-10.2/bin:$PATH"[/C] :mike:[/QUOTE]Here are simpler instructions.
[CODE]sudo subscription-manager repos --enable=rhel-8-for-x86_64-appstream-rpms
sudo subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms
sudo subscription-manager repos --enable=codeready-builder-for-rhel-8-x86_64-rpms
sudo dnf config-manager --add-repo=https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo dnf module install nvidia-driver:latest
sudo reboot
sudo dnf install cuda-11-4
echo 'export PATH=/usr/local/cuda-11.4/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64/:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc[/CODE]Then just use the attached archive to set up your work. :mike:
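If the install went through, the standard NVIDIA tools will confirm it before you try building anything (nothing here is specific to this setup):
[CODE]# quick sanity check after the reboot
nvidia-smi         # should list the GPU and the installed driver version
nvcc --version     # should report the CUDA 11.4 toolchain
which nvcc         # confirms the PATH line added to ~/.bashrc took effect
[/CODE]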
[QUOTE=frmky;588086]For 2,2174L, 1355M relations yielded 734M uniques. With nearly 50% duplicates, we have clearly reached the limit for 16e.[/QUOTE]
Or is this just the limit for 16e with 33-bit large primes? I know you've avoided going higher because of the difficulty of the LA and the msieve filtering bug, but now that the bug is fixed and GPUs make the LA much easier, might it be worth going up to 34-bit?
[QUOTE=charybdis;588104]Or is this just the limit for 16e with 33-bit large primes?[/QUOTE]
Does the lasieve5 code work correctly with 34-bit large primes? I know the check is commented out, but I haven't tested it.
I tested the binary from [URL="https://www.mersenneforum.org/showpost.php?p=470249&postcount=10"]here[/URL] on 2,2174L with 34-bit large primes and it seemed to work fine. Yield was more than double that at 33-bit so definitely looks worth it, as one would expect. There were no issues with setting mfba=99 either.
I looked through the code a few years ago and found no issues. Lasieve4 is also fine, although it is limited to 96-bit mfba/r.
I gave it a try by requesting NFS@Home WUs and received a 2,2174M assignment with lpbr and lpba set to 34.
Here are the contents of the polynomial file S2M2174b.poly. [CODE]n: 470349924831928271476705309712184283829671891500377511256458133476241008159328553358384317181001385841345904968378352588310952651779460262173005355061503024245423661736289481941107679294474063050602745740433565487767078338816787736757703231764661986524341166060777900926495463269979500293362217153953866146837
skew: 1.22341
c6: 2
c5: 0
c4: 0
c3: 2
c2: 0
c1: 0
c0: 1
Y1: 1
Y0: -3064991081731777716716694054300618367237478244367204352
type: snfs
rlim: 250000000
alim: 250000000
lpbr: 34
lpba: 34
mfbr: 99
mfba: 69
rlambda: 3.6
alambda: 2.6
[/CODE] When q is near 784M, the memory used is 743MB.
[QUOTE=wreck;588461][CODE]
lpbr: 34
lpba: 34
mfbr: 99
mfba: 69
[/CODE][/QUOTE] @frmky, for future reference, when I tested this I found that rational-side sieving with *algebraic* 3LP was fastest. This shouldn't be too much of a surprise: the rational norms are larger, but not so much larger that the six large primes across the two sides should split 4/2 rather than 3/3 (don't forget the special-q is a "free" large prime).
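To make that concrete, the setup described would amount to something like the following hypothetical parameter sketch (not a file anyone has posted in this thread), with the mfb values and lambdas swapped so the 3LP side is algebraic while sieving on the rational side:
[CODE]lpbr: 34
lpba: 34
mfbr: 69
mfba: 99
rlambda: 2.6
alambda: 3.6
[/CODE]
With the special-q on the rational side counted as the "free" large prime, this gives the 3/3 split described above, whereas the quoted settings amount to 4/2.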