[QUOTE=Prime95;234023]I must be missing something in the bucket code. If I read the code correctly, most of the global variables are 1MB at most. The only big memory chunk is allocated via malloc using the MEM_FOR_POINT_UPDATES #define which can be adjusted up or down to suit your needs.
What is MEM_FOR_POINT_UPDATES set to? Which other bucket globals take up significant space?[/QUOTE] It is this define that is causing the problem.

For a 10K x 20K lattice, it is set at 100 * 6400 * 1024. For a 9K x 18K lattice, it is set at 81 * 6400 * 1024, and for an 8K x 16K lattice, 64 * 6400 * 1024. If I crank it down, I get core dumps because the bucket code starts to overwrite locations that it shouldn't.
[QUOTE=R.D. Silverman;234027]It is this define that is causing the problem.
If I reduce it, I get core dumps.[/QUOTE] Oh, we have a bug in the bucket code. The code is supposed to handle the case where all the points don't fit in the allocated memory. If you can't find and fix the bug, I can look at it next week. I may need you to package up a sample case and the current code. My copy of your code is 5 years old.
[QUOTE=R.D. Silverman;234027]It is this define that is causing the problem.
For a 10K x 20K lattice, it is set at 100 * 6400 * 1024. If I crank it down, I get core dumps because the bucket code starts to overwrite locations that it shouldn't. For a 9K x 18K lattice, it is set at 81 * 6400 * 1024. For 8K x 16K it is set at 64 * 6400 * 1024. If I reduce it, I get core dumps.[/QUOTE] Here is a memory breakdown for a 10K x 20K sieve area (broken into two pieces) with 6 million ideals:

factor base primes and their roots (6M x 4 x 2): 48 Mbytes
storage for primes when resieving: 24M
reduced lattices for ideals (16 bytes/ideal): 96M
logs of primes: 6M
sieve array: 105M
memory for buckets: 655M
storage to record success locations after the algebraic sieve: 180M (I've seen as many as 45M potential successes for small q on a quartic; I can reduce this for larger q's, and it could also be reduced by sieving the rational side FIRST)

total: 1.1G (plus incidentals)

I can get rid of the storage of success locations entirely if I do a time/space tradeoff. After I sieve the algebraic side, I scan the sieve region, looking for potential hits. Because algebraic norms are small, there are a LOT of them. Then, after sieving the rational side, I scan the algebraic success points only. By scanning all of the points after the rational sieve instead, I could do away with this array entirely. For sextics, this array is much much smaller.
[QUOTE=Prime95;234032]Oh, we have a bug in the bucket code. The code is supposed to handle the case where all the points don't fit in the allocated memory.
If you can't find and fix the bug, I can look at it next week. I may need you to package up a sample case and the current code. My copy of your code is 5 years old.[/QUOTE] I've made a fair number of improvements since then as well. I will bundle up the current code and input data and send it [b]at your convenience.[/b] Let me know when. I won't bother with the source code for creating the factor base files. I will just send those as zipped binaries.

I also note the following. Greg did a sample run of the GGNFS siever on my data (same factor base sizes, LP bounds, sieve area, etc.). For q near 60M, he got 8.5 relations per q at an average rate of about .65 seconds per relation. This is about 5.5 seconds per q. Our siever takes over 6 seconds by itself just to do the bucket sieve on my fastest machine (3.2GHz i5). My home PC is an i7, but it is only 2.6GHz/core. Add the other overhead, and my siever is about 2.2 times slower than GGNFS for this quartic. This is one reason why I want to try to rework the bucket sieve.

I understand why it is slower. Inherently, the Pollard approach has quite a bit more overhead in setting up the sieve regions. There are some time/memory tradeoffs that will save me 10-15%, but I have to bring down the memory footprint first.
[QUOTE=frmky;233805]Yep, that's low. Using
[CODE]
n: <the number>
skew: 1.41
c4: 1
c3: -2
c2: -6
c1: 12
c0: -4
Y1: 2^93
Y0: -(2^187+1)
alim: 14500000
rlim: 86000000
lpba: 30
lpbr: 31
mfba: 60
mfbr: 62
alambda: 2.6
rlambda: 2.6
[/CODE] which I believe matches your parameters, and sieving using 14e (which has a slightly smaller sieve area of 16K x 8K), I sieved the range q = 60M to 60M+500. There were 27 special-q in the range, and it found 229 relations at a rate of 0.64 sec/relation on the slow Opteron (a Core 2 would get under 0.4 sec/relation). This is about 8.5 rels/special-q using a smaller sieve region.[/QUOTE] I found out why you are getting a slightly greater yield than I am. You are using a skew of 1.41 (4[SUP]1/4[/SUP]). I am using a skew of 1. The skew of 1.41 is slightly more efficient.
I have tried to model the initial value for skew by simple LSF (log|c[SUB]i[/SUB]| ~ i), and for this poly it suggests 1.58. But sims show that this has almost no difference vs 4[SUP]1/4[/SUP] (note that this usual rule of thumb is equivalent to LSF with two points c[SUB]0[/SUB] and c[SUB]n[/SUB], disregarding the wobbly coefficients in the middle. For two-term SNFS polys, e.g. 3x[SUP]6[/SUP]+1, of course this is the same).
(By LSF I meant linear regression; reverse-engineering my thoughts from Russian, "least squares fit".)
[QUOTE=R.D. Silverman;234035]Let me know when. I won't bother with the source code for creating the
factor base files. I will just send those as zipped binaries.[/QUOTE] You can email the zip files to me at [email]woltman@alum.mit.edu[/email] whenever they are ready. Or, if you'd rather, I can download them from an FTP site or a service like sendspace. [QUOTE] Add the other overhead, and my siever is about 2.2 times slower than GGNFS for this quartic.[/QUOTE] So if I understand this correctly, if we reduce the bucket sieve time to zero we are still a tad slower than GGNFS. Does GGNFS use the same sieving strategy (i.e. is our code really *that* much slower) or do we need to make some algorithmic changes too?
[QUOTE=Prime95;234040]You can email the zip files to me at [email]woltman@alum.mit.edu[/email] whenever they are ready. Or, if you'd rather, I can download them from an FTP site or a service like sendspace.
So if I understand this correctly, if we reduce the bucket sieve time to zero we are still a tad slower than GGNFS. Does GGNFS use the same sieving strategy (i.e. is our code really *that* much slower) or do we need to make some algorithmic changes too?[/QUOTE] The Pollard approach has inherently more overhead. I do not know what GGNFS uses for its sieve strategy. It is certain that it does not sieve over (e,f) for a prime p such that e.g. (0, -M) < e * v1 + f * v2 < (N, M). Doing this entails computing sieve boundaries, then figuring out how (e,f) stay inside the skewed parallelogram. GGNFS does not need to even compute the v1 and v2 vectors for each prime. Just computing the sieve start points and the reduced lattices takes about 3 seconds for 6 million primes. They may also have a more efficient technique for scanning the sieve array after sieving, looking for successful hits. I know how to reduce some of the overhead. But it requires additional memory.
[QUOTE=R.D. Silverman;233984]I don't see how it can do that. Suppose for a given special q, we are
sieving one ideal with norm p. The hits that p takes will be all over the (c,d) plane. How can one restrict to some subset of the plane?[/QUOTE] You could add a next pointer to each factor base entry, 'sort' them in place, and then proceed region by region. Each region picks the head of a linked list of updates, then proceeds FB entry by FB entry and fills in a narrow rectangle of (c,d) points; then each entry computes its 'next' (c,d) point (say in lexicographic order) and gets prepended to the bucket containing the next point. You get cache misses traversing the linked list but not for any other reason, and you can at least use prefetches for those. Msieve's line sieve works that way, but it only has to deal with one line at a time. Not to say that that's what the KF siever does; the only description of that code I could find was 'Continued Fractions and Lattice Sieving', a short paper by Franke and Kleinjung.
[QUOTE=jasonp;234043]You could add a next pointer to each factor base entry, 'sort' them in place,
[/QUOTE] You probably don't want to sort in place. Do an index sort; it involves less data movement. [QUOTE] and then proceed region by region. Each region picks the head of a linked list of updates, then proceeds FB entry by FB entry and fills in a narrow rectangle of (c,d) points, then each entry computes its 'next' (c,d) point (say in lexicographic order) and gets prepended to the bucket containing the next point. You get cache misses traversing the linked list but not for any other reason, [/QUOTE] Indeed. It is these misses that concern me. And the 'prepending' may cause misses as well. [QUOTE] Not to say that that's what the KF siever does; the only description of that code I could find was 'Continued Fractions and Lattice Sieving', a short paper by Franke and Kleinjung.[/QUOTE] I was asked to referee that paper. I wrote that it left out too much material, did not adequately explain many things, made some unsubstantiated claims, etc., and that it needed a MAJOR overhaul before publishing. It was first published in a venue known for slack editing. My suggestions were basically ignored, and it was published essentially as is. I complained to the editors. I got ignored.
[QUOTE=Prime95;234040]You can email the zip files to me at [email]woltman@alum.mit.edu[/email] whenever they are ready. Or, if you'd rather, I can download them from an FTP site or a service like sendspace.
So if I understand this correctly, if we reduce the bucket sieve time to zero we are still a tad slower than GGNFS. Does GGNFS use the same sieving strategy (i.e. is our code really *that* much slower) or do we need to make some algorithmic changes too?[/QUOTE] I have mailed the code along with a README file that contains a lot of info. GGNFS does indeed use a totally different sieving strategy. Bob