![]() |
|
|
#221 | |
|
Jan 2007
.de
228 Posts |
Quote:
|
|
|
|
|
|
|
#222 |
|
Mar 2004
3×127 Posts |
I see you intend to sieve 20000 n parallel, but only upt to 10000000. (a total range of 200G). According to this thred this would take 20000 times as long as one individual n.
Did you (or the sieving tool) taks this property into consideration: Group for example 10 .. 20 n together by moving the exponent to the mantissa. For example: mantissa exponent 1000001 480000 1000001 480001 1000001 480002 ... 1000001 480015 1000001 480016 convert to: 1000001 480000 2000002 480000 4000004 480000 8000008 480000 ... 32768032768 480000 65536065536 480000 In that case you need to sieve up to 20 times less N, depending how large the mantissa can be. So the sieving is onlz 1000 times slower instead of 20000 times. Sieving would neither take longer nor take more memory in case you use hashtables (recommended anyway because of the large ranges) Of course, for LLR you can convert back the candidtes that you can still benefit of small mantissa. |
|
|
|
|
|
#223 |
|
Mar 2006
Germany
2×1,531 Posts |
no need for that, LLR will reduce automatically from 65536065536*2^480000-1 to 1000001*2^480016-1!
Last fiddled with by kar_bon on 2009-08-11 at 16:40 |
|
|
|
|
|
#224 |
|
Jan 2005
Caught in a sieve
5×79 Posts |
OK, it's time for the good, the bad, and the ugly!
First, the slightly bad news: I haven't figured out how to compile a 64-bit Windows version. Anyone's welcome to try; but I don't know if Geoff would prefer to do it himself? Second, the really good news, for 32-bit users on processors with SSE2. I made an SSE2-enabled 32-bit app that's about 75% as fast as the 64-bit app, for many-N sieves. That's about three times as fast as the regular 32-bit app! (I haven't checked which is faster for single-N sieves; but there shouldn't be much difference.) My zipfile now includes both 32-bit versions for both Windows and Linux, and a 64-bit Linux version. Finally, the ugly - which is what coding biwema's suggestion would be. Right now, checking the second N is about 60-80 times as fast as the first N, because all one has to do is divide the result for the first N by 2 (mod P). This takes two short lines of code. To do what biwema suggests would first require checking whether even K's between kmax and, say, 65536*kmax, have enough 0's on the right that shifting right would put the K between kmin and kmax. This is unlikely, so it would probably require counting the 0's individually. This is pretty much as slow as doing the above divide, unless there's a processor-specific instruction to count right 0's (there may be in SSE4 or something.) Even assuming you get that working quickly, you still have to divide by 2^16 (mod P). Of course the way to do this is to multiply by 2^-16 (mod P). I'm sure this would never be faster than the new SSE2 code in 32 bits, and it's not all that fast even with 64 bits. - Edit: strike that, that's a multiply, not an exponentiation. It's still slow, but not that slow. Still, I'm not planning to try this soon. But if anyone's willing to try it, please, be my guest.
Last fiddled with by Ken_g6 on 2009-08-11 at 21:14 |
|
|
|
|
|
#225 | |
|
Jun 2003
23×683 Posts |
Quote:
I am testing this one as soon as I get back from work.Biwema assumed that sieving 20000n in parallel is 20000x slower. But we know that that is not true. So the evaluation needs to be adjusted accordingly. |
|
|
|
|
|
|
#226 | |
|
Jan 2007
.de
2×32 Posts |
Quote:
Code:
gcc -Wall -O3 -DNDEBUG -D_REENTRANT -m64 -I. -I.. -s -o tpsieve-x86_64-windows.exe ..\main.c ..\sieve.c ..\clock.c ..\util.c app.c -lm -lpthread ..\main.c: In function 'thread_fun': ..\main.c:529: warning: cast from pointer to integer of different size ..\main.c: In function 'thread_fun_wrapper': ..\main.c:653: warning: cast from pointer to integer of different size ..\main.c: In function 'main': ..\main.c:755: warning: cast to pointer from integer of different size |
|
|
|
|
|
|
#227 |
|
Dec 2006
Anchorage, Alaska
2×3×13 Posts |
I'm not savvy with the tpseive as of yet (never even used it), but which is quickest? I would assume linux x64 is faster than win32?
|
|
|
|
|
|
#228 | |
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
11×389 Posts |
Quote:
![]() So the 32-bit version was only 25% the speed of the 64-bit version? (from: "75% as fast as the 64-bit app" and "three times as fast as the regular 32-bit app") I thought 32-bit versions of math programs, in general, were at worst 50% the speed of the 64-bit versions? |
|
|
|
|
|
|
#229 | |
|
Jan 2005
Caught in a sieve
5×79 Posts |
pschoefer: I get those warnings every time too, compiling for 64-bit Linux. It doesn't seem to be a problem. You can expect to find the same factors, but occasionally they'll be in a different order.
Skligmund: Linux vs. Windows shouldn't matter. (In theory, that is; no guarantees.) x64 > SSE2 > 32 bit. Quote:
That's the way it is. I don't think there's really a hard and fast rule. Maybe that only applies to 32-bit apps with SSE2?
|
|
|
|
|
|
|
#230 |
|
Jun 2003
23·683 Posts |
|
|
|
|
|
|
#231 | |
|
Account Deleted
"Tim Sorbera"
Aug 2006
San Antonio, TX USA
11·389 Posts |
Quote:
|
|
|
|
|
![]() |
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| S9 and general sieving discussion | Lennart | Conjectures 'R Us | 31 | 2014-09-14 15:14 |
| Sieving discussion thread | philmoore | Five or Bust - The Dual Sierpinski Problem | 66 | 2010-02-10 14:34 |
| Combined sieving discussion | ltd | Prime Sierpinski Project | 76 | 2008-07-25 11:44 |
| Sieving Discussion | ltd | Prime Sierpinski Project | 26 | 2005-11-01 07:45 |
| Sieving Discussion | R.D. Silverman | Factoring | 7 | 2005-09-30 12:57 |