20140603, 09:12  #34  
Jan 2005
Minsk, Belarus
620_{8} Posts 
Quote:
http://www.primefan.ru/stuff/soft/ms.zip 

20140603, 11:53  #35  
"Mark"
Apr 2003
Between here and the
2^{4}·3·5·29 Posts 
Quote:


20140612, 17:31  #36 
"Mark"
Apr 2003
Between here and the
2^{4}×3×5×29 Posts 
With both MultiSieve and its replacement for x^y+/y^x, the more squarelike the search area, the more efficient the software will be. The new software will be even more efficient based upon squareness of the search area. By "squareness" I mean xmax  xmin is close to ymax  ymin. This is because the software has to calculate both x^y mod p and y^x mod p. The software gets its efficiency by computing x^miny mod p for each x and y^minx mod p for each y then multiplying to get the next term. The software won't prohibit a flat rectangle, i.e. minx = maxx or miny = maxy, but it won't be very fast.

20140613, 12:04  #37 
"Mark"
Apr 2003
Between here and the
2^{4}·3·5·29 Posts 
Nevermind my previous post.
I was looking at what I was working on and realized that it couldn't easily be done in a GPU because of the memory requirements. As I turned the problem over in my head (more than a few times) I realized that my approach to the sieve was inefficient. Although more efficient than MultiSieve, there was an obvious optimization that I completely missed. The good thing is that this optimization also simplifies the code (code I haven't written yet) quite a bit. 
20140613, 19:09  #38 
"Mark"
Apr 2003
Between here and the
2^{4}·3·5·29 Posts 
And due to that simplification the code is now written. Now I have the fun of testing it...

20140613, 19:29  #39 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
19×23^{2} Posts 
I will be glad to help you betatest.
As you can see from my interest for these (which is predating the creation of this forum) that I have a soft spot for these PRPs (as, undoubtedly, also does Selevich). The real burden of course is the PRP testing, but an improved sieve could lift maybe 1/3 of that weight  then, that's great! 
20140615, 13:36  #40  
"Mark"
Apr 2003
Between here and the
2^{4}×3×5×29 Posts 
Quote:


20140630, 12:42  #41 
"Mark"
Apr 2003
Between here and the
15460_{8} Posts 
Almost ready for release. I have to do some more testing. On my laptop (which has a slow GPU), it is only about 2x as fast as MultiSieve. I would expect a desktop with a bigger GPU to be much faster.

20140707, 14:00  #42 
"Mark"
Apr 2003
Between here and the
1B30_{16} Posts 
Here is a 64bit beta (Windows). I have good news and bad news about this beta. First the bad. It appears that MultiSieve removes terms for which it does not have a factor. I've looked at the code and have not been able to figure out where that is occurring. This means that ranges sieved in the past should be sieved again presuming I can't track down why it is doing it, which I have little desire to do.
Now the good. The attached program is about 4x faster than MultiSieve on my laptop (for the range below), which has an NVIDIA NVS 5200M. The performance is somewhat based upon the range of y. The larger the range of y, the better the performance. Here are the runtime options: Code:
E:\stuff\sieves\xyyxsieve_1.0.0\visualstudio>xyyxsievecl64 h h help Print this help v verbose Verbose output (memory usage) q quiet Quiet output (no banner or stats) p pmin=P0 Sieve start: 3 <= P0 <= p (default P0=3) P pmax=P1 Sieve end: p < P1 <= 2^62 (default P1=P0+10^9) t nthreads=N Start N threads (default N=1) l list List available platforms and devices b blocks=B Force B blocks per device f platform=F Use platform F instead of 0 d device=D Use device D instead of 0 x min_x=x Minimum x to search X max_x=X Maximum x to search y min_y=y Minimum y to search Y max_y=Y Maximum y to search s step=s steps iterated per call to GPU f factorsinrate=f Number of factors to use when computing removal rate (default 50) m minimum X Do not report factors smaller than X (default 100000) i inputfile=i PFGW file in ABC format o outputfile=o PFGW file of remaining candidates in ABC format S sign=+//b sign to sieve for You will want to vary options t, s, and b to maximize the performance of your machine. I'll provide source later as this can be built for Linux and OS X. Last fiddled with by rogue on 20140707 at 14:02 
20140707, 14:41  #43 
I moo ablest echo power!
May 2013
73C_{16} Posts 
Runs great on a GTX570, but I did encounter an error as documented below:
Code:
$ xyyxsievecl64.exe x100 X200 y3000 Y10000 P2e6 t2 s3000 S+ oxyyx.pfgw xyyxsievecl v1.0.0, a GPU program to find factors numbers of the form x^y+y^x Sieve started: (cmdline) 0 <= p < 2000000 with 287349 terms OpenCL Error: Out of resources in call to clEnqueueReadBuffer argument: counter 
20140707, 15:05  #44 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2
23503_{8} Posts 
Option f (platform and factorsinrate) is used twice, and doesn't seem to come through (as platform=1)
Code:
$ ./xyyxsievecl64.exe platform=1 d0 x100 X200 y3000 Y4000 P2e6 t2 s3000 S+ b32 oxyyx.pfgw xyyxsievecl v1.0.0, a GPU program to find factors numbers of the form x^y+y^x List of available platforms and devices Platform 0 has no available devices. Here is a list of platforms and devices: Platform 0 is a Advanced Micro Devices, Inc. AMD Accelerated Parallel Processing, version OpenCL 1.2 AMDAPP (938.2) No devices Platform 1 is a NVIDIA Corporation NVIDIA CUDA, version OpenCL 1.1 CUDA 6.0.1 Device 0 is a NVIDIA Corporation GeForce GTX 570 
Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Distribution of Mersenne primes before and after couples of primes found  emily  Math  35  20221221 16:32 
Mersenne Primes p which are in a set of twin primes is finite?  carpetpool  Miscellaneous Math  4  20220714 02:29 
Leyland Primes: ECPP proofs  Batalov  XYYXF Project  57  20220630 17:24 
On Leyland Primes  davar55  Puzzles  9  20160315 20:55 
possible primes (real primes & poss.prime products)  troels munkner  Miscellaneous Math  4  20060602 08:35 