mersenneforum.org Leyland Primes (x^y+y^x primes)
 Register FAQ Search Today's Posts Mark Forums Read

2014-06-03, 09:12   #34
XYYXF

Jan 2005
Minsk, Belarus

24×52 Posts

Quote:
 Originally Posted by rogue Multisieve also supports x^y-y^x and for that to be > 0, x must be less than y.
I've always used version 5.5 which requires x > y :)

http://www.primefan.ru/stuff/soft/ms.zip

2014-06-03, 11:53   #35
rogue

"Mark"
Apr 2003
Between here and the

22×3×523 Posts

Quote:
 Originally Posted by XYYXF I've always used version 5.5 which requires x > y :) http://www.primefan.ru/stuff/soft/ms.zip
That was before I was asked to modify MultiSieve to support x^y - y^x.

 2014-06-12, 17:31 #36 rogue     "Mark" Apr 2003 Between here and the 22·3·523 Posts With both MultiSieve and its replacement for x^y+/-y^x, the more squarelike the search area, the more efficient the software will be. The new software will be even more efficient based upon squareness of the search area. By "squareness" I mean xmax - xmin is close to ymax - ymin. This is because the software has to calculate both x^y mod p and y^x mod p. The software gets its efficiency by computing x^miny mod p for each x and y^minx mod p for each y then multiplying to get the next term. The software won't prohibit a flat rectangle, i.e. minx = maxx or miny = maxy, but it won't be very fast.
 2014-06-13, 12:04 #37 rogue     "Mark" Apr 2003 Between here and the 22×3×523 Posts Nevermind my previous post. I was looking at what I was working on and realized that it couldn't easily be done in a GPU because of the memory requirements. As I turned the problem over in my head (more than a few times) I realized that my approach to the sieve was inefficient. Although more efficient than MultiSieve, there was an obvious optimization that I completely missed. The good thing is that this optimization also simplifies the code (code I haven't written yet) quite a bit.
 2014-06-13, 19:09 #38 rogue     "Mark" Apr 2003 Between here and the 22·3·523 Posts And due to that simplification the code is now written. Now I have the fun of testing it...
 2014-06-13, 19:29 #39 Batalov     "Serge" Mar 2008 Phi(4,2^7658614+1)/2 24AC16 Posts I will be glad to help you beta-test. As you can see from my interest for these (which is predating the creation of this forum) that I have a soft spot for these PRPs (as, undoubtedly, also does Selevich). The real burden of course is the PRP testing, but an improved sieve could lift maybe 1/3 of that weight - then, that's great!
2014-06-15, 13:36   #40
rogue

"Mark"
Apr 2003
Between here and the

188416 Posts

Quote:
 Originally Posted by rogue Nevermind my previous post. I was looking at what I was working on and realized that it couldn't easily be done in a GPU because of the memory requirements. As I turned the problem over in my head (more than a few times) I realized that my approach to the sieve was inefficient. Although more efficient than MultiSieve, there was an obvious optimization that I completely missed. The good thing is that this optimization also simplifies the code (code I haven't written yet) quite a bit.
That optimization was a fail (and I know why). So I'm focused on getting the code at least equivalent to MultiSieve and then focus on optimizations later. That being said it shouldn't be too difficult to make that happen.

 2014-06-30, 12:42 #41 rogue     "Mark" Apr 2003 Between here and the 627610 Posts Almost ready for release. I have to do some more testing. On my laptop (which has a slow GPU), it is only about 2x as fast as MultiSieve. I would expect a desktop with a bigger GPU to be much faster.
2014-07-07, 14:00   #42
rogue

"Mark"
Apr 2003
Between here and the

22×3×523 Posts

Here is a 64-bit beta (Windows). I have good news and bad news about this beta. First the bad. It appears that MultiSieve removes terms for which it does not have a factor. I've looked at the code and have not been able to figure out where that is occurring. This means that ranges sieved in the past should be sieved again presuming I can't track down why it is doing it, which I have little desire to do.

Now the good. The attached program is about 4x faster than MultiSieve on my laptop (for the range below), which has an NVIDIA NVS 5200M. The performance is somewhat based upon the range of y. The larger the range of y, the better the performance.

Here are the runtime options:

Code:
E:\stuff\sieves\xyyxsieve_1.0.0\visualstudio>xyyxsievecl64 -h
-h --help          Print this help
-v --verbose       Verbose output (memory usage)
-q --quiet         Quiet output (no banner or stats)
-p --pmin=P0       Sieve start: 3 <= P0 <= p (default P0=3)
-P --pmax=P1       Sieve end: p < P1 <= 2^62 (default P1=P0+10^9)
-l --list          List available platforms and devices
-b --blocks=B      Force B blocks per device
-f --platform=F    Use platform F instead of 0
-d --device=D      Use device D instead of 0
-x --min_x=x          Minimum x to search
-X --max_x=X          Maximum x to search
-y --min_y=y          Minimum y to search
-Y --max_y=Y          Maximum y to search
-s --step=s           steps iterated per call to GPU
-f --factorsinrate=f  Number of factors to use when computing removal rate (default 50)
-m --minimum X        Do not report factors smaller than X (default 100000)
-i --inputfile=i      PFGW file in ABC format
-o --outputfile=o     PFGW file of remaining candidates in ABC format
-S --sign=+/-/b       sign to sieve for
Here is an command line example: xyyxsievecl64 -x100 -X200 -y3000 -Y4000 -P2e6 -t2 -s3000 -S+ -b32 -oxyyx.pfgw

You will want to vary options t, s, and b to maximize the performance of your machine.

I'll provide source later as this can be built for Linux and OS X.
Attached Files
 xyyxsievecl64.zip (50.3 KB, 160 views)

Last fiddled with by rogue on 2014-07-07 at 14:02

 2014-07-07, 14:41 #43 wombatman I moo ablest echo power!     May 2013 1,741 Posts Runs great on a GTX570, but I did encounter an error as documented below: Code: $xyyxsievecl64.exe -x100 -X200 -y3000 -Y10000 -P2e6 -t2 -s3000 -S+ -oxyyx.pfgw xyyxsievecl v1.0.0, a GPU program to find factors numbers of the form x^y+y^x Sieve started: (cmdline) 0 <= p < 2000000 with 287349 terms OpenCL Error: Out of resources in call to clEnqueueReadBuffer argument: counter My GPU has ~1.2GB of RAM and usually about 1GB actually available.  2014-07-07, 15:05 #44 Batalov "Serge" Mar 2008 Phi(4,2^7658614+1)/2 22×2,347 Posts Option -f (platform and factorsinrate) is used twice, and doesn't seem to come through (as --platform=1) Code: $ ./xyyxsievecl64.exe --platform=1 -d0 -x100 -X200 -y3000 -Y4000 -P2e6 -t2 -s3000 -S+ -b32 -oxyyx.pfgw xyyxsievecl v1.0.0, a GPU program to find factors numbers of the form x^y+y^x List of available platforms and devices Platform 0 has no available devices. Here is a list of platforms and devices: Platform 0 is a Advanced Micro Devices, Inc. AMD Accelerated Parallel Processing, version OpenCL 1.2 AMD-APP (938.2) No devices Platform 1 is a NVIDIA Corporation NVIDIA CUDA, version OpenCL 1.1 CUDA 6.0.1 Device 0 is a NVIDIA Corporation GeForce GTX 570

 Similar Threads Thread Thread Starter Forum Replies Last Post Batalov XYYXF Project 16 2019-08-04 00:32 carpetpool Miscellaneous Math 3 2017-08-10 13:47 emily Math 34 2017-07-16 18:44 davar55 Puzzles 9 2016-03-15 20:55 troels munkner Miscellaneous Math 4 2006-06-02 08:35

All times are UTC. The time now is 04:35.

Mon Apr 12 04:35:46 UTC 2021 up 3 days, 23:16, 1 user, load averages: 1.47, 1.67, 1.89