mersenneforum.org  

Go Back   mersenneforum.org > Factoring Projects > XYYXF Project

Reply
 
Thread Tools
Old 2014-07-07, 15:17   #45
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

10110110110102 Posts
Default

Quote:
Originally Posted by wombatman View Post
Runs great on a GTX570, but I did encounter an error as documented below:
Code:
$ xyyxsievecl64.exe -x100 -X200 -y3000 -Y10000 -P2e6 -t2 -s3000 -S+ -oxyyx.pfgw

xyyxsievecl v1.0.0, a GPU program to find factors numbers of the form x^y+y^x
Sieve started: (cmdline) 0 <= p < 2000000 with 287349 terms
OpenCL Error: Out of resources
       in call to clEnqueueReadBuffer
       argument: counter
My GPU has ~1.2GB of RAM and usually about 1GB actually available.
I have seen that as well. I haven't figured it out yet. The program doesn't use that much global GPU memory so I suspect it is a problem with local GPU memory. I have a table of powers for each kernel. I'll reduce the size of that table as it has too many zeros in it. Hopefully that will help. One other issue is that the number of factors returned might be causing problems. It could be overflowing in the GPU. I'm not certain how to avoid that without losing useful information.

As for two "-f" options, I modified factorsinrate to use -F.

If anyone can provide some speed comparisons with MultiSieve that would be nice.

I'm also working on a change to pfgw so that the factors output from sieving programs can be easily verified. I though I had done something like that already, but maybe I never released that code.

Last fiddled with by rogue on 2014-07-07 at 15:43
rogue is offline   Reply With Quote
Old 2014-07-07, 17:38   #46
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

2·32·52·13 Posts
Default

The "Out of resources" seems to have been caused by having too many small factors. I'll do two things to address that. First, I'll increase the buffer size so that more factors can be collected. (I've also added a check so the program can tell you if you missed factors due to too many being found.) Second, I've added a "small primes" check to eliminate all terms that are factored by p < 100 before I start the GPU sieve. That can eliminate 75% of the terms up front.
rogue is offline   Reply With Quote
Old 2014-07-07, 18:36   #47
wombatman
I moo ablest echo power!
 
wombatman's Avatar
 
May 2013

32·193 Posts
Default

Excellent!
wombatman is offline   Reply With Quote
Old 2014-07-07, 19:55   #48
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

2·32·52·13 Posts
Default

Quote:
Originally Posted by wombatman View Post
Excellent!
By "Excellent!", do you mean this:
Attached Thumbnails
Click image for larger version

Name:	mrburns.jpg
Views:	61
Size:	18.2 KB
ID:	11439  
rogue is offline   Reply With Quote
Old 2014-07-07, 19:56   #49
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

16DA16 Posts
Default

Quote:
Originally Posted by wombatman View Post
Excellent!
or this?
Attached Thumbnails
Click image for larger version

Name:	billnted.jpg
Views:	65
Size:	17.4 KB
ID:	11440  
rogue is offline   Reply With Quote
Old 2014-07-07, 20:12   #50
wombatman
I moo ablest echo power!
 
wombatman's Avatar
 
May 2013

32·193 Posts
Default

Why not both?
wombatman is offline   Reply With Quote
Old 2014-07-07, 20:47   #51
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

10110110110102 Posts
Default

Attached is the latest code. Here is what the output looks like now. This is a rather extreme range to do. It still fails if the range gets larger. I know why, but fixing is a bigger challenge. It is probably better to take smaller ranges and paste together one file and restart from there, although I haven't test the ability to restart the sieve from the output file.

Code:
xyyxsievecl64.exe -x2 -X2500 -y3 -Y7500 -P1e6 -t2 -s2000 -S+ -oxyyx.pfgw
xyyxsievecl v1.0.1, a GPU program to find factors numbers of the form x^y+y^x
Quick elimination of terms info (in order of check):
    7807501 because the term is even
    1478788 because x and y have a common divisor
    4472597 because the term is divisible by a prime < 100
Sieve started: (cmdline) 0 <= p < 1000000 with 1857365 terms
Thread 2 has completed 1818380 of 1862363 iterations
CTRL-C accepted.  Please wait for threads to completed.s/factor, 0.15 CPU cores, 25.6% done. ETA 07 Jul 15:53

Thread 2 has completed 383059 of 390048 iterations
Sieve interrupted: 3 <= p < 1000000  26624 primes tested
Clock time: 539.27 seconds at at 49 p/sec.  Factors found: 1472315
Processor time: 92.09 sec. (8.53 init + 83.55 sieve).
Seconds spent in CPU and GPU: 65.70 (cpu), 1044.80 (gpu)
Percent of time spent in CPU vs. GPU: 5.92 (cpu), 94.08 (gpu)
CPU/GPU utilization: 0.17 (cores), 1.00 (devices)
Percent of GPU time waiting for GPU: 46.87
Started with 1857365 terms and sieved to 281719.  385050 remaining terms written to xyyx.pfgw
Attached Files
File Type: zip xyyxsievecl64.zip (51.2 KB, 62 views)
rogue is offline   Reply With Quote
Old 2014-07-07, 20:56   #52
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

2·32·52·13 Posts
Default

I think I just found another "Out of resources" trigger. Working on it.
rogue is offline   Reply With Quote
Old 2014-07-07, 22:05   #53
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

10110110110102 Posts
Default

I fixed that one, but I still have a problem elsewhere. It is only an issue when trying to do obscenely large search spaces. I tried to implement different logic to collect factors, but my video card does not support the OpenCL functcion atom_min.
rogue is offline   Reply With Quote
Old 2014-07-07, 23:44   #54
wombatman
I moo ablest echo power!
 
wombatman's Avatar
 
May 2013

32·193 Posts
Default

No problem. Like you said, smaller ranges work fine. I just wanted to point it out in case it was a larger issue than that. Very impressive work!
wombatman is offline   Reply With Quote
Old 2014-07-08, 12:04   #55
rogue
 
rogue's Avatar
 
"Mark"
Apr 2003
Between here and the

585010 Posts
Default

Thanks. This was a very difficult program to optimize. If the range of x is large you think of optimizing in one direction. If the range of y is large, then you think of optimizing in a different direction. If both are large, then you think of a way to get the best of both worlds. Unfortunately memory and latency quickly get in the way of most potential optimizations.

Due to the bug in MultiSieve, I'm going to sieve x^y+y^x for x and y < 10000. I will also update PRPNet to support an x^y+y^x search. I don't know if I'll be able to run every term (as there will probably be close to a million after sieving), but since all of them are small it might be doable.
rogue is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Leyland Primes: ECPP proofs Batalov XYYXF Project 16 2019-08-04 00:32
Mersenne Primes p which are in a set of twin primes is finite? carpetpool Miscellaneous Math 3 2017-08-10 13:47
Distribution of Mersenne primes before and after couples of primes found emily Math 34 2017-07-16 18:44
On Leyland Primes davar55 Puzzles 9 2016-03-15 20:55
possible primes (real primes & poss.prime products) troels munkner Miscellaneous Math 4 2006-06-02 08:35

All times are UTC. The time now is 18:22.

Sat Aug 8 18:22:55 UTC 2020 up 22 days, 14:09, 1 user, load averages: 1.43, 1.50, 1.54

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2020, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.