mersenneforum.org  

Old 2013-03-08, 01:03   #705
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1C35₁₆ Posts

Your quote says that George is right. "If the right operand is... greater than or equal to the length in bits of the promoted left operand, the result is undefined." Which is exactly what George said.
Old 2013-03-08, 01:10   #706
chalsall
If I May
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts

Quote:
Originally Posted by Dubslow View Post
Your quote says that George is right. "If the right operand is... greater than or equal to the length in bits of the promoted left operand, the result is undefined." Which is exactly what George said.
LOL... Never go up against George... You'll lose.
Old 2013-03-08, 03:15   #707
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

7²·197 Posts

@Chris: Haha, no coffee? Happens to me very often when I post before my morning coffee.
Old 2013-03-19, 23:51   #708
Bdot
 
Nov 2010
Germany

1125₈ Posts
GPU sieve update

Some progress at last:

After spending days working around an OpenCL compiler abort, I finally got something to work ... enough to get some idea of how it does on AMD cards. It still finds only 10% of the selftest factors, and a couple of quirks may still slow it down.

  • (almost) no CPU usage, and less than a 1% performance drop when running prime95 on all cores (I was curious about this because with the CPU-sieve mfakto version you have to keep one core idle)
  • ~100 GHz-days/day on an HD5770 (James lists the card at 75.6, but I run it at 150 using 3 CPU cores; all figures adjusted for default clock)
I'll see if I can test it on GCN tomorrow, and fix the missing factors.

Last fiddled with by Bdot on 2013-03-19 at 23:52 Reason: wording
Old 2013-03-19, 23:54   #709
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Quote:
Originally Posted by Bdot View Post
Some progress at last:

After spending days working around an OpenCL compiler abort, I finally got something to work ... enough to get some idea of how it does on AMD cards. It still finds only 10% of the selftest factors, and a couple of quirks may still slow it down.

  • (almost) no CPU usage, and less than a 1% performance drop when running prime95 on all cores (I was curious about this because with the CPU-sieve mfakto version you have to keep one core idle)
  • ~100 GHz-days/day on an HD5770 (James lists the card at 75.6, but I run it at 150 using 3 CPU cores; all figures adjusted for default clock)
I'll see if I can test it on GCN tomorrow, and fix the missing factors.
Very nice! As always, if you need testers...
Old 2013-03-21, 19:47   #710
Bdot
 
Nov 2010
Germany

3·199 Posts

Quote:
Originally Posted by kracker View Post
Very nice! As always, if you need testers...
Thanks, I'll certainly come back to that, after I've fixed the errors I found so far ...
The GPU sieving itself delivers the correct result, so either I have some mismatch with the number of threads, or the bit counting, or shared memory synchronization. I'll find it.

The GCN test on HD 7850 with the same version:
mfakto-GPU: 155 GHz-days/day,
James: 153 GHz-days/day,
mfakto-CPU: 180 GHz-days/day (3 CPU cores)
Old 2013-03-21, 20:08   #711
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³·271 Posts

Quote:
Originally Posted by Bdot View Post
Thanks, I'll certainly come back to that, after I've fixed the errors I found so far ...
The GPU sieving itself delivers the correct result, so either I have some mismatch with the number of threads, or the bit counting, or shared memory synchronization. I'll find it.

The GCN test on HD 7850 with the same version:
mfakto-GPU: 155 GHz-days/day,
James: 153 GHz-days/day,
mfakto-CPU: 180 GHz-days/day (3 CPU cores)
I see. Right now though... I have a CPU bottleneck (well, always had).
Attached thumbnail: sf.jpg (69.4 KB)
Old 2013-04-04, 14:41   #712
kracker
 
"Mr. Meeseeks"
Jan 2012
California, USA

2³×271 Posts

Hmm...

http://www.tomshardware.com/news/Ope...ore,21844.html

I have 3 2500's.
Old 2013-04-04, 15:40   #713
Bdot
 
Nov 2010
Germany

3·199 Posts
Beta (or alpha) testers for AMD GPU sieve?

Quote:
Originally Posted by kracker View Post
Hihi, let's see if that'll work ... later.

Before that, I'd like to announce that I'm getting close to a pre-pre-version of the GPU sieve on OpenCL. It has only one kernel (64-77 bit factor size) so far, a fixed vector size, and is barely functional (i.e. there is room for performance improvements).

I'm looking for AMD GPU owners who are willing to "waste" a few GHz-days by trying to rediscover a few factors in a complete run, as well as testing the available settings, finding optimal values, etc.

As of today, the GPU sieve missed only ~70 of the ~15000 factors I gave it in an extended self-test. Barely enough misses to hide "the only remaining bug". I hope to fix that by the weekend, and would then send out the prototype.

If you're willing to join, please let me know the GPU and OS you need it for, as well as your email address (PM accepted).

Thanks for your help,
Bdot

Last fiddled with by Bdot on 2013-04-04 at 15:43 Reason: and email
Old 2013-04-04, 15:57   #714
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

1001011010010₂ Posts

Quote:
Originally Posted by kracker View Post
Still nothing for Linux...

Luigi
Old 2013-04-07, 15:31   #715
Bdot
 
Nov 2010
Germany

3·199 Posts
mfakto 0.13-pre3 being tested

Finally, the "very last" bug turned out to be in the GPU sieving code itself. I almost issued a warning for mfaktc, but the bug was a self-made one, introduced in my attempt to imitate the CUDA 64-bit shifts, something like this:

mask = i67 > 64 ? 0 : ((ulong) 1 << i67);

So no problem for mfaktc found during my porting efforts. I'm just happy we have enough test cases so that this one was discovered.
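For readers following the shift discussion: read literally, the guard `i67 > 64` in the line quoted above still lets a count of exactly 64 reach the `<<`, which is the case the rule quoted earlier in the thread calls undefined in C. Below is a minimal sketch in plain C, not the actual mfakto kernel code. It assumes the intent was a single-bit mask that becomes 0 once the count reaches the 64-bit width; `guarded_bit` is a hypothetical name, and `uint64_t` stands in for OpenCL's `ulong`.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from mfakto: a single-bit mask for bit
 * position n that becomes 0 once n reaches the 64-bit width. */
static uint64_t guarded_bit(unsigned int n)
{
    /* The comparison is ">= 64", not "> 64": a count of exactly 64 must
     * never reach the shift, because shifting a 64-bit value by 64 or
     * more is undefined in C. */
    return n >= 64 ? 0 : ((uint64_t)1 << n);
}
```

With this guard, `guarded_bit(63)` has only the top bit set, while `guarded_bit(64)` and `guarded_bit(67)` are both 0. As far as I know, OpenCL C defines an oversized shift by using only the low bits of the count, so there a count of 64 would act like 0 and yield 1 rather than 0; that would make an off-by-one in such a guard a silent wrong result rather than a crash.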

Now that everything is working for one kernel, I'll start porting the others. I'll also check out a few alternative implementations for performance.

Let's see what feedback I receive from the testers. If it is already worth releasing, I may move the optimizations to a later version.
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.