Your quote says that George is right. "If the right operand is... greater than or equal to the length in bits of the promoted left operand, the result is undefined." Which is exactly what George said.
|
[QUOTE=Dubslow;332342]Your quote says that George is right. "If the right operand is... greater than or equal to the length in bits of the promoted left operand, the result is undefined." Which is exactly what George said.[/QUOTE]
LOL... Never go up against George... You'll lose. :smile: |
@Chriss: Haha, no coffee? Happens to me very often when I post before my morning coffee :razz:
|
GPU sieve update
some progress at last:
After spending days working around an OpenCL compiler abort, I finally got something to work ... enough to get some idea of how it performs on AMD cards. It still finds only 10% of the selftest factors, and a couple of quirks may still slow it down. [LIST][*](almost) no CPU usage, and less than 1% performance drop when running prime95 on all cores (I was curious about this because with the CPU-sieve mfakto version you have to keep one core idle)[*]~100 GHz-days/day on HD5770 (James lists the card at 75.6, but I run it at 150 using 3 CPU cores) (all adjusted for default-clock).[/LIST]I'll see if I can test it on GCN tomorrow. And fix the missing factors :smile: |
[QUOTE=Bdot;334103]some progress at last:
After spending days working around an OpenCL compiler abort, I finally got something to work ... enough to get some idea of how it performs on AMD cards. It still finds only 10% of the selftest factors, and a couple of quirks may still slow it down. [LIST][*](almost) no CPU usage, and less than 1% performance drop when running prime95 on all cores (I was curious about this because with the CPU-sieve mfakto version you have to keep one core idle)[*]~100 GHz-days/day on HD5770 (James lists the card at 75.6, but I run it at 150 using 3 CPU cores) (all adjusted for default-clock).[/LIST]I'll see if I can test it on GCN tomorrow. And fix the missing factors :smile:[/QUOTE] :toot: Very nice! As always, if you need testers... :smile: |
[QUOTE=kracker;334105]:toot: Very nice! As always, if you need testers... :smile:[/QUOTE]
Thanks, I'll certainly come back to that after I've fixed the errors I've found so far ... The GPU sieving itself delivers the correct result, so I either have some mismatch in the number of threads, or in the bit counting, or in the shared memory synchronization. I'll find it. The GCN test on HD 7850 with the same version: mfakto-GPU: 155 GHz-days/day, [URL="http://www.mersenne.ca/mfaktc.php"]James[/URL]: 153 GHz-days/day, mfakto-CPU: 180 GHz-days/day (3 CPU cores) |
[QUOTE=Bdot;334396]Thanks, I'll certainly come back to that after I've fixed the errors I've found so far ...
The GPU sieving itself delivers the correct result, so I either have some mismatch in the number of threads, or in the bit counting, or in the shared memory synchronization. I'll find it. The GCN test on HD 7850 with the same version: mfakto-GPU: 155 GHz-days/day, [URL="http://www.mersenne.ca/mfaktc.php"]James[/URL]: 153 GHz-days/day, mfakto-CPU: 180 GHz-days/day (3 CPU cores)[/QUOTE] I see. Right now though... I have a CPU bottleneck (well, I always had) :bangheadonwall: |
Hmm...
[URL="http://www.tomshardware.com/news/OpenCL-Intel-HD-Graphics-Core,21844.html"]http://www.tomshardware.com/news/OpenCL-Intel-HD-Graphics-Core,21844.html[/URL] I have 3 2500's. |
Beta (or alpha) testers for AMD GPU sieve?
[QUOTE=kracker;336075]Hmm...
[URL]http://www.tomshardware.com/news/OpenCL-Intel-HD-Graphics-Core,21844.html[/URL] I have 3 2500's.[/QUOTE] Hihi, let's see if that'll work ... later. Before that, I'd like to announce that I'm getting close to a pre-pre-version of the GPU sieve on OpenCL. Only one kernel (64-77 bit factor size) so far, with a fixed vector size, and barely functional (i.e. with room for performance improvements). I'm looking for AMD GPU owners who are willing to "waste" a few GHz-days by trying to rediscover a few factors in a complete run, as well as testing the available settings, finding optimal values etc. As of today, the GPU sieve missed only ~70 of the ~15000 factors I gave it in an extended self-test. Barely enough misses to hide "the only remaining bug" :smile:. I hope to fix that by the weekend, and would then send out the prototype. If you're willing to join, please let me know the GPU and OS you need it for as well as your email address (PM accepted :smile:). Thanks for your help, Bdot |
[QUOTE=kracker;336075]Hmm...
[URL="http://www.tomshardware.com/news/OpenCL-Intel-HD-Graphics-Core,21844.html"]http://www.tomshardware.com/news/OpenCL-Intel-HD-Graphics-Core,21844.html[/URL] I have 3 2500's.[/QUOTE] Still nothing for Linux... :sad: Luigi |
mfakto 0.13-pre3 being tested
Finally, the "very last" bug was in the GPU sieving code itself. I almost issued a warning for mfaktc, but it turned out to be another self-made bug, from my attempt to imitate the CUDA 64-bit shifts, something like this:
mask = i67 > 64 ? 0 : ((ulong) 1 << i67); :redface: So no problem for mfaktc was found during my porting efforts. I'm just happy we have enough test cases so that this one was discovered. Now that everything is working for one kernel, I'll start porting the others. And I'll check out a few alternative implementations for performance. Let's see what feedback I receive from the testers. In case it is already worth releasing, I may move the optimizations to a later version. |
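For readers following along, here is a minimal sketch of the off-by-one in that guard, written in plain C rather than OpenCL C. The function names and the `shift` parameter are made up for illustration and are not taken from mfakto. The guard `> 64` lets a shift count of exactly 64 through, which is the case discussed at the top of this thread: in C, shifting a 64-bit value by 64 or more is undefined. As far as I know, OpenCL C instead reduces the shift count modulo the bit width, so the mask would come out as 1 rather than 0. Either way the comparison presumably needs to be `>= 64`.

```c
#include <assert.h>
#include <stdint.h>

/* Guard as shown in the post (illustrative name): a count of exactly 64
   reaches the shift. Never call this with shift == 64 in plain C, because
   shifting a 64-bit value by 64 is undefined behaviour. */
static inline uint64_t bit_mask_buggy(unsigned shift)
{
    return shift > 64 ? 0 : ((uint64_t)1 << shift);
}

/* Assumed fix: every count of 64 or more yields an empty mask. */
static inline uint64_t bit_mask(unsigned shift)
{
    return shift >= 64 ? 0 : ((uint64_t)1 << shift);
}

/* Small self-check; both versions agree everywhere except at shift == 64. */
static inline void self_check(void)
{
    assert(bit_mask(0) == 1);
    assert(bit_mask(63) == (UINT64_C(1) << 63));
    assert(bit_mask(64) == 0);   /* the case the original guard missed */
    assert(bit_mask(67) == 0);
    assert(bit_mask_buggy(63) == bit_mask(63));
    assert(bit_mask_buggy(65) == 0);
}
```

Calling `self_check()` from a `main` function exercises the boundary values. A bug like this only shows up when the count is exactly 64, which would fit with only ~70 of ~15000 factors being missed.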