#34
Nov 2010
Germany
3×199 Posts
Cool, we already have everything prepared; only the OpenCL code does not suit the Intel compiler.
Please use the two attached barrett files to replace the ones in the mfakto folder. I hope I added enough type casts to satisfy the Intel compiler. If not, feel free to add more yourself (to reduce the turnaround times).
Once we have that running, it might be useful to check whether that platform runs mad24() at least as fast as mad(). I know that NVIDIA needs to add extra operations to mask out the upper 8 bits of the operands ...
Last fiddled with by Bdot on 2013-10-03 at 09:28 Reason: typo
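For readers unfamiliar with the masking mentioned above, here is a minimal Python sketch of what a 24-bit multiply-add computes when the upper 8 bits of each factor have to be masked out explicitly. It is not taken from mfakto: the names `emulated_mad24` and `plain_mad32` are hypothetical, the sketch only covers unsigned 32-bit values, and OpenCL leaves the result of mad24() implementation-defined when the operands do not fit in 24 bits, so the masking shown is just one possible behaviour.

```python
MASK24 = (1 << 24) - 1  # low 24 bits
MASK32 = (1 << 32) - 1  # wrap results to 32 bits, like an OpenCL uint


def emulated_mad24(a, b, c):
    """Multiply the low 24 bits of a and b, add c, keep the low 32 bits."""
    product = (a & MASK24) * (b & MASK24)  # the two extra masking operations
    return (product + c) & MASK32


def plain_mad32(a, b, c):
    """Ordinary 32-bit multiply-add (a * b + c) for comparison."""
    return (a * b + c) & MASK32


# Operands that fit in 24 bits: both variants agree.
assert emulated_mad24(0x123456, 0x789, 5) == plain_mad32(0x123456, 0x789, 5)

# An operand with bits set above bit 23: the masked variant drops them.
assert emulated_mad24(0x01000001, 3, 0) == 3
assert plain_mad32(0x01000001, 3, 0) == 0x03000003
```

The two `&` operations on the factors are the extra work a device without a native 24-bit multiplier would have to do, which is why timing both variants on a new platform is worthwhile.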
#35
"Mr. Meeseeks"
Jan 2012
California, USA
2183₁₀ Posts
Done. (Hmm..)
#36
Nov 2010
Germany
255₁₆ Posts
Interesting. 4 tests succeeded!
So now it loads and compiles fine; only the code itself is failing. To see whether it's the sieve or the TF, could you please run with
SieveOnGPU=0
Another test:
VectorSize=1
And I'd like to see the output of 2-3 minutes of mfakto -st (ideally with CPU sieving).
#37
"Mr. Meeseeks"
Jan 2012
California, USA
37·59 Posts
Quote:
#38
Nov 2010
Germany
3·199 Posts
Assuming it is the TF that actually fails, the next step is to edit mfakto_Kernels.cl, line 45:
#define TRACE_KERNEL 0
Change that to 2 at first; later 3 or 4 will be needed. Run that with
SieveOnGPU=1
VectorSize=1
Use the same settings for your AMD GPU and compare the output. On a correct run there should be no differences at TRACE level 2. Higher levels also trace the intermediate results, which may differ due to different rounding. I guess that at some point TRACE level 2 will show differences as well. These need to be examined in the higher-level traces ...
If you send me the Intel output of levels 2, 3 and 4 (just 1 minute each), I can do the comparing and searching myself. The fact that 4 test cases were successful makes me think that not all is lost.
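The comparison described above could be done with a small script along these lines. This is only a sketch: the function name `first_difference` is hypothetical, the trace format is not shown in this thread, and the sample lines below are invented placeholders rather than real mfakto output. It assumes both runs were captured to text and that matching runs produce the same lines in the same order.

```python
from itertools import zip_longest


def first_difference(intel_lines, amd_lines):
    """Return (line_number, intel_line, amd_line) for the first mismatch.

    Returns None when both traces are identical. A missing line on one
    side (one trace is shorter) is reported as None for that side.
    """
    pairs = zip_longest(intel_lines, amd_lines)
    for number, (intel, amd) in enumerate(pairs, start=1):
        if intel is None or amd is None or intel.rstrip() != amd.rstrip():
            return number, intel, amd
    return None


# Placeholder trace lines, not real mfakto output.
intel_trace = ["step 1: value=10", "step 2: value=20", "step 3: value=31"]
amd_trace = ["step 1: value=10", "step 2: value=20", "step 3: value=30"]

print(first_difference(intel_trace, amd_trace))
```

Stopping at the first mismatch is deliberate: once one intermediate value is wrong, everything after it tends to differ too, so the earliest differing line is the useful one to look up in the level 3 or 4 trace.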
#39
May 2013
East. Always East.
11×157 Posts
I don't suppose there would be an advantage to delegating a certain part of the whole process to the iGPU and the rest to a discrete GPU? Or to the actual cores of the processor?
#40
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts
#41
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts
Hmmmm.
With GPU sieving off and VectorSize=1:
Code:
number of tests 117
successful tests 53
no factor found 64
#42
"Mr. Meeseeks"
Jan 2012
California, USA
37·59 Posts
#43
May 2013
East. Always East.
11×157 Posts
What I meant was that, a while back, the CPU did part of the work (sieving, if I recall) while the GPU did the rest. Later, it was found that the GPU could do that as well. Is it possible that the iGPU does some part of the work better or worse than another part, which could then be delegated to a different piece of hardware?
I.e. sieve on the iGPU and TF on a proper GPU if one is available?
#44
Nov 2010
Germany
3·199 Posts
Quote:
Edit: sieving on one GPU and TF on the other is less likely to be efficient, as the two speeds would need to match each other, leading to the same issue that CPU sieving has: the slower part may not keep up, wasting the faster one's resources.
Last fiddled with by Bdot on 2013-10-04 at 18:59 Reason: forgot one part
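The rate-matching argument above can be put into numbers with a toy model. This is a sketch under simplifying assumptions, not a measurement: it treats sieving and TF as two pipeline stages with fixed rates in the same unit (for example candidates per second), ignores buffering and transfer costs, and uses made-up rates. The name `pipeline_stats` is hypothetical.

```python
def pipeline_stats(sieve_rate, tf_rate):
    """Return (throughput, wasted_fraction) for a two-stage pipeline.

    The pipeline can only run as fast as its slower stage; the faster
    stage idles for the remaining share of its capacity.
    """
    throughput = min(sieve_rate, tf_rate)
    fastest = max(sieve_rate, tf_rate)
    wasted_fraction = 1.0 - throughput / fastest
    return throughput, wasted_fraction


# Made-up rates: a slow sieving device feeding a fast TF device.
print(pipeline_stats(100, 400))  # (100, 0.75)

# Perfectly matched stages waste nothing.
print(pipeline_stats(200, 200))  # (200, 0.0)
```

In the first case three quarters of the faster device's capacity goes unused, which is the same effect seen when CPU sieving cannot keep up with the GPU.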