|
|
#56 |
|
Nov 2010
Germany
1001010101₂ Posts |
Hmm. Hard to believe. Normally, a signed/unsigned mismatch is just a warning. It's true that the (boolean) result of a comparison is signed in OpenCL. That is why I have the ternary there, which should provide an unsigned long ("<n>UL"). But here, it seems more like there is no way of promoting an unsigned long ("ulong") to a vector ("ulong2"). Normally, that promotion should be implicit - but not here. Even an explicit type cast did not work.
Maybe you or kracker could try a few more things, like
r2 += ((r1!=0)? (ulong_v)1UL : (ulong_v)0UL);
or have it select the size automatically?
r2 += ((r1!=0)? 1 : 0);
or integrate the addition into the selection:
r2 = ((r1!=0)? r2+1 : r2);
or ... almost worst case (because threads take a different code path):
if (r1!=0) ++r2;
or ... really worst case (because this is really a function call for conversion):
r2 += convert_##ulong_v((r1!=0)? 1UL : 0UL);
I did not try any of those - possibly some won't even work on AMD. If you find something that works, let me know.
|
|
|
|
|
|
#57 |
|
"Mr. Meeseeks"
Jan 2012
California, USA
37×59 Posts |
r2 += ((r1!=0)? (ulong_v)1UL : (ulong_v)0UL);
works. Do you want -st with that?
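For readers following along, here is a minimal sketch of what that statement is meant to compute in each vector lane. It is plain host C, not OpenCL C, so it can be compiled anywhere; `ulong2_model` and `add_if_nonzero` are hypothetical names made up for this illustration, and a two-lane vector is assumed.

```c
#include <stdint.h>

/* Plain-C stand-in for OpenCL's ulong2: two independent 64-bit lanes.
   (Hypothetical type for illustration, not part of the project source.) */
typedef struct { uint64_t s0, s1; } ulong2_model;

/* Lane-wise model of:
     r2 += ((r1 != 0) ? (ulong_v)1UL : (ulong_v)0UL);
   Each lane of r2 is incremented only where the same lane of r1 is nonzero. */
static ulong2_model add_if_nonzero(ulong2_model r2, ulong2_model r1)
{
    r2.s0 += (r1.s0 != 0) ? UINT64_C(1) : UINT64_C(0);
    r2.s1 += (r1.s1 != 0) ? UINT64_C(1) : UINT64_C(0);
    return r2;
}
```

With r2 = (5, 7) and r1 = (0, 9), only the second lane is incremented, giving (5, 8). All the alternatives listed in the earlier post aim at this same per-lane result; they differ only in how the compiler is asked to build the 0/1 vector.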
|
|
|
|
|
|
#58 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17×487 Posts |
Errors when building with TRACE_KERNEL=2
This compiler is very picky. |
|
|
|
|
|
#59 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts |
I've just started to look at the code. My OpenCL knowledge is near zero.
Anyway, this code scared me:
Code:
// AS_UINT is applied only to logical results. For vector operations, these are 0 (false) or -1 (true)
// For scalar operations, they result in 0 (false) or 1 (true) ==> to unify, negate here
#define AS_UINT_V -as_uint
Code:
// AS_UINT is applied only to logical results. For vector operations, these are 0 (false) or -1 (true)
// For scalar operations, they result in 0 (false) or 1 (true) ==> to unify, negate here
#define AS_UINT_V(x) as_uint(-(x))
Last fiddled with by Prime95 on 2013-10-09 at 22:07 |
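The comment in that macro relies on a documented OpenCL C rule: a relational operator yields -1 (all bits set) per lane for vector operands, but 1 for scalar operands. The sketch below models this in plain host C to show why a negation that unifies the vector case goes wrong if the same macro is ever applied to a scalar comparison. This is one possible reading of the concern, not a confirmed diagnosis; `as_uint_model` and the other function names are hypothetical stand-ins.

```c
#include <stdint.h>

/* Stand-in for OpenCL's as_uint(): reinterpret a 32-bit int as unsigned.
   On two's-complement targets a plain cast gives the same bit pattern. */
static uint32_t as_uint_model(int32_t x) { return (uint32_t)x; }

/* OpenCL rule: a vector comparison yields -1 per true lane ... */
static int32_t vector_lane_compare(uint32_t a, uint32_t b) { return (a > b) ? -1 : 0; }
/* ... while a scalar comparison yields 1 for true. */
static int32_t scalar_compare(uint32_t a, uint32_t b) { return (a > b) ? 1 : 0; }

/* Negating the vector result gives the desired 0/1 carry value. */
static uint32_t carry_from_vector(uint32_t a, uint32_t b)
{
    return as_uint_model(-vector_lane_compare(a, b));
}

/* Applying the same negation to a scalar result turns 1 into 0xFFFFFFFF. */
static uint32_t carry_from_scalar_negated(uint32_t a, uint32_t b)
{
    return as_uint_model(-scalar_compare(a, b));
}
```

For a true comparison, `carry_from_vector` returns 1, whereas `carry_from_scalar_negated` returns 0xFFFFFFFF, so a build with a vector size of 1 would need a variant of the macro without the negation.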
|
|
|
|
|
#60 | |
|
Nov 2010
Germany
597₁₀ Posts |
Quote:
You are correct with your observation. In the past, I did not care too much about VECTOR_SIZE=1 as it does not fit the AMD GPUs very well. On the other hand, again, signed/unsigned issues are only warnings - apart from a right shift and mul_hi, they don't differ in their operations. I think I use AS_UINT only for additions/subtractions, but I did not verify that.
Regarding the TRACE_KERNEL, the compiler is right. For VECTOR_SIZE=1, the variables do not have the vector components. To trace the kernels, set at least VECTOR_SIZE=2. Or modify the trace statements to remove the ".s0" everywhere.
I currently have too much other stuff to do to actually dig into that - I hope in two weeks the situation normalizes again. |
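One way to make trace statements compile for both cases would be to hide the component access behind a macro that depends on VECTOR_SIZE. The following is only a sketch in plain host C: `TRACE_LANE0`, `uint_v` as defined here, and `first_lane_for_trace` are hypothetical names for illustration, and the struct merely stands in for an OpenCL vector type such as uint2.

```c
#include <stdint.h>

#ifndef VECTOR_SIZE
#define VECTOR_SIZE 2   /* assumed default for this sketch only */
#endif

#if VECTOR_SIZE == 1
/* Scalar build: the value has no .s0 component, so use it directly. */
typedef uint32_t uint_v;
#define TRACE_LANE0(v) (v)
#else
/* Vector build: model a two-lane vector and pick its first component. */
typedef struct { uint32_t s0, s1; } uint_v;
#define TRACE_LANE0(v) ((v).s0)
#endif

/* A trace statement would print TRACE_LANE0(q) instead of q.s0. */
static uint32_t first_lane_for_trace(uint_v q)
{
    return TRACE_LANE0(q);
}
```

With this pattern the trace code itself never mentions ".s0", so the same statement would build whether the variables are scalars or vectors.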
|
|
|
|
|
|
#61 |
|
Nov 2010
Germany
3·199 Posts |
|
|
|
|
|
|
#62 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts |
If I didn't screw things up too badly, I have a trace for bdot.
|
|
|
|
|
|
#63 |
|
Nov 2010
Germany
3·199 Posts |
The calculation looks good - no exceptionally big values, all within bounds. What does look odd is the initial shifted value, bb:
Code:
cl_barrett15_82: bb=fffff968:ffffffff:1e43:dd5b57:dd5bff:0:0:dd5b57:ffffffff:0:ef:216c0e, bit_max65=5
This makes me think that passing this custom type (int180, a struct of 12 uints) does not work and leaves the initial shifted value uninitialized. I have seen this bug in the AMD drivers two years ago and made a workaround at that time. Changelog:
Code:
version 0.10 (2011-12-19) - added workaround for compatibility with Catalyst 11.10 and above ... |
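A quick plausibility check on the traced bb value could look like the sketch below. It assumes that each of the 12 uints holds a 15-bit limb (12 × 15 = 180), which the kernel name and type name suggest but which is not stated in this thread; `count_oversized_limbs`, `LIMB_MAX` and `TRACE_BB` are hypothetical names, and the array simply copies the values from the trace line in the order printed.

```c
#include <stddef.h>
#include <stdint.h>

#define LIMB_COUNT 12
#define LIMB_MAX UINT32_C(0x7fff)   /* assumption: 15 significant bits per limb */

/* The bb components exactly as printed in the trace above. */
static const uint32_t TRACE_BB[LIMB_COUNT] = {
    0xfffff968u, 0xffffffffu, 0x1e43u, 0xdd5b57u, 0xdd5bffu, 0x0u,
    0x0u, 0xdd5b57u, 0xffffffffu, 0x0u, 0xefu, 0x216c0eu
};

/* Count limbs that exceed the assumed 15-bit range. */
static size_t count_oversized_limbs(const uint32_t *limbs, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (limbs[i] > LIMB_MAX) {
            ++count;
        }
    }
    return count;
}
```

Under that assumption, 7 of the 12 traced components are out of range, which fits the suspicion that the struct arrives in the kernel uninitialized rather than merely miscalculated.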
|
|
|
|
|
#64 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
8279₁₀ Posts |
Sorry, I should have explained what all I did to get it to compile. I commented out the bb initializations and undefined the workaround #define. I think it was called something like WA_CATALYST_SOMETHING_OR_OTHER.
The compile problems when tracing are numerous. All the .s0, .s1, etc. references are no good. Maybe I'm not working from the latest source? |
|
|
|
|
|
#65 | |
|
Nov 2010
Germany
1001010101₂ Posts |
Quote:
Regarding .s0, .s1 etc.: I have never before traced with VectorSize=1, when all these values are scalar instead of vectors. Starting with VectorSize=2, the subcomponents need to be addressed by .s0, .s1 ... but I only traced .s0 usually. |
|
|
|
|
|
|
#66 |
|
P90 years forever!
Aug 2002
Yeehaw, FL
17·487 Posts |
|
|
|
|