mersenneforum.org  

2009-11-16, 09:57   #1
xilman
World's dumbest CUDA program?

Any CUDA people out there who may be able to explain why my GPU code doesn't seem to be able to write to global memory?

Code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>

#include <cutil_inline.h>

__global__ void
testKernel(unsigned char *output) 
{
  int k;
  const unsigned tid = blockIdx.x * blockDim.x + threadIdx.x;
  for (k=0; k < 8; k++) output[16*tid+k] = 42;
  for (k=0; k < 8; k++) output[16*tid+k+8] = 66;
}

#define BLOCK_SIZE 2
#define THREADS_PER_BLOCK 2
#define NUM_THREADS (BLOCK_SIZE * THREADS_PER_BLOCK)

#define MEM_SIZE (NUM_THREADS * 8)

void
run_test (int argc, char** argv) 
{
  unsigned char *h_output, *d_output;	/* Host and device memory for output */
  int iter;
  dim3  grid (BLOCK_SIZE, 1, 1);	/* setup execution parameters */
  dim3  threads (THREADS_PER_BLOCK, 1, 1);

  if (cutCheckCmdLineFlag(argc, (const char**)argv, "device"))
    cutilDeviceInit(argc, argv);
  else
    cudaSetDevice (cutGetMaxGflopsDeviceId());

  /* Allocate host memory for output */
  h_output = (unsigned char *) calloc (1, 2 * MEM_SIZE);
  /* Allocate device memory for output */
  cutilSafeCall (cudaMalloc ((void**) &d_output, 2 * MEM_SIZE));

  /* Execute the kernel */
  testKernel <<< grid, threads >>> (d_output);

  /* Check if kernel execution generated an error */
  cutilCheckMsg ("Kernel execution failed");

  cutilSafeCall (cudaThreadSynchronize());	/* Wait for threads to complete. */

  /* Copy results from device to host memory */
  cutilSafeCall (cudaMemcpy (h_output, d_output, 2 & MEM_SIZE,
                                cudaMemcpyDeviceToHost));
  cutilSafeCall (cudaThreadSynchronize());	/* Wait for threads to complete. */

  for (iter = 0; iter < 2*MEM_SIZE; iter++) {
    printf ("%d %02x\n", iter, h_output[iter]);
  }
  free (h_output);
  cutilSafeCall (cudaFree (d_output));
  cudaThreadExit ();
}

int main (int argc, char** argv) 
{
  run_test (argc, argv);
  cutilExit(argc, argv);
}
Environment: 64-bit RedHat EL5.2; Tesla C1060; fresh install of CUDA 2.3; all SDK projects built without error and a representative selection run correctly.

This test case was developed from the SDK template project then stripped down pretty much to bare-bones. It allocates global memory on host and device, calls a kernel to write constant non-zero bytes then prints what's happened, if anything. On my system it invariably prints zeros.

A kernel which does significant computation takes significant time to run, implying that the kernel is being called, but still doesn't write to global memory.

The fact that everything works except my code suggests that I have a conceptual error rather than a system bug.


Paul
2009-11-16, 10:26   #2
xilman

Quote:
Originally Posted by xilman
Code:
  /* Copy results from device to host memory */
  cutilSafeCall (cudaMemcpy (h_output, d_output, 2 & MEM_SIZE,
                                cudaMemcpyDeviceToHost));
The fact that everything works except my code suggests that I have a conceptual error rather than a system bug.
Found the problem --- a simple and silly typo. The '&' key is next to the '*' key on my keyboard...


Paul