3.3 First OpenCL Program

From this section onward, we will start learning the OpenCL programming basics by building and running actual code. Since we have not yet gone over the OpenCL grammar, you should concentrate on the general flow of OpenCL programming.

Hello World

Lists 3.3 and 3.4 show the familiar "Hello, World!" program, written in OpenCL. Since standard input/output cannot be used within the kernel, the kernel is used only to fill a char array. The string set by the kernel is then copied over to the host side, which prints it. (The code can be downloaded from http://www.fixstars.com/books/opencl)

List 3.3: Hello World - kernel (hello.cl)

__kernel void hello(__global char* string)
{
    string[0] = 'H';
    string[1] = 'e';
    string[2] = 'l';
    string[3] = 'l';
    string[4] = 'o';
    string[5] = ',';
    string[6] = ' ';
    string[7] = 'W';
    string[8] = 'o';
    string[9] = 'r';
    string[10] = 'l';
    string[11] = 'd';
    string[12] = '!';
    string[13] = '\0';
}
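For comparison, the same fourteen stores can be written as ordinary host C, which is handy for checking what the buffer should contain without an OpenCL device. This is only a sketch: hello_reference and its capacity parameter are hypothetical and are not part of the sample code, and the function runs on the host rather than as a kernel.

```c
#include <string.h>

/* Host-side reference (hypothetical helper, not part of the sample code).
 * Writes the same 14 bytes as the hello kernel in List 3.3, including the
 * terminating '\0', into ordinary host memory.
 * Returns the number of bytes written, or 0 if the buffer is unusable. */
size_t hello_reference(char *string, size_t capacity)
{
    static const char text[] = "Hello, World!";

    if (string == NULL || capacity < sizeof(text)) {
        return 0;
    }
    memcpy(string, text, sizeof(text));   /* sizeof counts the '\0' too */
    return sizeof(text);
}
```

The return value also shows why a MEM_SIZE of 128 is more than enough for the buffer used in List 3.4.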

List 3.4: Hello World - host (hello.c)

#include <stdio.h>
#include <stdlib.h>

#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

#define MEM_SIZE (128)
#define MAX_SOURCE_SIZE (0x100000)

int main()
{
    cl_device_id device_id = NULL;
    cl_context context = NULL;
    cl_command_queue command_queue = NULL;
    cl_mem memobj = NULL;
    cl_program program = NULL;
    cl_kernel kernel = NULL;
    cl_platform_id platform_id = NULL;
    cl_uint ret_num_devices;
    cl_uint ret_num_platforms;
    cl_int ret;

    char string[MEM_SIZE];

    FILE *fp;
    char fileName[] = "./hello.cl";
    char *source_str;
    size_t source_size;

    /* Load the source code containing the kernel */
    fp = fopen(fileName, "r");
    if (!fp) {
        fprintf(stderr, "Failed to load kernel.\n");
        exit(1);
    }
    source_str = (char*)malloc(MAX_SOURCE_SIZE);
    source_size = fread(source_str, 1, MAX_SOURCE_SIZE, fp);
    fclose(fp);

    /* Get Platform and Device Info */
    ret = clGetPlatformIDs(1, &platform_id, &ret_num_platforms);
    ret = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_DEFAULT, 1, &device_id, &ret_num_devices);

    /* Create OpenCL context */
    context = clCreateContext(NULL, 1, &device_id, NULL, NULL, &ret);

    /* Create Command Queue */
    command_queue = clCreateCommandQueue(context, device_id, 0, &ret);

    /* Create Memory Buffer */
    memobj = clCreateBuffer(context, CL_MEM_READ_WRITE, MEM_SIZE * sizeof(char), NULL, &ret);

    /* Create Kernel Program from the source */
    program = clCreateProgramWithSource(context, 1, (const char **)&source_str,
                                        (const size_t *)&source_size, &ret);

    /* Build Kernel Program */
    ret = clBuildProgram(program, 1, &device_id, NULL, NULL, NULL);

    /* Create OpenCL Kernel */
    kernel = clCreateKernel(program, "hello", &ret);

    /* Set OpenCL Kernel Parameters */
    ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memobj);

    /* Execute OpenCL Kernel */
    ret = clEnqueueTask(command_queue, kernel, 0, NULL, NULL);

    /* Copy results from the memory buffer */
    ret = clEnqueueReadBuffer(command_queue, memobj, CL_TRUE, 0,
                              MEM_SIZE * sizeof(char), string, 0, NULL, NULL);

    /* Display Result */
    puts(string);

    /* Finalization */
    ret = clFlush(command_queue);
    ret = clFinish(command_queue);
    ret = clReleaseKernel(kernel);
    ret = clReleaseProgram(program);
    ret = clReleaseMemObject(memobj);
    ret = clReleaseCommandQueue(command_queue);
    ret = clReleaseContext(context);

    free(source_str);

    return 0;
}
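List 3.4 stores every return code in ret but never inspects it, so a failed call (for example, no device found or a kernel that does not compile) goes unnoticed until the output is wrong. The following is a minimal sketch of how the codes could be checked. The helper names hello_error_name and hello_check are hypothetical, and the numeric fallback values are assumptions copied from the OpenCL 1.0 cl.h so that the sketch compiles on its own; when cl.h is included first, its definitions take precedence.

```c
#include <stdio.h>

/* Fallback values so that this sketch compiles without the OpenCL header.
 * They are assumptions taken from the OpenCL 1.0 cl.h; if cl.h has already
 * been included, its own definitions are used and these are skipped. */
#ifndef CL_SUCCESS
#define CL_SUCCESS (0)
#endif
#ifndef CL_DEVICE_NOT_FOUND
#define CL_DEVICE_NOT_FOUND (-1)
#endif
#ifndef CL_OUT_OF_HOST_MEMORY
#define CL_OUT_OF_HOST_MEMORY (-6)
#endif
#ifndef CL_BUILD_PROGRAM_FAILURE
#define CL_BUILD_PROGRAM_FAILURE (-11)
#endif
#ifndef CL_INVALID_VALUE
#define CL_INVALID_VALUE (-30)
#endif
#ifndef CL_INVALID_KERNEL_NAME
#define CL_INVALID_KERNEL_NAME (-46)
#endif

/* Hypothetical helper: maps a few common return codes to their names. */
const char *hello_error_name(int code)
{
    switch (code) {
    case CL_SUCCESS:               return "CL_SUCCESS";
    case CL_DEVICE_NOT_FOUND:      return "CL_DEVICE_NOT_FOUND";
    case CL_OUT_OF_HOST_MEMORY:    return "CL_OUT_OF_HOST_MEMORY";
    case CL_BUILD_PROGRAM_FAILURE: return "CL_BUILD_PROGRAM_FAILURE";
    case CL_INVALID_VALUE:         return "CL_INVALID_VALUE";
    case CL_INVALID_KERNEL_NAME:   return "CL_INVALID_KERNEL_NAME";
    default:                       return "unrecognized OpenCL error code";
    }
}

/* Hypothetical helper: returns 1 when the call succeeded; otherwise
 * reports the failed step on stderr and returns 0. */
int hello_check(int code, const char *step)
{
    if (code == CL_SUCCESS) {
        return 1;
    }
    fprintf(stderr, "%s failed: %s (%d)\n", step, hello_error_name(code), code);
    return 0;
}
```

In the host code, each call could then be followed by a check such as `if (!hello_check(ret, "clBuildProgram")) exit(1);`, so that the first failing step is named instead of being silently ignored.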

The include header is located in a different directory depending on the environment (Table 3.1). Make sure to specify the correct location.

Table 3.1: Include header location (as of March 2010)

OpenCL implementation    Include header
AMD                      CL/cl.h
Apple                    OpenCL/opencl.h
FOXC                     CL/cl.h
NVIDIA                   CL/cl.h

The sample code uses the following preprocessor conditional so that the header is correctly included in any environment.

List 3.1: Include Header Location (As of March, 2010)

#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

Building in Linux/Mac OS X

Once the program is written, we are ready to build and run it. This section describes the procedure under Linux/Mac OS X. The kernel and host code are assumed to exist within the same directory.

The build procedure varies depending on the OpenCL implementation. In the commands that follow, "path-to-..." should be replaced with the corresponding OpenCL SDK path. The default SDK paths are shown in Table 3.2.

Table 3.2: Path to SDK

SDK                         Path
AMD Stream SDK 2.0 beta4    Path-to-AMD/ati-stream-sdk-v2.0-beta4-lnx32 (32-bit Linux)
                            Path-to-AMD/ati-stream-sdk-v2.0-beta4-lnx64 (64-bit Linux)
FOXC                        Path-to-foxc/foxc-install
NVIDIA GPU Computing SDK    $(HOME)/NVIDIA_GPU_Computing_SDK (Linux)

The build commands on Linux/Mac OS X are as follows:

AMD OpenCL

> gcc -I /path-to-AMD/include -L/path-to-AMD/lib/x86 -o hello hello.c -Wl,-rpath,/path-to-AMD/lib/x86 -lOpenCL (32-bit Linux)
> gcc -I /path-to-AMD/include -L/path-to-AMD/lib/x86_64 -o hello hello.c -Wl,-rpath,/path-to-AMD/lib/x86_64 -lOpenCL (64-bit Linux)

FOXC

> gcc -I /path-to-foxc/include -L /path-to-foxc/lib -o hello hello.c -Wl,-rpath,/path-to-foxc/lib -lOpenCL

Apple

> gcc -o hello hello.c -framework OpenCL

NVIDIA

> gcc -I /path-to-NVIDIA/OpenCL/common/inc -L /path-to-NVIDIA/OpenCL/common/lib/Linux32 -o hello hello.c -lOpenCL (32-bit Linux)
> gcc -I /path-to-NVIDIA/OpenCL/common/inc -L /path-to-NVIDIA/OpenCL/common/lib/Linux64 -o hello hello.c -lOpenCL (64-bit Linux)

Alternatively, you can use the Makefile included with the sample code to build the program on each platform, as shown below.

> make amd (Linux)
> make apple (Mac OS X)
> make foxc (Linux)
> make nvidia (Linux)

This should create an executable named "hello" in the working directory. Run the executable as follows. If successful, you should see "Hello, World!" on the screen.

> ./hello
Hello, World!
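If the program prints "Failed to load kernel." instead, the most likely cause is that it was started from a directory other than the one containing hello.cl, since List 3.4 opens "./hello.cl" relative to the working directory. The following is a minimal sketch of a loader that reports the path and the reason for the failure, and that sizes the buffer from the file instead of using a fixed MAX_SOURCE_SIZE. The function name load_kernel_source is hypothetical and is not part of the sample code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for the fopen/malloc/fread sequence in List 3.4.
 * Reads the whole file at `path` into a newly allocated, NUL-terminated
 * buffer and stores the number of bytes read in *size_out.
 * Returns NULL on failure after printing the reason to stderr.
 * The caller releases the buffer with free(). */
char *load_kernel_source(const char *path, size_t *size_out)
{
    FILE *fp;
    long length;
    char *source;
    size_t got;

    /* Binary mode keeps the byte count consistent with ftell(). */
    fp = fopen(path, "rb");
    if (fp == NULL) {
        fprintf(stderr, "Cannot open %s: %s\n", path, strerror(errno));
        return NULL;
    }

    /* Measure the file so the buffer is exactly large enough. */
    if (fseek(fp, 0, SEEK_END) != 0 ||
        (length = ftell(fp)) < 0 ||
        fseek(fp, 0, SEEK_SET) != 0) {
        fprintf(stderr, "Cannot determine the size of %s\n", path);
        fclose(fp);
        return NULL;
    }

    source = (char *)malloc((size_t)length + 1);
    if (source == NULL) {
        fprintf(stderr, "Out of memory while loading %s\n", path);
        fclose(fp);
        return NULL;
    }

    got = fread(source, 1, (size_t)length, fp);
    fclose(fp);

    source[got] = '\0';   /* terminate so the text can also be printed */
    *size_out = got;
    return source;
}
```

The returned pointer and size can be passed to clCreateProgramWithSource in the same way as source_str and source_size in List 3.4.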

Building on Visual Studio

This section walks through the build and execution process using Visual C++ 2008 Express on 32-bit Windows Vista. The OpenCL header file and library can be added to a project using the following steps.

1. From the project page, go to "C/C++" -> "General", then add the following in the box for "Additional Include Directories":

NVIDIA

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\common\inc

AMD

C:\Program Files\ATI Stream\include

2. From the project page, go to "Linker" -> "General", and in the box for "Additional Library Directories", type the following.

NVIDIA

C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\OpenCL\common\lib\Win32

AMD

C:\Program Files\ATI Stream\lib\x86

3. From the project page, go to "Linker" -> "Input", and in the box for "Additional Dependencies", type the following.

Both NVIDIA and AMD

OpenCL.lib

These settings should be applied to All Configurations, which can be selected from the pull-down menu in the top-left corner.

The environment should now be set up to build OpenCL code. Build and run the sample code, and make sure you get the correct output.
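As an alternative to entering OpenCL.lib under "Additional Dependencies", Visual C++ can be asked to link the library from the source file itself. The fragment below is a sketch of that approach, intended to be placed near the top of hello.c. It assumes the library directory has already been added to the project as described above, and the _MSC_VER guard is there so that other compilers skip the pragma.

```c
/* Visual C++ only: ask the linker to pull in OpenCL.lib.
 * Other compilers skip this block because _MSC_VER is not defined. */
#ifdef _MSC_VER
#pragma comment(lib, "OpenCL.lib")
#endif
```

This keeps the dependency visible in the source, at the cost of tying the file to the library name used by the SDK.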