The program that calls OpenCL functions to set up the context in which kernels run and enqueue the kernels for execution is known as the host application. The host application is run by OS X on the CPU. The device on which the host application executes is known as the host device. Before it runs the kernels, the host application typically:
Determines what compute devices are available, if necessary.
Selects compute devices appropriate for the application.
Creates dispatch queues for selected compute devices.
Allocates the memory objects needed by the kernels for execution. (This step may occur earlier in the process, as convenient.)
Note: The host device (the CPU) can itself be an OpenCL device. Both the host application and kernels may run on the same CPU.
To OpenCL the number of compute units is irrelevant. OpenCL considers a CPU with eight compute units and a GPU with 100 compute units each to be a single device. The OS X v10.7 implementation of the OpenCL API facilitates designing and coding data parallel programs to run on both CPU and GPU devices. In a data parallel program, the same program (or kernel) runs concurrently on different pieces of data; each invocation is called a work item and is given a work item ID. The work item IDs are organized in up to three dimensions (called an N-D range).
A kernel is essentially a function written in the OpenCL language, which enables it to be compiled for execution on any device that supports OpenCL. However, a kernel differs from a function in other programming languages because when you invoke “a” kernel, what actually happens is that many instances of the kernel execute, each of which processes a different chunk of data.
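The idea that invoking one kernel really runs many instances, each identified by its work item ID, can be illustrated with a small emulation in plain C. This is only a sketch, not OpenCL code: `work_item_id`, `scale_kernel`, and `run_over_2d_range` are hypothetical names, and the nested loop runs the instances one after another, whereas an OpenCL device would be free to run them concurrently. In a real kernel, the IDs would come from the built-in `get_global_id` function rather than from a struct passed in by the caller.

```c
#include <stddef.h>

/* Hypothetical stand-in for the work item ID that a real OpenCL kernel
   would read with get_global_id(0) and get_global_id(1). */
typedef struct {
    size_t x;
    size_t y;
} work_item_id;

/* The "kernel": each instance processes exactly one element,
   selected by its own work item ID. */
void scale_kernel(work_item_id id, size_t width,
                  const float *in, float *out, float factor)
{
    size_t i = id.y * width + id.x;   /* flatten the 2-D ID to an index */
    out[i] = in[i] * factor;
}

/* Emulates invoking the kernel over a width x height 2-D range.
   One kernel instance is run per work item; here they run sequentially. */
void run_over_2d_range(size_t width, size_t height,
                       const float *in, float *out, float factor)
{
    for (size_t y = 0; y < height; ++y) {
        for (size_t x = 0; x < width; ++x) {
            work_item_id id = { x, y };
            scale_kernel(id, width, in, out, factor);
        }
    }
}
```

Calling `run_over_2d_range(3, 2, in, out, 2.0f)` on a six-element array runs six instances of `scale_kernel`, one per work item, and each instance writes a single element of `out`.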
A workgroup executes on a single compute unit. A compute unit is composed of one or more processing elements and local memory.
A Mac computer always has a single CPU. It may not have any GPUs or it may have several. The CPU on a Mac has multiple compute units, which is why it is called a multicore CPU. The number of compute units in a CPU limits the number of workgroups that can execute concurrently.
CPUs usually contain between two and eight compute units, sometimes more. A graphics processing unit (GPU) typically contains many compute units. GPUs in current Mac systems feature tens of compute units, and future GPUs may contain hundreds.
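The relationship between workgroups and compute units can be made concrete with some simple arithmetic. The sketch below assumes a deliberately simplified model in which each compute unit runs one workgroup at a time; real devices may schedule work differently. The function names `workgroup_count` and `scheduling_passes` are hypothetical and are not part of the OpenCL API.

```c
#include <stddef.h>

/* Number of workgroups needed to cover global_size work items when each
   workgroup holds local_size work items (rounded up).
   Assumes local_size is nonzero. */
size_t workgroup_count(size_t global_size, size_t local_size)
{
    return (global_size + local_size - 1) / local_size;
}

/* Minimum number of passes needed to run all workgroups, under the
   simplifying assumption that each compute unit executes one workgroup
   at a time. Assumes compute_units is nonzero. */
size_t scheduling_passes(size_t workgroups, size_t compute_units)
{
    return (workgroups + compute_units - 1) / compute_units;
}
```

Under this model, 1024 work items in workgroups of 64 form 16 workgroups. A CPU with eight compute units would need two passes to run them, while a GPU with 100 compute units could run them all in one.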
Tools provided on OS X let you include OpenCL kernels as resources in Xcode projects, compile them along with the rest of your application, invoke kernels by passing them parameters just as if they were typical functions, and use Grand Central Dispatch (GCD) as the queuing API for executing OpenCL commands and kernels on the CPU and GPU. If you need to create OpenCL programs at runtime, with source loaded as a string or from a file, or if you want API-level control over queueing, see The OpenCL Specification, available from the Khronos Group. In the OpenCL specification, computational processors are called devices. An OpenCL device has one or more compute units.
To create high-performance code on GPUs, use the Metal framework instead.