Interface class for the reduce kernels.
#include <algorithms.hpp>
Public Types

Type | Member | Description |
---|---|---|
enum | Memory : uint8_t { Memory::H_IN, Memory::H_OUT, Memory::D_IN, Memory::D_RED, Memory::D_OUT } | Enumerates the memory objects handled by the class. |

Public Member Functions

Return type | Member | Description |
---|---|---|
 | Reduce (clutils::CLEnv &_env, clutils::CLEnvInfo< 1 > _info) | Configures an OpenCL environment as specified by _info. |
cl::Memory & | get (Reduce::Memory mem) | Returns a reference to an internal memory object. |
void | init (unsigned int _cols, unsigned int _rows, Staging _staging=Staging::IO) | Configures kernel execution parameters. |
void | write (Reduce::Memory mem=Reduce::Memory::D_IN, void *ptr=nullptr, bool block=CL_FALSE, const std::vector< cl::Event > *events=nullptr, cl::Event *event=nullptr) | Performs a data transfer to a device buffer. |
void * | read (Reduce::Memory mem=Reduce::Memory::H_OUT, bool block=CL_TRUE, const std::vector< cl::Event > *events=nullptr, cl::Event *event=nullptr) | Performs a data transfer to a staging buffer. |
void | run (const std::vector< cl::Event > *events=nullptr, cl::Event *event=nullptr) | Executes the necessary kernels. |
template<typename period > double | run (clutils::GPUTimer< period > &timer, const std::vector< cl::Event > *events=nullptr) | Executes the necessary kernels. |
template<> | Reduce (clutils::CLEnv &_env, clutils::CLEnvInfo< 1 > _info) | Explicit specialization. |
template<> | Reduce (clutils::CLEnv &_env, clutils::CLEnvInfo< 1 > _info) | Explicit specialization. |
template<> | Reduce (clutils::CLEnv &_env, clutils::CLEnvInfo< 1 > _info) | Explicit specialization. |

Public Attributes

Type | Member |
---|---|
T * | hPtrIn |
T * | hPtrOut |
The reduce kernels reduce each row of an array to a single element. For more details, look at the kernels' documentation.
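To make the row-wise semantics concrete, here is a minimal host-side reference in plain C++. It is only a sketch: `ReduceOp` and `reduceRows` are hypothetical names that are not part of the library, the array is assumed to be stored row-major, and only the three reduction types named in this documentation (MIN, MAX, SUM) are modelled. The actual kernels run on the device and may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the library's ReduceConfig (assumption).
enum class ReduceOp { MIN, MAX, SUM };

// Reduces each row of a row-major (rows x cols) array to a single element.
template <typename T>
std::vector<T> reduceRows(const std::vector<T>& in, std::size_t cols,
                          std::size_t rows, ReduceOp op)
{
    assert(cols > 0 && in.size() == cols * rows);
    std::vector<T> out(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        auto first = in.begin() + static_cast<std::ptrdiff_t>(r * cols);
        auto last = first + static_cast<std::ptrdiff_t>(cols);
        switch (op) {
        case ReduceOp::MIN: out[r] = *std::min_element(first, last); break;
        case ReduceOp::MAX: out[r] = *std::max_element(first, last); break;
        case ReduceOp::SUM: out[r] = std::accumulate(first, last, T{}); break;
        }
    }
    return out;
}
```

For a 2x3 input with rows {1, 5, 3} and {4, 2, 6}, this reference yields {1, 2} for MIN, {5, 6} for MAX and {9, 12} for SUM.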
The reduce kernels are available in kernels/reduce_kernels.cl. To use your own buffers, call get to obtain references to the placeholders within the class and assign your buffers to them. You must do this strictly before the call to init. You can also call get (after the call to init) to get a reference to a buffer within the class and assign it to another kernel class instance further down in your task pipeline.

The following input/output OpenCL memory objects are created by a Reduce instance:
Name | Type | Placement | I/O | Use | Properties | Size |
---|---|---|---|---|---|---|
H_IN | Buffer | Host | I | Staging | CL_MEM_READ_WRITE | \(columns*rows*sizeof\ (T)\) |
H_OUT | Buffer | Host | O | Staging | CL_MEM_READ_WRITE | \( rows*sizeof\ (T)\) |
D_IN | Buffer | Device | I | Processing | CL_MEM_READ_ONLY | \(columns*rows*sizeof\ (T)\) |
D_OUT | Buffer | Device | O | Processing | CL_MEM_WRITE_ONLY | \( rows*sizeof\ (T)\) |
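The size column can be checked with a few lines of host code. The helper below is a sketch with hypothetical names (`ReduceBufferSizes`, `reduceBufferSizes`); it only restates the two formulas from the table and is not part of the library.

```cpp
#include <cassert>
#include <cstddef>

// Byte sizes of the buffers listed in the table, for element type T.
struct ReduceBufferSizes {
    std::size_t in;   // H_IN and D_IN: columns * rows * sizeof(T)
    std::size_t out;  // H_OUT and D_OUT: rows * sizeof(T)
};

// Hypothetical helper (assumption): evaluates the table's size formulas.
template <typename T>
constexpr ReduceBufferSizes reduceBufferSizes(std::size_t cols, std::size_t rows)
{
    return ReduceBufferSizes{ cols * rows * sizeof(T), rows * sizeof(T) };
}
```

For example, `reduceBufferSizes<float>(640, 480)` gives `640 * 480 * sizeof(float)` bytes for each input buffer and `480 * sizeof(float)` bytes for each output buffer.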
Template Parameters

Parameter | Description |
---|---|
C | configures the class for different types of reduction. |
T | configures the class to work with different types of data. |
enum cl_algo::ICP::Reduce< C, T >::Memory : uint8_t [strong]

Enumerates the memory objects handled by the class. H_* names refer to staging buffers on the host. D_* names refer to buffers on the device.

Enumerator | Description |
---|---|
H_IN | Input staging buffer. |
H_OUT | Output staging buffer. |
D_IN | Input buffer. |
D_RED | Buffer of reduced elements per work-group. |
D_OUT | Output buffer. |
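As a small illustration of the naming convention, the sketch below mirrors the enumerator list as a scoped enum with a `uint8_t` underlying type, as the synopsis indicates. The `isStaging` helper is hypothetical and not part of the library; it only encodes the rule that H_* names are host staging buffers.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the documented enumerators; a "strong" enum is a scoped enum.
enum class Memory : std::uint8_t { H_IN, H_OUT, D_IN, D_RED, D_OUT };

// Hypothetical helper (assumption): true for the host staging buffers (H_*),
// false for the device buffers (D_*).
constexpr bool isStaging(Memory m)
{
    return m == Memory::H_IN || m == Memory::H_OUT;
}
```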
cl_algo::ICP::Reduce< C, T >::Reduce (clutils::CLEnv & _env, clutils::CLEnvInfo< 1 > _info)

Configures an OpenCL environment as specified by _info.

cl_algo::ICP::Reduce< ReduceConfig::MIN, cl_float >::Reduce (clutils::CLEnv & _env, clutils::CLEnvInfo< 1 > _info)

Direction | Parameter | Description |
---|---|---|
[in] | _env | OpenCL environment. |
[in] | _info | OpenCL configuration. It specifies the context, queue, etc., to be used. |

cl_algo::ICP::Reduce< ReduceConfig::MAX, cl_uint >::Reduce (clutils::CLEnv & _env, clutils::CLEnvInfo< 1 > _info)

Direction | Parameter | Description |
---|---|---|
[in] | _env | OpenCL environment. |
[in] | _info | OpenCL configuration. It specifies the context, queue, etc., to be used. |

cl_algo::ICP::Reduce< ReduceConfig::SUM, cl_float >::Reduce (clutils::CLEnv & _env, clutils::CLEnvInfo< 1 > _info)

Direction | Parameter | Description |
---|---|---|
[in] | _env | OpenCL environment. |
[in] | _info | OpenCL configuration. It specifies the context, queue, etc., to be used. |
cl::Memory & cl_algo::ICP::Reduce< C, T >::get (Reduce< C, T >::Memory mem)

Returns a reference to an internal memory object.
This interface exists to allow CL memory sharing between different kernels.

Direction | Parameter | Description |
---|---|---|
[in] | mem | enumeration value specifying the requested memory object. |
void cl_algo::ICP::Reduce< C, T >::init (unsigned int _cols, unsigned int _rows, Staging _staging = Staging::IO)

Configures kernel execution parameters.
Sets up memory objects as necessary, and defines the kernel workspaces.
If a memory object has been assigned to a placeholder within the class before the call to init, then that memory will be maintained. Otherwise, a new memory object will be created.

Direction | Parameter | Description |
---|---|---|
[in] | _cols | number of columns in the input array. |
[in] | _rows | number of rows in the input array. |
[in] | _staging | flag to indicate whether or not to instantiate the staging buffers. |
void * cl_algo::ICP::Reduce< C, T >::read (Reduce< C, T >::Memory mem = Reduce< C, T >::Memory::H_OUT, bool block = CL_TRUE, const std::vector< cl::Event > * events = nullptr, cl::Event * event = nullptr)

Performs a data transfer to a staging buffer.
The transfer happens from a device buffer to the associated (specified) staging buffer on the host.

Direction | Parameter | Description |
---|---|---|
[in] | mem | enumeration value specifying an output staging buffer. |
[in] | block | a flag to indicate whether to perform a blocking or a non-blocking operation. |
[in] | events | a wait-list of events. |
[out] | event | event associated with the read operation to the staging buffer. |
void cl_algo::ICP::Reduce< C, T >::run (const std::vector< cl::Event > * events = nullptr, cl::Event * event = nullptr)

Executes the necessary kernels.
The function call is non-blocking.

Direction | Parameter | Description |
---|---|---|
[in] | events | a wait-list of events. |
[out] | event | event associated with the kernel execution. |
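The documentation speaks of "kernels" in the plural, and the Memory enumeration lists D_RED as a buffer of reduced elements per work-group. One arrangement consistent with that is a two-stage reduction: each work-group first reduces its part of a row into D_RED, and a second pass reduces those partial results into D_OUT. The host-side sketch below assumes that scheme for a single row and a sum reduction; the function names are hypothetical and the actual kernels may be organised differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Stage 1 (assumed): each "work-group", modelled as a chunk of groupSize
// elements, reduces its part of the row to one partial result (role of D_RED).
template <typename T>
std::vector<T> reducePerGroup(const std::vector<T>& row, std::size_t groupSize)
{
    assert(groupSize > 0);
    std::vector<T> partials;
    for (std::size_t i = 0; i < row.size(); i += groupSize) {
        const std::size_t end = std::min(i + groupSize, row.size());
        partials.push_back(std::accumulate(
            row.begin() + static_cast<std::ptrdiff_t>(i),
            row.begin() + static_cast<std::ptrdiff_t>(end), T{}));
    }
    return partials;
}

// Stage 2 (assumed): the partial results are reduced to the row's single
// output element (role of D_OUT).
template <typename T>
T reducePartials(const std::vector<T>& partials)
{
    return std::accumulate(partials.begin(), partials.end(), T{});
}
```

With the row {1, 2, ..., 10} and a group size of 4, stage 1 produces the partial sums {10, 26, 19} and stage 2 combines them into 55.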
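template<typename period > double cl_algo::ICP::Reduce< C, T >::run (clutils::GPUTimer< period > & timer, const std::vector< cl::Event > * events = nullptr) [inline]

Executes the necessary kernels.
This run instance is used for profiling.

Direction | Parameter | Description |
---|---|---|
[in] | timer | GPUTimer that does the profiling of the kernel executions. |
[in] | events | a wait-list of events. |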
void cl_algo::ICP::Reduce< C, T >::write (Reduce< C, T >::Memory mem = Reduce< C, T >::Memory::D_IN, void * ptr = nullptr, bool block = CL_FALSE, const std::vector< cl::Event > * events = nullptr, cl::Event * event = nullptr)

Performs a data transfer to a device buffer.
The transfer happens from a staging buffer on the host to the associated (specified) device buffer.

Direction | Parameter | Description |
---|---|---|
[in] | mem | enumeration value specifying an input device buffer. |
[in] | ptr | a pointer to an array holding input data. If not NULL, the data from ptr will be copied to the associated staging buffer. |
[in] | block | a flag to indicate whether to perform a blocking or a non-blocking operation. |
[in] | events | a wait-list of events. |
[out] | event | event associated with the write operation to the device buffer. |
T* cl_algo::ICP::Reduce< C, T >::hPtrIn

Mapping of the input staging buffer.

T* cl_algo::ICP::Reduce< C, T >::hPtrOut

Mapping of the output staging buffer.
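Taken together, the member documentation suggests the call order init, write, run, read. The class below is a host-only stand-in that imitates that order with plain vectors instead of OpenCL buffers. It is not the library class: `HostReduceSum` is a hypothetical name, it is fixed to a float sum reduction, it assumes a row-major input, and it omits the environment, event and blocking parameters of the real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-only stand-in (assumption), NOT cl_algo::ICP::Reduce.
class HostReduceSum {
public:
    // Sizes the buffers: cols*rows input elements, rows output elements.
    void init(unsigned int cols, unsigned int rows)
    {
        cols_ = cols;
        rows_ = rows;
        hIn_.assign(static_cast<std::size_t>(cols) * rows, 0.f);
        dIn_.assign(hIn_.size(), 0.f);
        dOut_.assign(rows, 0.f);
        hOut_.assign(rows, 0.f);
    }

    // Copies ptr (if non-null) into the input staging buffer, then transfers
    // the staging buffer to the "device" input buffer.
    void write(const float* ptr = nullptr)
    {
        if (ptr) hIn_.assign(ptr, ptr + hIn_.size());
        dIn_ = hIn_;
    }

    // Reduces each row of the "device" input buffer to its sum.
    void run()
    {
        for (unsigned int r = 0; r < rows_; ++r) {
            auto first = dIn_.begin() + static_cast<std::ptrdiff_t>(r) * cols_;
            dOut_[r] = std::accumulate(first, first + cols_, 0.f);
        }
    }

    // Transfers the "device" output to the output staging buffer and
    // returns a pointer to it.
    float* read()
    {
        hOut_ = dOut_;
        return hOut_.data();
    }

private:
    unsigned int cols_ = 0, rows_ = 0;
    std::vector<float> hIn_, hOut_, dIn_, dOut_;
};
```

Calling `init(3, 2)`, then `write` with the six values 1 to 6, then `run` and `read` returns a pointer to the two row sums 6 and 15. With the real class, the same sequence would additionally involve the OpenCL environment passed to the constructor and, for non-blocking calls, the event arguments.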