Interface IOpenCLVendorAdapter

Namespace: DotCompute.Backends.OpenCL.Vendor

Assembly: DotCompute.Backends.OpenCL.dll

Defines vendor-specific adaptations for OpenCL implementations. Enables optimizations and workarounds for NVIDIA, AMD, Intel, and other vendors.

public interface IOpenCLVendorAdapter

Remarks

Different OpenCL vendors have varying characteristics:

NVIDIA: Warp-based execution (32 threads), CUDA heritage
AMD: Wavefront-based (32 or 64 threads depending on architecture)
Intel: SIMD-16/32 based execution, varying by generation

This interface allows DotCompute to optimize for each vendor while maintaining a unified programming model.

Properties

Vendor

Gets the vendor type this adapter handles.

OpenCLVendor Vendor { get; }

Property Value

OpenCLVendor

VendorName

Gets the vendor's display name.

string VendorName { get; }

Property Value

string

Methods

ApplyVendorOptimizations(QueueProperties, OpenCLDeviceInfo)

Applies vendor-specific queue properties.

QueueProperties ApplyVendorOptimizations(QueueProperties properties, OpenCLDeviceInfo device)

Parameters

properties QueueProperties: The base queue properties.
device OpenCLDeviceInfo: The device the queue will operate on.

Returns

QueueProperties: Modified queue properties optimized for this vendor.

Remarks

Some vendors benefit from out-of-order execution, while others work better with in-order queues depending on the workload characteristics.
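As a rough illustration of the decision described above, the sketch below toggles an out-of-order flag based on a vendor preference and on what the device supports. `QueueFlagsSketch` and `QueueTuning` are hypothetical stand-ins, not the real `QueueProperties` type, whose members are not shown on this page. Only the bit values mirror the standard OpenCL queue property constants.

```csharp
using System;

// Hypothetical stand-in for QueueProperties; the real type may look different.
[Flags]
public enum QueueFlagsSketch
{
    None = 0,
    OutOfOrderExecution = 1 << 0, // same bit as CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
    Profiling = 1 << 1,           // same bit as CL_QUEUE_PROFILING_ENABLE
}

public static class QueueTuning
{
    // Enables out-of-order execution only when the vendor adapter prefers it
    // AND the device reports support for it; otherwise forces an in-order queue.
    // All other requested flags are passed through unchanged.
    public static QueueFlagsSketch Apply(
        QueueFlagsSketch requested,
        bool deviceSupportsOutOfOrder,
        bool vendorPrefersOutOfOrder)
    {
        if (vendorPrefersOutOfOrder && deviceSupportsOutOfOrder)
        {
            return requested | QueueFlagsSketch.OutOfOrderExecution;
        }

        return requested & ~QueueFlagsSketch.OutOfOrderExecution;
    }
}
```

Clearing the flag when the device lacks support keeps the returned properties valid even if the caller asked for out-of-order execution.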

CanHandle(OpenCLPlatformInfo)

Determines if this adapter can handle the specified platform.

bool CanHandle(OpenCLPlatformInfo platform)

Parameters

platform OpenCLPlatformInfo: The OpenCL platform to evaluate.

Returns

bool: true if this adapter can handle the platform; otherwise, false.
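One plausible way to use CanHandle is to walk a list of adapters and pick the first one that claims the platform. The sketch below assumes this pattern; it is not taken from the DotCompute source. `PlatformInfoSketch`, `IVendorAdapterSketch`, `KeywordAdapter`, and `AdapterSelector` are hypothetical names, and matching on the platform's vendor string is an assumption about how an adapter might decide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins: the real OpenCLPlatformInfo and IOpenCLVendorAdapter
// have more members than shown here.
public sealed record PlatformInfoSketch(string Vendor);

public interface IVendorAdapterSketch
{
    string VendorName { get; }
    bool CanHandle(PlatformInfoSketch platform);
}

// Claims a platform when its vendor string contains a keyword (case-insensitive).
public sealed class KeywordAdapter : IVendorAdapterSketch
{
    private readonly string _keyword;

    public KeywordAdapter(string vendorName, string keyword)
    {
        VendorName = vendorName;
        _keyword = keyword;
    }

    public string VendorName { get; }

    public bool CanHandle(PlatformInfoSketch platform) =>
        platform.Vendor.Contains(_keyword, StringComparison.OrdinalIgnoreCase);
}

public static class AdapterSelector
{
    // Returns the first adapter that claims the platform, or the fallback if none does.
    public static IVendorAdapterSketch Select(
        IEnumerable<IVendorAdapterSketch> adapters,
        PlatformInfoSketch platform,
        IVendorAdapterSketch fallback) =>
        adapters.FirstOrDefault(a => a.CanHandle(platform)) ?? fallback;
}
```

Because the first match wins, more specific adapters would need to come before generic ones in the list.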

GetCompilerOptions(bool)

Gets vendor-specific compiler options.

string GetCompilerOptions(bool enableOptimizations)

Parameters

enableOptimizations bool: Whether to enable aggressive optimizations.

Returns

string: Compiler options string suitable for clBuildProgram.

Remarks

The flags below are standard OpenCL build options rather than vendor extensions; what differs is the set chosen for each vendor:

Common: -cl-mad-enable, -cl-fast-relaxed-math
NVIDIA: -cl-denorms-are-zero
AMD: -cl-unsafe-math-optimizations
Intel: Conservative optimizations for better compatibility
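The flag list above could translate into an options string along these lines. This is a minimal sketch under assumptions: `VendorSketch` and `CompilerOptionsSketch` are hypothetical names, and the exact per-vendor sets (in particular, limiting Intel to `-cl-mad-enable`) are a guess at what "conservative" means, not the library's actual behavior.

```csharp
using System.Collections.Generic;

// Hypothetical vendor enum; the real OpenCLVendor type may have other members.
public enum VendorSketch { Nvidia, Amd, Intel, Other }

public static class CompilerOptionsSketch
{
    // Builds a space-separated option string of the kind passed to clBuildProgram.
    // Returns an empty string when optimizations are disabled.
    public static string Build(VendorSketch vendor, bool enableOptimizations)
    {
        if (!enableOptimizations)
        {
            return string.Empty;
        }

        var options = new List<string> { "-cl-mad-enable" };

        switch (vendor)
        {
            case VendorSketch.Nvidia:
                options.Add("-cl-fast-relaxed-math");
                options.Add("-cl-denorms-are-zero");
                break;
            case VendorSketch.Amd:
                options.Add("-cl-fast-relaxed-math");
                options.Add("-cl-unsafe-math-optimizations");
                break;
            default:
                // Assumption: Intel and unknown vendors keep only the mildest flag.
                break;
        }

        return string.Join(" ", options);
    }
}
```

Note that the relaxed-math flags trade IEEE 754 accuracy for speed, so a caller that needs strict numerics would pass `enableOptimizations: false`.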

GetOptimalLocalMemorySize(OpenCLDeviceInfo)

Gets the optimal local memory size for this vendor.

long GetOptimalLocalMemorySize(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo: The device to query.

Returns

long: The recommended local memory size in bytes.

Remarks

Local memory (shared memory in CUDA terms) has vendor-specific limits. Typical values, which vary by architecture:

NVIDIA: 48KB per SM (configurable with L1 cache)
AMD: 64KB per CU
Intel: 64KB per subslice
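A simple way to combine a vendor's typical figure with what a device actually reports is to take the smaller of the two. The sketch below assumes that approach; `LocalMemorySketch` is a hypothetical helper, and the device-reported value stands for what `CL_DEVICE_LOCAL_MEM_SIZE` would return.

```csharp
using System;

public static class LocalMemorySketch
{
    // Returns the smaller of the device-reported local memory size and the
    // vendor's typical size. If the device reports nothing usable (<= 0),
    // falls back to the vendor's typical size.
    public static long Recommend(long deviceReportedBytes, long vendorTypicalBytes)
    {
        if (vendorTypicalBytes <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(vendorTypicalBytes));
        }

        return deviceReportedBytes > 0
            ? Math.Min(deviceReportedBytes, vendorTypicalBytes)
            : vendorTypicalBytes;
    }
}
```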

GetOptimalWorkGroupSize(OpenCLDeviceInfo, int)

Gets the optimal work group size for a kernel on this vendor's hardware.

int GetOptimalWorkGroupSize(OpenCLDeviceInfo device, int defaultSize)

Parameters

device OpenCLDeviceInfo: The device to optimize for.
defaultSize int: The default work group size to use as a baseline.

Returns

int: The optimal work group size for this vendor.

Remarks

Work group sizing is critical for GPU performance:

NVIDIA: Prefer multiples of 32 (warp size)
AMD: Prefer multiples of 32 or 64 (wavefront size, depending on architecture)
Intel: Prefer multiples of 16 (SIMD width)
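The "prefer multiples of N" guidance above can be sketched as rounding the default size down to a multiple of the execution width, then clamping to the device maximum. `WorkGroupSizing` and its parameters are hypothetical; the real method takes an `OpenCLDeviceInfo`, whose members are not listed on this page.

```csharp
using System;

public static class WorkGroupSizing
{
    // Rounds defaultSize down to a multiple of executionWidth (warp, wavefront,
    // or SIMD width) without exceeding maxWorkGroupSize.
    public static int Optimal(int defaultSize, int executionWidth, int maxWorkGroupSize)
    {
        if (executionWidth <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(executionWidth));
        }

        if (maxWorkGroupSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxWorkGroupSize));
        }

        int clamped = Math.Clamp(defaultSize, 1, maxWorkGroupSize);
        int rounded = (clamped / executionWidth) * executionWidth;

        // If the size is smaller than one execution width, use a single width
        // when the device allows it, otherwise the device maximum.
        if (rounded == 0)
        {
            rounded = Math.Min(executionWidth, maxWorkGroupSize);
        }

        return rounded;
    }
}
```

For example, a default of 100 with a width of 32 rounds down to 96, so no warp is left partially filled.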

GetRecommendedBufferAlignment(OpenCLDeviceInfo)

Gets recommended buffer alignment for optimal memory access.

int GetRecommendedBufferAlignment(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo: The device to optimize for.

Returns

int: The recommended alignment in bytes.

Remarks

Proper alignment helps the hardware coalesce memory accesses:

NVIDIA: 128-byte alignment for coalescing
AMD: 256-byte alignment for optimal access
Intel: 64-byte alignment (cache line size)
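A caller would typically use the returned alignment to pad buffer sizes or offsets up to the next aligned boundary. The helper below is a minimal sketch of that rounding; `BufferAlignment.AlignUp` is a hypothetical name, and it assumes the alignment is a power of two, as all the values listed above are.

```csharp
using System;

public static class BufferAlignment
{
    // Rounds sizeInBytes up to the next multiple of alignment.
    // The bit trick below is only valid for power-of-two alignments.
    public static long AlignUp(long sizeInBytes, int alignment)
    {
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
        {
            throw new ArgumentException("Alignment must be a positive power of two.", nameof(alignment));
        }

        if (sizeInBytes < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));
        }

        long mask = (long)alignment - 1;
        return (sizeInBytes + mask) & ~mask;
    }
}
```

With a 128-byte alignment, a 1000-byte request becomes 1024 bytes, while a size that is already aligned is returned unchanged.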

IsExtensionReliable(string, OpenCLDeviceInfo)

Checks whether a specific extension is reliably supported by this vendor. Some vendors report extensions whose implementations have bugs or limitations.

bool IsExtensionReliable(string extension, OpenCLDeviceInfo device)

Parameters

extension string: The extension name (e.g., "cl_khr_fp64").
device OpenCLDeviceInfo: The device to check.

Returns

bool: true if the extension is reliably supported; otherwise, false.

Remarks

Not all advertised extensions work correctly on all hardware. This method lets an adapter exclude extensions known to be problematic on its vendor's devices.
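One way an adapter might implement this check is with a blocklist applied on top of the device's advertised extensions. The sketch below assumes that design; `ExtensionFilterSketch` is a hypothetical class, and the blocklist contents are supplied by the caller purely for illustration. It does not claim that any particular extension is unreliable on real hardware.

```csharp
using System;
using System.Collections.Generic;

public sealed class ExtensionFilterSketch
{
    private readonly HashSet<string> _unreliable;

    // unreliable: extension names this (hypothetical) adapter refuses to trust.
    public ExtensionFilterSketch(IEnumerable<string> unreliable)
    {
        _unreliable = new HashSet<string>(unreliable, StringComparer.Ordinal);
    }

    // advertisedExtensions: a space-separated list in the format of the
    // CL_DEVICE_EXTENSIONS device query.
    // Returns true only if the extension is advertised and not blocklisted.
    public bool IsReliable(string extension, string advertisedExtensions)
    {
        string[] advertised = advertisedExtensions.Split(
            ' ', StringSplitOptions.RemoveEmptyEntries);

        bool isAdvertised = Array.IndexOf(advertised, extension) >= 0;
        return isAdvertised && !_unreliable.Contains(extension);
    }
}
```

Checking the advertised list first means an extension the device never reported is treated as unavailable rather than merely unlisted.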

SupportsPersistentKernels(OpenCLDeviceInfo)

Indicates if this vendor benefits from persistent kernels.

bool SupportsPersistentKernels(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo: The device to check.

Returns

bool: true if persistent kernels are beneficial; otherwise, false.

Remarks

Persistent kernels keep work groups resident on the device and have them loop over incoming work instead of being relaunched for each batch. Avoiding repeated launch overhead can improve performance for streaming workloads on high-end GPUs.
