Class AmdOpenCLAdapter

Namespace
DotCompute.Backends.OpenCL.Vendor
Assembly
DotCompute.Backends.OpenCL.dll

Vendor adapter for AMD GPUs with ROCm-specific optimizations.

public sealed class AmdOpenCLAdapter : IOpenCLVendorAdapter
Inheritance
object
AmdOpenCLAdapter
Implements
IOpenCLVendorAdapter
Inherited Members

Remarks

AMD's OpenCL implementation has evolved through several GPU architectures:

  • GCN (Graphics Core Next): 64-wide wavefronts, 64KB LDS per CU
  • RDNA/RDNA2: 32-wide wavefronts, improved performance
  • RDNA3: 32-wide wavefronts with dual-issue, enhanced ray tracing, AI acceleration

Architecture detection:

  • RX 6000/7000 series: RDNA2/RDNA3 (32-wide wavefronts)
  • RX 5000 and older: treated as GCN (64-wide wavefronts), although the RX 5000 series itself is first-generation RDNA
  • MI series: GCN/CDNA (data center focused)
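
The mapping above can be sketched as a name-based heuristic. This is only an illustration: the class, method, and regular expression are hypothetical, the adapter's real detection logic is not shown on this page, and the sketch assumes the marketing name (for example "Radeon RX 6800 XT") is available as a string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of DotCompute.
public static class WavefrontGuessSketch
{
    // Assumption: four-digit "RX 6xxx" / "RX 7xxx" model numbers mark
    // the 32-wide wavefront parts listed above.
    private static readonly Regex Wave32Models =
        new Regex(@"RX\s*[67]\d{3}", RegexOptions.IgnoreCase);

    // Returns 32 for RX 6000/7000 names and 64 for everything else
    // (older RX cards and MI-series accelerators).
    public static int GuessWavefrontSize(string deviceName)
    {
        if (deviceName == null) throw new ArgumentNullException(nameof(deviceName));
        return Wave32Models.IsMatch(deviceName) ? 32 : 64;
    }
}
```

For example, `WavefrontGuessSketch.GuessWavefrontSize("AMD Radeon RX 580")` falls through to the 64-wide default because the three-digit model number does not match the pattern.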

Optimization priorities:

  1. Align work groups to wavefront boundaries
  2. Use 256-byte alignment for optimal memory access
  3. Leverage out-of-order execution
  4. Enable aggressive math optimizations

Properties

Vendor

Gets the vendor type this adapter handles.

public OpenCLVendor Vendor { get; }

Property Value

OpenCLVendor

VendorName

Gets the vendor's display name.

public string VendorName { get; }

Property Value

string

Methods

ApplyVendorOptimizations(QueueProperties, OpenCLDeviceInfo)

Applies vendor-specific queue properties.

public QueueProperties ApplyVendorOptimizations(QueueProperties properties, OpenCLDeviceInfo device)

Parameters

properties QueueProperties

The base queue properties.

device OpenCLDeviceInfo

The device the queue will operate on.

Returns

QueueProperties

Modified queue properties optimized for this vendor.

Remarks

Some vendors benefit from out-of-order execution, while others work better with in-order queues depending on the workload characteristics.
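
As a rough illustration of this kind of adjustment, the sketch below adds an out-of-order flag to a requested set of queue flags while leaving the other flags untouched. The enum and method are hypothetical stand-ins, since the members of QueueProperties are not listed on this page; only the bit positions mirror the standard OpenCL constants.

```csharp
using System;

// Hypothetical stand-in for the library's queue settings.
[Flags]
public enum QueueFlagsSketch : ulong
{
    None = 0,
    OutOfOrderExecution = 1UL << 0, // same bit as CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
    Profiling = 1UL << 1            // same bit as CL_QUEUE_PROFILING_ENABLE
}

public static class QueueTuningSketch
{
    // Adds out-of-order execution only when the device reports support for it;
    // every flag the caller requested is preserved.
    public static QueueFlagsSketch PreferOutOfOrder(
        QueueFlagsSketch requested, bool deviceSupportsOutOfOrder)
    {
        return deviceSupportsOutOfOrder
            ? requested | QueueFlagsSketch.OutOfOrderExecution
            : requested;
    }
}
```

Keeping the caller's flags intact means that a queue created with profiling enabled still profiles after the vendor adjustment.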

CanHandle(OpenCLPlatformInfo)

Determines if this adapter can handle the specified platform.

public bool CanHandle(OpenCLPlatformInfo platform)

Parameters

platform OpenCLPlatformInfo

The OpenCL platform to evaluate.

Returns

bool

true if this adapter can handle the platform; otherwise, false.
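
A check like this usually comes down to string matching on the platform's vendor and name. The sketch below assumes both values are available as plain strings; the helper is hypothetical and the members of OpenCLPlatformInfo are not documented on this page. AMD's OpenCL platforms commonly report the vendor "Advanced Micro Devices, Inc.".

```csharp
using System;

// Hypothetical helper, not part of DotCompute.
public static class AmdPlatformMatchSketch
{
    // Returns true when either string looks like an AMD platform.
    public static bool LooksLikeAmd(string vendor, string platformName)
    {
        return Contains(vendor, "Advanced Micro Devices")
            || Contains(vendor, "AMD")
            || Contains(platformName, "AMD");
    }

    // Null-safe, case-insensitive substring test.
    private static bool Contains(string text, string token)
    {
        return text != null
            && text.IndexOf(token, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```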

GetCompilerOptions(bool)

Gets vendor-specific compiler options.

public string GetCompilerOptions(bool enableOptimizations)

Parameters

enableOptimizations bool

Whether to enable aggressive optimizations.

Returns

string

Compiler options string suitable for clBuildProgram.

Remarks

Each vendor adapter selects a different set of the standard OpenCL build options:

  • Common: -cl-mad-enable, -cl-fast-relaxed-math
  • NVIDIA: -cl-denorms-are-zero
  • AMD: -cl-unsafe-math-optimizations
  • Intel: Conservative optimizations for better compatibility
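
A minimal sketch of assembling such an options string for AMD follows. It combines the common flags with the AMD flag from the list above. Returning an empty string when optimizations are disabled is an assumption, and the helper class is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DotCompute.
public static class AmdBuildOptionsSketch
{
    public static string Build(bool enableOptimizations)
    {
        // Assumption: no extra flags when optimizations are turned off.
        if (!enableOptimizations) return string.Empty;

        var flags = new List<string>
        {
            "-cl-mad-enable",                // common
            "-cl-fast-relaxed-math",         // common
            "-cl-unsafe-math-optimizations"  // listed for AMD
        };

        // clBuildProgram takes the options as one space-separated string.
        return string.Join(" ", flags);
    }
}
```

These flags trade strict IEEE 754 behavior for speed, so kernels that depend on exact floating-point results should be built with enableOptimizations set to false.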

GetOptimalLocalMemorySize(OpenCLDeviceInfo)

Gets the optimal local memory size for this vendor.

public long GetOptimalLocalMemorySize(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo

The device to query.

Returns

long

The recommended local memory size in bytes.

Remarks

Local memory (shared memory in CUDA terms) has vendor-specific limits:

  • NVIDIA: 48KB per SM (configurable with L1 cache)
  • AMD: 64KB per CU
  • Intel: 64KB per subslice
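
One plausible way to turn the AMD figure into a recommendation is to cap it at whatever the device itself reports. The sketch below assumes the reported local memory size is passed in as a number of bytes. The helper and its fallback behavior are hypothetical, not the adapter's documented logic.

```csharp
using System;

// Hypothetical helper, not part of DotCompute.
public static class LocalMemorySketch
{
    // 64KB of LDS per compute unit, as listed above for AMD.
    public const long AmdLdsBytesPerCu = 64 * 1024;

    public static long RecommendLocalMemory(long deviceReportedLocalMemBytes)
    {
        // Assumption: a non-positive value means "unknown", so fall back
        // to the vendor default.
        if (deviceReportedLocalMemBytes <= 0) return AmdLdsBytesPerCu;

        // Never recommend more than the device says it has.
        return Math.Min(AmdLdsBytesPerCu, deviceReportedLocalMemBytes);
    }
}
```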

GetOptimalWorkGroupSize(OpenCLDeviceInfo, int)

Gets the optimal work group size for a kernel on this vendor's hardware.

public int GetOptimalWorkGroupSize(OpenCLDeviceInfo device, int defaultSize)

Parameters

device OpenCLDeviceInfo

The device to optimize for.

defaultSize int

The default work group size to use as a baseline.

Returns

int

The optimal work group size for this vendor.

Remarks

Work group sizing is critical for GPU performance:

  • NVIDIA: Prefer multiples of 32 (warp size)
  • AMD: Prefer multiples of 64 on GCN/CDNA or 32 on RDNA (wavefront size)
  • Intel: Prefer multiples of 16 (SIMD width)
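
The rounding itself is simple integer arithmetic. The sketch below rounds a requested size up to the next wavefront multiple and then clamps it to the device limit. The helper and its parameter names are hypothetical, and the adapter may apply additional rules.

```csharp
using System;

// Hypothetical helper, not part of DotCompute.
public static class WorkGroupSketch
{
    public static int AlignToWavefront(int requested, int wavefrontSize, int maxWorkGroupSize)
    {
        if (requested <= 0) throw new ArgumentOutOfRangeException(nameof(requested));
        if (wavefrontSize <= 0) throw new ArgumentOutOfRangeException(nameof(wavefrontSize));
        if (maxWorkGroupSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxWorkGroupSize));

        // Round up to the next multiple of the wavefront size.
        int rounded = ((requested + wavefrontSize - 1) / wavefrontSize) * wavefrontSize;

        // Largest wavefront multiple that still fits the device limit.
        int largestFit = (maxWorkGroupSize / wavefrontSize) * wavefrontSize;

        // If the limit is smaller than one wavefront, the limit wins.
        if (largestFit == 0) return maxWorkGroupSize;

        return Math.Min(rounded, largestFit);
    }
}
```

With a 64-wide wavefront and a 256 work-item limit, a request for 100 becomes 128 and a request for 300 is clamped to 256.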

GetRecommendedBufferAlignment(OpenCLDeviceInfo)

Gets recommended buffer alignment for optimal memory access.

public int GetRecommendedBufferAlignment(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo

The device to optimize for.

Returns

int

The recommended alignment in bytes.

Remarks

Proper alignment helps the hardware coalesce memory accesses:

  • NVIDIA: 128-byte alignment for coalescing
  • AMD: 256-byte alignment for optimal access
  • Intel: 64-byte alignment (cache line size)
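
A caller would typically use the returned alignment to pad buffer sizes or offsets. The sketch below rounds a byte count up to a power-of-two alignment, defaulting to the 256 bytes listed for AMD. The helper is hypothetical and not part of the adapter.

```csharp
using System;

// Hypothetical helper, not part of DotCompute.
public static class BufferAlignmentSketch
{
    public static long AlignUp(long sizeInBytes, int alignment = 256)
    {
        if (sizeInBytes < 0) throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        // The bit trick below only works for positive powers of two.
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("Alignment must be a power of two.", nameof(alignment));

        long mask = alignment - 1;
        return (sizeInBytes + mask) & ~mask;
    }
}
```

For example, a 1000-byte request is padded to 1024 bytes under the default 256-byte alignment.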

IsExtensionReliable(string, OpenCLDeviceInfo)

Checks if a specific extension is reliably supported by this vendor. Some vendors advertise extensions whose implementations have bugs or limitations.

public bool IsExtensionReliable(string extension, OpenCLDeviceInfo device)

Parameters

extension string

The extension name (e.g., "cl_khr_fp64").

device OpenCLDeviceInfo

The device to check.

Returns

bool

true if the extension is reliably supported; otherwise, false.

Remarks

Not all advertised extensions work correctly on all hardware. This method allows a vendor adapter to exclude problematic extensions.
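
A minimal sketch of such a check follows. It assumes the device's extensions are available as the space-separated string that OpenCL reports, and it takes the exclusion list as a parameter. The helper is hypothetical, and the extension name used for the excluded entry in the usage line is a made-up placeholder, not a real known-bad extension.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DotCompute.
public static class ExtensionCheckSketch
{
    // An extension is usable only if the device advertises it
    // and it is not on the exclusion list.
    public static bool IsUsable(string extension, string advertisedExtensions, ISet<string> excluded)
    {
        if (extension == null) throw new ArgumentNullException(nameof(extension));
        if (advertisedExtensions == null) throw new ArgumentNullException(nameof(advertisedExtensions));
        if (excluded == null) throw new ArgumentNullException(nameof(excluded));

        string[] advertised = advertisedExtensions.Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        bool isAdvertised = Array.IndexOf(advertised, extension) >= 0;
        return isAdvertised && !excluded.Contains(extension);
    }
}
```

Usage might look like `ExtensionCheckSketch.IsUsable("cl_khr_fp64", extensionString, new HashSet<string> { "cl_example_broken_ext" })`, where extensionString is whatever the device reported.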

SupportsPersistentKernels(OpenCLDeviceInfo)

Indicates if this vendor benefits from persistent kernels.

public bool SupportsPersistentKernels(OpenCLDeviceInfo device)

Parameters

device OpenCLDeviceInfo

The device to check.

Returns

bool

true if persistent kernels are beneficial; otherwise, false.

Remarks

Persistent kernels keep work groups resident on the device and loop over incoming work instead of being relaunched for each batch, which can improve performance for streaming workloads on high-end GPUs.
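
One common way to realize a persistent kernel is to launch only as many work groups as the device can keep resident and let each of them loop over the workload. The sketch below computes such a launch size. The helper, its parameters, and the choice of work groups per compute unit are all hypothetical; suitable values depend on the kernel's register and local memory usage.

```csharp
using System;

// Hypothetical helper, not part of DotCompute.
public static class PersistentLaunchSketch
{
    // Global work size that fills the device once:
    // compute units x resident work groups per CU x work-group size.
    public static long PersistentGlobalSize(int computeUnits, int workGroupsPerCu, int workGroupSize)
    {
        if (computeUnits <= 0) throw new ArgumentOutOfRangeException(nameof(computeUnits));
        if (workGroupsPerCu <= 0) throw new ArgumentOutOfRangeException(nameof(workGroupsPerCu));
        if (workGroupSize <= 0) throw new ArgumentOutOfRangeException(nameof(workGroupSize));

        // Widen to long before multiplying to avoid int overflow.
        return (long)computeUnits * workGroupsPerCu * workGroupSize;
    }
}
```

Each work item would then process elements at a stride equal to this global size until the input is exhausted.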