Class OpenCLAccelerator

Namespace
DotCompute.Backends.OpenCL
Assembly
DotCompute.Backends.OpenCL.dll

Production-ready OpenCL implementation of the compute accelerator interface. Integrates all Phase 1 and Phase 2 Week 1 infrastructure for complete OpenCL support.

public sealed class OpenCLAccelerator : IAccelerator, IAsyncDisposable
Inheritance
object
OpenCLAccelerator
Implements
IAccelerator
IAsyncDisposable

Remarks

This accelerator provides comprehensive OpenCL functionality with:

  • Memory Management: Buffer pooling with 90%+ allocation reduction via OpenCLMemoryPoolManager
  • Compilation: Multi-tier caching (memory + disk) via OpenCLCompilationCache
  • Execution: NDRange kernel dispatch with automatic work size optimization via OpenCLKernelExecutionEngine
  • Profiling: Event-based performance tracking with hardware counter integration via OpenCLProfiler
  • Monitoring: Real-time metrics collection and SLA compliance tracking
  • Vendor Optimization: Automatic detection and application of vendor-specific optimizations

Thread-safe, async-first design with comprehensive error handling and diagnostic logging.

Constructors

OpenCLAccelerator(OpenCLDeviceInfo, ILoggerFactory, OpenCLConfiguration?)

Initializes a new instance with a specific device.

public OpenCLAccelerator(OpenCLDeviceInfo device, ILoggerFactory loggerFactory, OpenCLConfiguration? configuration = null)

Parameters

device OpenCLDeviceInfo

The OpenCL device to use.

loggerFactory ILoggerFactory

Logger factory for creating loggers.

configuration OpenCLConfiguration

Optional configuration for the accelerator. If null, default configuration is used.

OpenCLAccelerator(OpenCLDeviceInfo, ILogger<OpenCLAccelerator>, OpenCLConfiguration?)

Initializes a new instance with a specific device.

public OpenCLAccelerator(OpenCLDeviceInfo device, ILogger<OpenCLAccelerator> logger, OpenCLConfiguration? configuration = null)

Parameters

device OpenCLDeviceInfo

The OpenCL device to use.

logger ILogger<OpenCLAccelerator>

Logger for diagnostic information.

configuration OpenCLConfiguration

Optional configuration for the accelerator. If null, default configuration is used.

OpenCLAccelerator(ILoggerFactory, OpenCLConfiguration?)

Initializes a new instance of the OpenCLAccelerator class.

public OpenCLAccelerator(ILoggerFactory loggerFactory, OpenCLConfiguration? configuration = null)

Parameters

loggerFactory ILoggerFactory

Logger factory for creating loggers.

configuration OpenCLConfiguration

Optional configuration for the accelerator. If null, default configuration is used.

OpenCLAccelerator(ILogger<OpenCLAccelerator>, OpenCLConfiguration?)

Initializes a new instance of the OpenCLAccelerator class.

public OpenCLAccelerator(ILogger<OpenCLAccelerator> logger, OpenCLConfiguration? configuration = null)

Parameters

logger ILogger<OpenCLAccelerator>

Logger for diagnostic information.

configuration OpenCLConfiguration

Optional configuration for the accelerator. If null, default configuration is used.
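
The constructor and lifecycle members on this page can be combined as in the following minimal sketch. It assumes the Microsoft.Extensions.Logging.Abstractions package is referenced (for NullLoggerFactory) and that an OpenCL runtime is installed. It also assumes that, with the device-less overload, device selection happens in InitializeAsync, as that method's summary suggests.

```csharp
using System;
using System.Threading.Tasks;
using DotCompute.Backends.OpenCL;
using Microsoft.Extensions.Logging.Abstractions;

public static class Program
{
    public static async Task Main()
    {
        // NullLoggerFactory discards all log output; pass a real ILoggerFactory in an application.
        // configuration is omitted, so the default configuration is used.
        // "await using" calls DisposeAsync when the variable goes out of scope.
        await using var accelerator = new OpenCLAccelerator(NullLoggerFactory.Instance);

        // Initialize before using the accelerator; several properties throw
        // InvalidOperationException while it is not initialized.
        await accelerator.InitializeAsync();

        Console.WriteLine($"{accelerator.Name} ({accelerator.DeviceType}), available: {accelerator.IsAvailable}");
    }
}
```

Because the class implements IAsyncDisposable, `await using` is probably the simplest way to make sure DisposeAsync runs even when an exception is thrown.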

Properties

Configuration

Gets the configuration used by this accelerator. Contains settings for stream management, event pooling, memory management, and vendor-specific optimizations.

public OpenCLConfiguration Configuration { get; }

Property Value

OpenCLConfiguration

The OpenCL configuration instance. Never null.

Context

Gets the accelerator context.

public AcceleratorContext Context { get; }

Property Value

AcceleratorContext

DeviceInfo

Gets the device information for the selected device.

public OpenCLDeviceInfo? DeviceInfo { get; }

Property Value

OpenCLDeviceInfo

DeviceType

Gets the device type as a string.

public string DeviceType { get; }

Property Value

string

EventManager

Gets the event manager for this accelerator. Provides pooled events for synchronization and profiling.

public OpenCLEventManager EventManager { get; }

Property Value

OpenCLEventManager

Exceptions

InvalidOperationException

Thrown when the accelerator is not initialized.

Id

Gets the unique identifier for this accelerator instance.

public Guid Id { get; }

Property Value

Guid

Info

Gets the accelerator information.

public AcceleratorInfo Info { get; }

Property Value

AcceleratorInfo

IsAvailable

Gets whether the accelerator is available for use.

public bool IsAvailable { get; }

Property Value

bool

IsDisposed

Gets whether the accelerator has been disposed.

public bool IsDisposed { get; }

Property Value

bool

Memory

Gets the memory manager for this accelerator.

public IUnifiedMemoryManager Memory { get; }

Property Value

IUnifiedMemoryManager

MemoryManager

Gets the memory manager for this accelerator (alias for Memory).

public IUnifiedMemoryManager MemoryManager { get; }

Property Value

IUnifiedMemoryManager

Name

Gets the accelerator name.

public string Name { get; }

Property Value

string

StreamManager

Gets the stream (command queue) manager for this accelerator. Provides pooled command queues for asynchronous execution.

public OpenCLStreamManager StreamManager { get; }

Property Value

OpenCLStreamManager

Exceptions

InvalidOperationException

Thrown when the accelerator is not initialized.

Type

Gets the accelerator type.

public AcceleratorType Type { get; }

Property Value

AcceleratorType

VendorAdapter

Gets the vendor-specific adapter for this accelerator. Provides vendor-specific optimizations and extensions.

public IOpenCLVendorAdapter VendorAdapter { get; }

Property Value

IOpenCLVendorAdapter

Exceptions

InvalidOperationException

Thrown when the accelerator is not initialized.

Methods

AllocateAsync<T>(nuint, MemoryOptions?, CancellationToken)

Creates a memory buffer of the specified type and size.

public Task<IUnifiedMemoryBuffer<T>> AllocateAsync<T>(nuint elementCount, MemoryOptions? options = null, CancellationToken cancellationToken = default) where T : unmanaged

Parameters

elementCount nuint

Number of elements in the buffer.

options MemoryOptions?

Memory allocation options.

cancellationToken CancellationToken

Cancellation token for the operation.

Returns

Task<IUnifiedMemoryBuffer<T>>

A new memory buffer.

Type Parameters

T

The element type for the buffer.
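
As a hedged illustration of the signature above, the sketch below allocates a buffer of 1024 floats. The helper class and method names are hypothetical, and the accelerator is assumed to be initialized already. The members of IUnifiedMemoryBuffer<T> are not described on this page, so the example does not call any of them.

```csharp
using System.Threading;
using System.Threading.Tasks;
using DotCompute.Backends.OpenCL;

public static class AllocationExample
{
    // Hypothetical helper; assumes the accelerator has already been initialized.
    public static async Task AllocateFloatsAsync(
        OpenCLAccelerator accelerator,
        CancellationToken cancellationToken = default)
    {
        // elementCount counts elements of T, so this requests 1024 floats, not 1024 bytes.
        const int elementCount = 1024;

        // options is omitted, which passes null for MemoryOptions?.
        var buffer = await accelerator.AllocateAsync<float>(
            (nuint)elementCount,
            cancellationToken: cancellationToken);

        // buffer is an IUnifiedMemoryBuffer<float>; consult that interface for
        // copy and release operations, which this page does not document.
        _ = buffer;
    }
}
```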

CompileKernelAsync(KernelDefinition, CompilationOptions?, CancellationToken)

Compiles a kernel from source code.

public ValueTask<ICompiledKernel> CompileKernelAsync(KernelDefinition definition, CompilationOptions? options = null, CancellationToken cancellationToken = default)

Parameters

definition KernelDefinition

Kernel definition containing source code and entry point.

options CompilationOptions

Compilation options.

cancellationToken CancellationToken

Cancellation token for the operation.

Returns

ValueTask<ICompiledKernel>

A compiled kernel ready for execution.

Dispose()

Disposes the OpenCL accelerator and associated resources.

public void Dispose()

DisposeAsync()

Asynchronously disposes the OpenCL accelerator.

public ValueTask DisposeAsync()

Returns

ValueTask

GetHealthSnapshotAsync(CancellationToken)

Gets a comprehensive health snapshot of the OpenCL device.

public ValueTask<DeviceHealthSnapshot> GetHealthSnapshotAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<DeviceHealthSnapshot>

A task containing the device health snapshot.

Remarks

OpenCL provides limited health monitoring capabilities compared to CUDA/NVML or Metal. The standard OpenCL API does not expose detailed hardware health metrics like temperature, power consumption, or real-time utilization.

Available Metrics (Standard OpenCL):

  • Device availability status
  • Global memory capacity and cache information
  • Compute capability (compute units, clock frequency)
  • Driver and OpenCL version information

Vendor-Specific Extensions (Not Implemented):

  • NVIDIA: cl_nv_device_attribute_query (temperature, thermal throttling)
  • AMD: cl_amd_device_attribute_query (temperature, fan speed)
  • Intel: cl_intel_device_attribute_query (power, frequency)

Limitations:

  • No real-time utilization metrics (GPU/memory usage)
  • No temperature sensors (requires vendor extensions)
  • No power consumption metrics
  • No throttling status detection
  • No PCIe metrics

For production health monitoring, consider using vendor-specific tools:

  • NVIDIA: nvidia-smi, NVML
  • AMD: rocm-smi, ROCm System Management Interface
  • Intel: intel-gpu-tools, Level Zero API

Performance: Negligible overhead (sub-millisecond) as metrics are queried from cached device information.

GetProfilingMetricsAsync(CancellationToken)

Gets current profiling metrics from the OpenCL device.

public ValueTask<IReadOnlyList<ProfilingMetric>> GetProfilingMetricsAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<IReadOnlyList<ProfilingMetric>>

A task containing the collection of profiling metrics.

GetProfilingSnapshotAsync(CancellationToken)

Gets a comprehensive profiling snapshot of OpenCL accelerator performance.

public ValueTask<ProfilingSnapshot> GetProfilingSnapshotAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<ProfilingSnapshot>

A task containing the profiling snapshot.

Remarks

OpenCL profiling uses OpenCL events for hardware-accurate timing of kernel executions and memory transfers. Event-based profiling provides microsecond precision with minimal CPU overhead.

Available Metrics:

  • Kernel execution statistics (average, min, max, median, P95, P99)
  • Memory transfer statistics (host-device bandwidth, transfer counts)
  • Device utilization (estimated from execution patterns)
  • Queue wait times and submission latencies
  • Performance trends and bottleneck identification

Profiling Overhead: Minimal (<0.5%) as OpenCL Events are hardware-managed.

Performance: Typically less than 1ms to collect and aggregate metrics.
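
To make the kernel execution statistics listed above concrete, the following self-contained sketch computes average, min, max, median, P95, and P99 from a set of durations using the nearest-rank percentile method. It is an illustration only: the class name and sample values are hypothetical, and this page does not say which percentile method the accelerator itself uses.

```csharp
using System;
using System.Linq;

public static class KernelTimingStats
{
    // Nearest-rank percentile over an ascending-sorted array; p is in (0, 100].
    public static double Percentile(double[] sortedAscending, double p)
    {
        if (sortedAscending.Length == 0)
            throw new ArgumentException("At least one sample is required.", nameof(sortedAscending));

        // Rank is 1-based: the smallest rank whose share of samples is at least p percent.
        int rank = (int)Math.Ceiling(p / 100.0 * sortedAscending.Length);
        return sortedAscending[Math.Clamp(rank, 1, sortedAscending.Length) - 1];
    }

    public static void Main()
    {
        // Hypothetical kernel durations in milliseconds, including one slow outlier.
        double[] samples = { 1.2, 0.9, 1.1, 4.8, 1.0, 1.3, 1.1, 0.8, 1.0, 1.2 };
        double[] sorted = samples.OrderBy(x => x).ToArray();

        Console.WriteLine($"avg={sorted.Average():F2} min={sorted[0]} max={sorted[^1]}");
        Console.WriteLine($"median={Percentile(sorted, 50)} p95={Percentile(sorted, 95)} p99={Percentile(sorted, 99)}");
    }
}
```

With few samples, a single outlier dominates the high percentiles, which is why P95 and P99 are usually read alongside the median rather than instead of it.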

GetSensorReadingsAsync(CancellationToken)

Gets current sensor readings from the OpenCL device.

public ValueTask<IReadOnlyList<SensorReading>> GetSensorReadingsAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<IReadOnlyList<SensorReading>>

A task containing the collection of sensor readings.

Remarks

OpenCL sensor readings are limited to static device properties. Real-time metrics (temperature, utilization, power) require vendor extensions.

InitializeAsync(CancellationToken)

Initializes the accelerator and selects the best available device.

public Task InitializeAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token for the operation.

Returns

Task

ResetAsync(ResetOptions?, CancellationToken)

Resets the OpenCL device to a clean state.

public ValueTask<ResetResult> ResetAsync(ResetOptions? options = null, CancellationToken cancellationToken = default)

Parameters

options ResetOptions

Reset options.

cancellationToken CancellationToken

Cancellation token for the operation.

Returns

ValueTask<ResetResult>

Remarks

OpenCL reset operations:

  • Soft: Flush and finish all command queues (clFlush + clFinish)
  • Context: Clear compilation cache, finish queues, release event resources
  • Hard: Release all memory allocations, clear caches, recreate context
  • Full: Complete context teardown and recreation with full reinitialization

Note: OpenCL does not have a direct device reset API like CUDA's cudaDeviceReset(). Reset is achieved through context recreation and resource cleanup.

SynchronizeAsync(CancellationToken)

Synchronizes all operations on the accelerator.

public ValueTask SynchronizeAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token for the operation.

Returns

ValueTask
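
A caller that wants to bound how long synchronization may take can pass a token from a timed CancellationTokenSource, as in this minimal sketch. The helper name is hypothetical, the accelerator is assumed to be initialized, and the example assumes cancellation surfaces as OperationCanceledException, which is the usual .NET convention but is not stated on this page.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using DotCompute.Backends.OpenCL;

public static class SynchronizeExample
{
    // Hypothetical helper: returns true if synchronization completed within the timeout.
    public static async Task<bool> TrySynchronizeAsync(OpenCLAccelerator accelerator, TimeSpan timeout)
    {
        // The token is cancelled automatically once the timeout elapses.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await accelerator.SynchronizeAsync(cts.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            // Assumption: the accelerator reports cancellation with this exception type.
            return false;
        }
    }
}
```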