
Class MetalAccelerator

Namespace
DotCompute.Backends.Metal.Accelerators
Assembly
DotCompute.Backends.Metal.dll

Metal-based compute accelerator for macOS and iOS devices. The class was migrated to derive from BaseAccelerator, which reduced its code by 65% while maintaining full functionality.

public sealed class MetalAccelerator : BaseAccelerator, IAccelerator, IAsyncDisposable
Inheritance
BaseAccelerator → MetalAccelerator
Implements
IAccelerator, IAsyncDisposable

Constructors

MetalAccelerator(IOptions<MetalAcceleratorOptions>, ILogger<MetalAccelerator>, IOptions<MetalTelemetryOptions>?, ILoggerFactory?)

public MetalAccelerator(IOptions<MetalAcceleratorOptions> options, ILogger<MetalAccelerator> logger, IOptions<MetalTelemetryOptions>? telemetryOptions = null, ILoggerFactory? loggerFactory = null)

Parameters

options IOptions<MetalAcceleratorOptions>
logger ILogger<MetalAccelerator>
telemetryOptions IOptions<MetalTelemetryOptions>
loggerFactory ILoggerFactory

Properties

Device

Gets the native Metal device handle for this accelerator.

public nint Device { get; }

Property Value

nint

Methods

CompileKernelCoreAsync(KernelDefinition, CompilationOptions, CancellationToken)

Implements the Metal-specific kernel compilation logic required by BaseAccelerator.

protected override ValueTask<ICompiledKernel> CompileKernelCoreAsync(KernelDefinition definition, CompilationOptions options, CancellationToken cancellationToken)

Parameters

definition KernelDefinition
options CompilationOptions
cancellationToken CancellationToken

Returns

ValueTask<ICompiledKernel>

DisposeCoreAsync()

Implements the Metal-specific disposal logic required by BaseAccelerator.

protected override ValueTask DisposeCoreAsync()

Returns

ValueTask

ExecuteKernelAsync(ICompiledKernel, GridDimensions, GridDimensions, params IUnifiedMemoryBuffer[])

Executes a compiled Metal kernel with explicit grid and thread group dimensions (convenience overload).

public Task ExecuteKernelAsync(ICompiledKernel kernel, GridDimensions gridDim, GridDimensions blockDim, params IUnifiedMemoryBuffer[] buffers)

Parameters

kernel ICompiledKernel

The compiled Metal kernel to execute.

gridDim GridDimensions

The grid dimensions (number of thread groups to dispatch).

blockDim GridDimensions

The thread group dimensions (threads per thread group).

buffers IUnifiedMemoryBuffer[]

Buffer parameters to bind to the kernel.

Returns

Task

A task representing the asynchronous execution operation.

Remarks

This convenience overload accepts the buffers as a params array, so they can be passed as individual arguments. It does not take a cancellation token.

ExecuteKernelAsync(ICompiledKernel, GridDimensions, GridDimensions, IUnifiedMemoryBuffer[], CancellationToken)

Executes a compiled Metal kernel with explicit grid and thread group dimensions.

public Task ExecuteKernelAsync(ICompiledKernel kernel, GridDimensions gridDim, GridDimensions blockDim, IUnifiedMemoryBuffer[] buffers, CancellationToken cancellationToken = default)

Parameters

kernel ICompiledKernel

The compiled Metal kernel to execute.

gridDim GridDimensions

The grid dimensions (number of thread groups to dispatch).

blockDim GridDimensions

The thread group dimensions (threads per thread group).

buffers IUnifiedMemoryBuffer[]

Buffer parameters to bind to the kernel.

cancellationToken CancellationToken

Optional cancellation token.

Returns

Task

A task representing the asynchronous execution operation.

Remarks

This method provides explicit control over kernel launch dimensions for advanced use cases like integration testing or performance optimization.

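Grid dimensions specify how many thread groups to launch; block dimensions specify the number of threads per thread group. Total threads = (thread groups in gridDim) × (threads in blockDim), where each count is the product of that dimension's X, Y, and Z extents.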

Example: gridDim=(10, 1, 1), blockDim=(256, 1, 1) launches 2,560 total threads.
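
The arithmetic above can be checked with a small self-contained sketch. Dim3 is a hypothetical stand-in used only for illustration, not the library's GridDimensions type, and the ceiling-division step is a common convention for sizing a grid to cover a given element count rather than something this API documents.

```csharp
using System;

// Reproduces the documented example: gridDim=(10, 1, 1), blockDim=(256, 1, 1).
var gridDim = new Dim3(10, 1, 1);
var blockDim = new Dim3(256, 1, 1);
long totalThreads = gridDim.Volume * blockDim.Volume;
Console.WriteLine(totalThreads); // 2560

// Common convention (an assumption, not documented by this API):
// round up so the grid covers every element of a 1D workload.
const long elementCount = 100_000;
long groupsNeeded = (elementCount + blockDim.X - 1) / blockDim.X;
Console.WriteLine(groupsNeeded); // 391

// Hypothetical stand-in for GridDimensions; the real type's members may differ.
readonly record struct Dim3(long X, long Y, long Z)
{
    // Number of items described by the three extents.
    public long Volume => X * Y * Z;
}
```

With rounding up, the last thread group may contain threads whose index falls beyond the element count (here 391 × 256 = 100,096 threads for 100,000 elements), so a kernel launched this way would typically need a bounds check.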

Exceptions

ArgumentNullException

Thrown when kernel or buffers is null.

ArgumentException

Thrown when kernel is not a Metal kernel or dimensions are invalid.

ObjectDisposedException

Thrown when the accelerator has been disposed.

ExportTelemetryAsync(CancellationToken)

Exports metrics to configured monitoring systems (if telemetry is enabled).

public Task ExportTelemetryAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Returns

Task

GeneratePerformanceReport()

Generates a performance report for this accelerator.

public string GeneratePerformanceReport()

Returns

string

GetHealthSnapshotAsync(CancellationToken)

Gets a comprehensive health snapshot of the Metal device.

public override ValueTask<DeviceHealthSnapshot> GetHealthSnapshotAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<DeviceHealthSnapshot>

A task containing the device health snapshot.

Remarks

This method integrates with the existing Metal telemetry and health monitoring system. Unlike CUDA, which can query detailed hardware metrics via NVML, Metal provides limited hardware introspection capabilities.

Available metrics:

  • Overall health status from circuit breakers and error tracking
  • Memory pressure levels (system-wide, not GPU-specific)
  • Component health (memory, device, kernel, compiler)
  • Error counts and consecutive failures

Limitations on Apple Silicon:

  • No detailed power consumption metrics
  • No clock frequency reporting
  • No PCIe metrics (unified memory architecture)
  • Limited thermal sensors (OS-level only)
  • No per-GPU memory metrics (unified memory)

Performance: Typically takes less than 1 ms because it queries in-memory telemetry data rather than hardware sensors.

GetPerformanceMetrics()

Gets performance metrics for this accelerator.

public Dictionary<string, PerformanceMetrics> GetPerformanceMetrics()

Returns

Dictionary<string, PerformanceMetrics>

GetProfilingMetricsAsync(CancellationToken)

Gets current profiling metrics from the Metal device.

public override ValueTask<IReadOnlyList<ProfilingMetric>> GetProfilingMetricsAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<IReadOnlyList<ProfilingMetric>>

A task containing the collection of profiling metrics.

GetProfilingSnapshotAsync(CancellationToken)

Gets a comprehensive profiling snapshot of Metal accelerator performance.

public override ValueTask<ProfilingSnapshot> GetProfilingSnapshotAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<ProfilingSnapshot>

A task containing the profiling snapshot.

GetSensorReadingsAsync(CancellationToken)

Gets current sensor readings from the Metal device.

public override ValueTask<IReadOnlyList<SensorReading>> GetSensorReadingsAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

ValueTask<IReadOnlyList<SensorReading>>

A task containing the collection of sensor readings.

Remarks

Metal provides fewer hardware sensors than CUDA due to Apple's platform restrictions. Available sensors focus on system-level metrics and component health rather than low-level hardware metrics.

GetTelemetryReport()

Gets a comprehensive production telemetry report (if telemetry is enabled).

public MetalProductionReport? GetTelemetryReport()

Returns

MetalProductionReport

GetTelemetrySnapshot()

Gets the current telemetry snapshot (if telemetry is enabled).

public MetalTelemetrySnapshot? GetTelemetrySnapshot()

Returns

MetalTelemetrySnapshot

ResetAsync(ResetOptions?, CancellationToken)

Resets the Metal device to a clean state.

public override ValueTask<ResetResult> ResetAsync(ResetOptions? options = null, CancellationToken cancellationToken = default)

Parameters

options ResetOptions
cancellationToken CancellationToken

Returns

ValueTask<ResetResult>

Remarks

Metal reset operations:

  • Soft: Wait for command buffer completion (synchronization)
  • Context: Clear command buffer pool, clear kernel cache
  • Hard: Release memory allocations, clear all caches, reset pools
  • Full: Complete device reset with full reinitialization

Note: Metal does not have a direct device reset API like CUDA's cudaDeviceReset(). Reset is achieved through resource cleanup, pool clearing, and cache management.
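
As a reading aid, the four levels can be modeled as a lookup from level to the cleanup steps listed above. The ResetLevel enum and StepsFor function are hypothetical names for illustration only; the members of the actual ResetOptions type are not shown on this page and may differ.

```csharp
using System;

// Print the documented steps for each level.
foreach (ResetLevel level in Enum.GetValues<ResetLevel>())
{
    Console.WriteLine($"{level}: {string.Join(", ", StepsFor(level))}");
}

// Maps each level to the steps listed in the remarks above.
static string[] StepsFor(ResetLevel level) => level switch
{
    ResetLevel.Soft => new[] { "wait for command buffer completion" },
    ResetLevel.Context => new[] { "clear command buffer pool", "clear kernel cache" },
    ResetLevel.Hard => new[] { "release memory allocations", "clear all caches", "reset pools" },
    ResetLevel.Full => new[] { "complete device reset", "full reinitialization" },
    _ => throw new ArgumentOutOfRangeException(nameof(level)),
};

// Hypothetical enum mirroring the documented level names.
enum ResetLevel { Soft, Context, Hard, Full }
```

The sketch only restates the documented mapping; it makes no claim about whether a higher level also performs the steps of the lower ones.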

ResetPerformanceMetrics()

Resets performance metrics for this accelerator.

public void ResetPerformanceMetrics()

SynchronizeCoreAsync(CancellationToken)

Implements the Metal-specific synchronization logic required by BaseAccelerator.

protected override ValueTask SynchronizeCoreAsync(CancellationToken cancellationToken)

Parameters

cancellationToken CancellationToken

Returns

ValueTask