Class CudaTensorCoreManagerProduction

Namespace
DotCompute.Backends.CUDA.Advanced
Assembly
DotCompute.Backends.CUDA.dll

Production-grade CUDA Tensor Core manager with WMMA operations, mixed precision support, and performance profiling.

public sealed class CudaTensorCoreManagerProduction : IDisposable
Inheritance
object
CudaTensorCoreManagerProduction
Implements
IDisposable

Constructors

CudaTensorCoreManagerProduction(CudaContext, CudaDeviceManager, ILogger<CudaTensorCoreManagerProduction>)

Initializes a new instance of the CudaTensorCoreManagerProduction class.

public CudaTensorCoreManagerProduction(CudaContext context, CudaDeviceManager deviceManager, ILogger<CudaTensorCoreManagerProduction> logger)

Parameters

context CudaContext

The CUDA context.

deviceManager CudaDeviceManager

The CUDA device manager.

logger ILogger<CudaTensorCoreManagerProduction>

The logger instance.

Properties

Capabilities

Gets the Tensor Core capabilities.

public TensorCoreCapabilities Capabilities { get; }

Property Value

TensorCoreCapabilities

Statistics

Gets performance statistics.

public TensorCoreStatistics Statistics { get; }

Property Value

TensorCoreStatistics

TensorCoresAvailable

Gets a value indicating whether Tensor Cores are available.

public bool TensorCoresAvailable { get; }

Property Value

bool

Methods

ConvolutionAsync(nint, nint, nint, ConvolutionParams, DataType, nint, CancellationToken)

Performs a convolution using Tensor Cores.

public Task<TensorCoreResult> ConvolutionAsync(nint input, nint filter, nint output, ConvolutionParams parameters, DataType dataType, nint stream, CancellationToken cancellationToken = default)

Parameters

input nint
filter nint
output nint
parameters ConvolutionParams
dataType DataType
stream nint
cancellationToken CancellationToken

Returns

Task<TensorCoreResult>

Dispose()

Releases the resources held by this instance.

public void Dispose()

MatrixMultiplyAsync(nint, nint, nint, int, int, int, DataType, DataType, nint, MatrixLayout, MatrixLayout, float, float, CancellationToken)

Performs mixed-precision matrix multiplication using Tensor Cores.

public Task<TensorCoreResult> MatrixMultiplyAsync(nint a, nint b, nint c, int m, int n, int k, DataType inputType, DataType outputType, nint stream, MatrixLayout layoutA = MatrixLayout.RowMajor, MatrixLayout layoutB = MatrixLayout.RowMajor, float alpha = 1, float beta = 0, CancellationToken cancellationToken = default)

Parameters

a nint
b nint
c nint
m int
n int
k int
inputType DataType
outputType DataType
stream nint
layoutA MatrixLayout
layoutB MatrixLayout
alpha float
beta float
cancellationToken CancellationToken

Returns

Task<TensorCoreResult>
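
The parameter list above does not describe how m, n, k, alpha, and beta interact. Assuming the method follows the conventional GEMM form C = alpha * (A x B) + beta * C, with A of size m x k, B of size k x n, and C of size m x n, a CPU reference for checking device results might look like the sketch below. GemmReference is a hypothetical helper, not part of DotCompute, and the row-major storage it uses is an assumption that matches only the default MatrixLayout.RowMajor arguments.

```csharp
using System;

// Hypothetical CPU reference, not part of DotCompute.
// Assumes the conventional GEMM form C = alpha * (A x B) + beta * C,
// with A (m x k), B (k x n) and C (m x n) all stored row-major.
public static class GemmReference
{
    public static void Multiply(
        float[] a, float[] b, float[] c,
        int m, int n, int k,
        float alpha = 1f, float beta = 0f)
    {
        if (a.Length < m * k) throw new ArgumentException("A must hold m * k elements.", nameof(a));
        if (b.Length < k * n) throw new ArgumentException("B must hold k * n elements.", nameof(b));
        if (c.Length < m * n) throw new ArgumentException("C must hold m * n elements.", nameof(c));

        for (int row = 0; row < m; row++)
        {
            for (int col = 0; col < n; col++)
            {
                // Dot product of one row of A with one column of B.
                float sum = 0f;
                for (int i = 0; i < k; i++)
                {
                    sum += a[row * k + i] * b[i * n + col];
                }

                int idx = row * n + col;
                // When beta is zero the previous contents of C are ignored,
                // so uninitialized or NaN values in C do not leak into the result.
                c[idx] = beta == 0f ? alpha * sum : alpha * sum + beta * c[idx];
            }
        }
    }
}
```

With the defaults alpha = 1 and beta = 0 this reduces to a plain product C = A x B. When comparing against a half-precision Tensor Core result, a relative tolerance is likely more appropriate than exact equality, since reduced-precision inputs lose some accuracy.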