Class HardwareSimdKernelExecutor

Namespace
DotCompute.Backends.CPU.Kernels
Assembly
DotCompute.Backends.CPU.dll

High-performance SIMD kernel executor with hardware-specific optimizations.

public sealed class HardwareSimdKernelExecutor
Inheritance
HardwareSimdKernelExecutor

Constructors

HardwareSimdKernelExecutor(SimdSummary)

Initializes a new instance of the HardwareSimdKernelExecutor class.

public HardwareSimdKernelExecutor(SimdSummary simdCapabilities)

Parameters

simdCapabilities SimdSummary

The SIMD capabilities that the executor optimizes for.

Exceptions

ArgumentNullException

simdCapabilities is null.

Properties

MaxVectorElements

Gets the maximum number of elements that can be processed in a single vector operation.

public int MaxVectorElements { get; }

Property Value

int
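
The page does not say which element type MaxVectorElements counts or how the value is derived. As a rough illustration only, the sketch below uses the standard System.Numerics API to show how an elements-per-vector figure is typically obtained and how it splits a workload into full-vector iterations plus a scalar remainder. The class and method names (VectorWidthSketch, ElementsPerVector, Partition) are hypothetical and not part of DotCompute.

```csharp
using System;
using System.Numerics;

static class VectorWidthSketch
{
    // Number of elements of type T that fit in one hardware vector,
    // as reported by System.Numerics on the current machine.
    public static int ElementsPerVector<T>() where T : struct => Vector<T>.Count;

    // Splits elementCount into the number of full-vector iterations
    // and the leftover elements that need scalar handling.
    public static (int FullVectors, int Remainder) Partition(int elementCount, int elementsPerVector)
        => (elementCount / elementsPerVector, elementCount % elementsPerVector);
}
```

For example, with 10 elements and 4 elements per vector, Partition yields 2 full vectors and a remainder of 2.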

OptimalWorkGroupSize

Gets the optimal work group size for vectorized operations.

public int OptimalWorkGroupSize { get; }

Property Value

int

Methods

Execute(Span<byte>, Span<byte>, Span<byte>, int, int)

Executes a vectorized kernel with optimal SIMD instructions.

public void Execute(Span<byte> input1, Span<byte> input2, Span<byte> output, int elementCount, int vectorWidth)

Parameters

input1 Span<byte>
input2 Span<byte>
output Span<byte>
elementCount int
vectorWidth int

ExecuteFma(Span<byte>, Span<byte>, Span<byte>, Span<byte>, int)

Executes a fused multiply-add operation: output = (input1 * input2) + input3.

public void ExecuteFma(Span<byte> input1, Span<byte> input2, Span<byte> input3, Span<byte> output, int elementCount)

Parameters

input1 Span<byte>
input2 Span<byte>
input3 Span<byte>
output Span<byte>
elementCount int
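
To make the formula output = (input1 * input2) + input3 concrete, here is a minimal, self-contained sketch of the same computation over byte spans. It is not the DotCompute implementation. It assumes the spans hold tightly packed 32-bit floats and that elementCount is a number of floats, neither of which this page states. It also performs a vector multiply followed by a vector add, which rounds twice, rather than a single-rounding fused instruction. FmaSketch and FusedMultiplyAdd are hypothetical names.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

static class FmaSketch
{
    // Assumption: every span holds at least elementCount packed 32-bit floats.
    public static void FusedMultiplyAdd(
        ReadOnlySpan<byte> input1, ReadOnlySpan<byte> input2,
        ReadOnlySpan<byte> input3, Span<byte> output, int elementCount)
    {
        // Reinterpret the raw bytes as floats without copying.
        ReadOnlySpan<float> a = MemoryMarshal.Cast<byte, float>(input1).Slice(0, elementCount);
        ReadOnlySpan<float> b = MemoryMarshal.Cast<byte, float>(input2).Slice(0, elementCount);
        ReadOnlySpan<float> c = MemoryMarshal.Cast<byte, float>(input3).Slice(0, elementCount);
        Span<float> r = MemoryMarshal.Cast<byte, float>(output).Slice(0, elementCount);

        int width = Vector<float>.Count;
        int i = 0;

        // Vectorized main loop: one hardware vector per iteration.
        for (; i <= elementCount - width; i += width)
        {
            var va = new Vector<float>(a.Slice(i, width));
            var vb = new Vector<float>(b.Slice(i, width));
            var vc = new Vector<float>(c.Slice(i, width));
            (va * vb + vc).CopyTo(r.Slice(i, width));
        }

        // Scalar tail for elements that do not fill a whole vector.
        for (; i < elementCount; i++)
            r[i] = a[i] * b[i] + c[i];
    }
}
```

A float array can be passed as bytes with MemoryMarshal.AsBytes(array.AsSpan()), which is presumably also how callers would prepare arguments for ExecuteFma.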

ExecuteUnary(Span<byte>, Span<byte>, int, UnaryOperation)

Executes a unary vectorized operation (single input, single output).

public void ExecuteUnary(Span<byte> input, Span<byte> output, int elementCount, UnaryOperation operation)

Parameters

input Span<byte>
output Span<byte>
elementCount int
operation UnaryOperation
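
The members of UnaryOperation are not listed on this page, so the following sketch substitutes a hypothetical enum, SketchUnaryOp, to illustrate the single-input, single-output pattern. It again assumes packed 32-bit float data and uses only standard System.Numerics calls. It is an illustration of the technique, not the library's code.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the real UnaryOperation enum.
enum SketchUnaryOp { Negate, Abs, Sqrt }

static class UnarySketch
{
    // Assumption: both spans hold at least elementCount packed 32-bit floats.
    public static void Apply(ReadOnlySpan<byte> input, Span<byte> output, int elementCount, SketchUnaryOp op)
    {
        ReadOnlySpan<float> src = MemoryMarshal.Cast<byte, float>(input).Slice(0, elementCount);
        Span<float> dst = MemoryMarshal.Cast<byte, float>(output).Slice(0, elementCount);

        int width = Vector<float>.Count;
        int i = 0;

        // Vectorized main loop.
        for (; i <= elementCount - width; i += width)
        {
            var v = new Vector<float>(src.Slice(i, width));
            Vector<float> r = op switch
            {
                SketchUnaryOp.Negate => -v,
                SketchUnaryOp.Abs => Vector.Abs(v),
                SketchUnaryOp.Sqrt => Vector.SquareRoot(v),
                _ => throw new ArgumentOutOfRangeException(nameof(op)),
            };
            r.CopyTo(dst.Slice(i, width));
        }

        // Scalar tail for the remaining elements.
        for (; i < elementCount; i++)
        {
            dst[i] = op switch
            {
                SketchUnaryOp.Negate => -src[i],
                SketchUnaryOp.Abs => MathF.Abs(src[i]),
                SketchUnaryOp.Sqrt => MathF.Sqrt(src[i]),
                _ => throw new ArgumentOutOfRangeException(nameof(op)),
            };
        }
    }
}
```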

IsVectorizationBeneficial(int)

Determines if the specified element count is suitable for vectorization.

public bool IsVectorizationBeneficial(int elementCount)

Parameters

elementCount int

Returns

bool
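
The criterion behind IsVectorizationBeneficial is not documented here. A common heuristic, shown below purely as an assumption, is to vectorize only when the input fills at least a few whole vectors, since setup and tail handling can outweigh the gain on very small inputs. The names and the default threshold of two full vectors are hypothetical.

```csharp
static class VectorizationHeuristicSketch
{
    // Hypothetical rule: require at least minFullVectors whole vectors of work.
    public static bool IsWorthVectorizing(int elementCount, int elementsPerVector, int minFullVectors = 2)
    {
        // With one element per vector (or fewer) there are no SIMD lanes to exploit.
        if (elementsPerVector <= 1) return false;

        // Widen to long so the product cannot overflow int.
        return elementCount >= (long)elementsPerVector * minFullVectors;
    }
}
```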
