Class OptimizedSimdExecutor

Namespace
DotCompute.Backends.CPU.Kernels
Assembly
DotCompute.Backends.CPU.dll

Optimized SIMD kernel executor with advanced performance techniques:

  • Instruction-level parallelism with multiple execution units
  • Loop unrolling with optimal stride patterns
  • Branch prediction optimization
  • Cache-friendly memory access patterns
  • Prefetch instructions for improved memory bandwidth
  • Vectorized operations with fallback paths
  • Runtime CPU feature detection and optimization

Target: 4-8x performance improvement over scalar code.

public sealed class OptimizedSimdExecutor : IDisposable
Inheritance
object
OptimizedSimdExecutor
Implements
IDisposable
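The runtime CPU feature detection named in the class summary can be illustrated with a short sketch. It is not taken from DotCompute: `SimdPathProbe`, `SimdPath`, and `Detect` are hypothetical names, and the probe order is an assumption. The sketch relies only on the `IsSupported` properties in `System.Runtime.Intrinsics` and on `System.Numerics.Vector`.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper (not part of DotCompute): reports which SIMD path
// a dispatcher could pick on the machine the code is running on.
public static class SimdPathProbe
{
    public enum SimdPath { Scalar, Sse2, Avx2, Neon }

    // Probes from the widest to the narrowest instruction set and
    // falls back to scalar code when nothing is available.
    public static SimdPath Detect()
    {
        if (Avx2.IsSupported) return SimdPath.Avx2;
        if (Sse2.IsSupported) return SimdPath.Sse2;
        if (AdvSimd.IsSupported) return SimdPath.Neon;
        return SimdPath.Scalar;
    }

    // Number of float lanes in the portable Vector<float> type.
    public static int FloatLanes => Vector<float>.Count;
}
```

Calling `SimdPathProbe.Detect()` once at startup and caching the result keeps the check out of hot loops; the JIT treats the `IsSupported` properties as constants, so the untaken branches cost nothing at run time.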

Constructors

OptimizedSimdExecutor(ILogger<OptimizedSimdExecutor>, ExecutorConfiguration?)

Initializes a new optimized SIMD executor.

public OptimizedSimdExecutor(ILogger<OptimizedSimdExecutor> logger, ExecutorConfiguration? config = null)

Parameters

logger ILogger<OptimizedSimdExecutor>

Logger for diagnostics.

config ExecutorConfiguration?

Optional executor configuration. Defaults to null.

Properties

Statistics

Gets executor performance statistics.

public ExecutorStatistics Statistics { get; }

Property Value

ExecutorStatistics

Methods

AnalyzeWorkload<T>(long)

Analyzes workload characteristics for optimization planning.

public WorkloadProfile AnalyzeWorkload<T>(long elementCount) where T : unmanaged

Parameters

elementCount long

Number of elements.

Returns

WorkloadProfile

Workload analysis profile.

Type Parameters

T

Element type.

Dispose()

Releases the resources held by this executor.

public void Dispose()

EstimateCacheEfficiency(long, Type)

Estimates cache efficiency for a given workload.

public double EstimateCacheEfficiency(long elementCount, Type elementType)

Parameters

elementCount long

Number of elements.

elementType Type

Type of elements.

Returns

double

Cache efficiency estimate (0.0 to 1.0).
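The source does not say how this estimate is computed. As an illustration only, one crude way to produce a value in the same 0.0 to 1.0 range is to compare the working-set size with a cache budget. Everything in the sketch below is an assumption: the `CacheSketch` and `WorkingSetFit` names, the 32 KiB default budget, and the formula itself.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical heuristic (not DotCompute's algorithm): how much of the
// working set fits into an assumed cache budget, clamped to [0, 1].
public static class CacheSketch
{
    public static double WorkingSetFit(long elementCount, Type elementType, long cacheBytes = 32 * 1024)
    {
        if (elementType is null) throw new ArgumentNullException(nameof(elementType));
        if (cacheBytes <= 0) throw new ArgumentOutOfRangeException(nameof(cacheBytes));
        if (elementCount <= 0) return 1.0; // nothing to load, nothing to miss

        // Marshal.SizeOf returns the unmanaged size of the type in bytes
        // (for example 4 for float and 8 for double).
        long elementSize = Marshal.SizeOf(elementType);
        double workingSetBytes = (double)elementCount * elementSize;

        // 1.0 when the data fits entirely, smaller as it outgrows the budget.
        return Math.Min(1.0, cacheBytes / workingSetBytes);
    }
}
```

Under these assumptions, 1,024 floats (4 KiB) score 1.0, while one million doubles (8 MB) against the 32 KiB budget score about 0.004.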

EstimateVectorizationPotential<T>(long)

Estimates vectorization potential for a given workload.

public double EstimateVectorizationPotential<T>(long elementCount) where T : unmanaged

Parameters

elementCount long

Number of elements.

Returns

double

Vectorization potential (0.0 to 1.0).

Type Parameters

T

Element type.
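The source does not document the formula behind this value. One plausible reading, shown here purely as a sketch, is the share of elements that fall into full vector-width blocks, with the remainder left to a scalar tail loop. `VectorizationSketch` and `FullVectorCoverage` are hypothetical names, and `Vector<T>.IsSupported` requires .NET 7 or later.

```csharp
using System;
using System.Numerics;

// Hypothetical heuristic (not DotCompute's algorithm): fraction of the
// elements that can be processed in full Vector<T> blocks.
public static class VectorizationSketch
{
    public static double FullVectorCoverage<T>(long elementCount) where T : unmanaged
    {
        // No SIMD hardware, an unsupported element type, or an empty
        // workload all mean that nothing can be vectorized.
        if (elementCount <= 0 || !Vector.IsHardwareAccelerated || !Vector<T>.IsSupported)
            return 0.0;

        int width = Vector<T>.Count;                      // lanes per vector
        long vectorized = elementCount - (elementCount % width);
        return (double)vectorized / elementCount;
    }
}
```

With this definition the value approaches 1.0 for large inputs and drops to 0.0 when the input is shorter than a single vector.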

ExecuteReduction<T>(ReadOnlySpan<T>, ReductionOperation)

Executes a reduction operation with optimized SIMD reduction patterns.

public T ExecuteReduction<T>(ReadOnlySpan<T> input, ReductionOperation operation) where T : unmanaged

Parameters

input ReadOnlySpan<T>

Input data.

operation ReductionOperation

Reduction operation.

Returns

T

Reduced result.

Type Parameters

T

Element type.
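To give a feel for what a SIMD reduction pattern looks like, here is a minimal sum over `float` values written against `System.Numerics.Vector<T>`. It is a sketch, not the library's implementation: `ReductionSketch` is a hypothetical name, only summation is covered, and `Vector.Sum` assumes .NET 6 or later. Two independent accumulators stand in for the instruction-level parallelism mentioned in the class summary.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch (not DotCompute code): sum reduction with two
// vector accumulators and a scalar loop for the remaining elements.
public static class ReductionSketch
{
    public static float Sum(ReadOnlySpan<float> input)
    {
        int width = Vector<float>.Count;
        int i = 0;
        float total = 0f;

        if (Vector.IsHardwareAccelerated && input.Length >= 2 * width)
        {
            // Two accumulators have no dependency on each other, so the
            // CPU can keep both additions in flight at the same time.
            var acc0 = Vector<float>.Zero;
            var acc1 = Vector<float>.Zero;
            int lastBlock = input.Length - 2 * width;

            for (; i <= lastBlock; i += 2 * width)
            {
                acc0 += new Vector<float>(input.Slice(i, width));
                acc1 += new Vector<float>(input.Slice(i + width, width));
            }

            // Horizontal add: collapse all lanes into a single scalar.
            total = Vector.Sum(acc0 + acc1);
        }

        // Scalar tail (and full fallback when SIMD is unavailable).
        for (; i < input.Length; i++)
            total += input[i];

        return total;
    }
}
```

Because floating-point addition is not associative, a vectorized sum can differ from a sequential one in the last bits; callers that need bit-exact results should take that into account.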

Execute<T>(KernelDefinition, ReadOnlySpan<T>, ReadOnlySpan<T>, Span<T>, long)

Executes a vectorized kernel with optimal SIMD utilization.

public void Execute<T>(KernelDefinition definition, ReadOnlySpan<T> input1, ReadOnlySpan<T> input2, Span<T> output, long elementCount) where T : unmanaged

Parameters

definition KernelDefinition

Kernel definition.

input1 ReadOnlySpan<T>

First input buffer.

input2 ReadOnlySpan<T>

Second input buffer.

output Span<T>

Output buffer.

elementCount long

Number of elements to process.

Type Parameters

T

Element type (must be unmanaged).
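The two-input, one-output shape of this method matches an element-wise binary kernel. The sketch below shows that shape for addition on `float` buffers, with a vectorized main loop and a scalar fallback for the tail. It is an assumption-laden illustration, not DotCompute code: `BinaryKernelSketch` and `Add` are hypothetical, the `KernelDefinition` parameter is left out because its members are not described here, and `elementCount` is an `int` because spans are indexed by `int`.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch (not DotCompute code): element-wise addition of two
// buffers using Vector<float>, with a scalar loop for leftover elements.
public static class BinaryKernelSketch
{
    public static void Add(ReadOnlySpan<float> input1, ReadOnlySpan<float> input2,
                           Span<float> output, int elementCount)
    {
        if (elementCount < 0 || elementCount > input1.Length ||
            elementCount > input2.Length || elementCount > output.Length)
        {
            throw new ArgumentOutOfRangeException(nameof(elementCount));
        }

        int i = 0;

        if (Vector.IsHardwareAccelerated)
        {
            int width = Vector<float>.Count;

            // Main loop: one full vector of each input per iteration.
            for (; i <= elementCount - width; i += width)
            {
                var a = new Vector<float>(input1.Slice(i, width));
                var b = new Vector<float>(input2.Slice(i, width));
                (a + b).CopyTo(output.Slice(i, width));
            }
        }

        // Fallback path: remaining elements, or everything without SIMD.
        for (; i < elementCount; i++)
            output[i] = input1[i] + input2[i];
    }
}
```

Validating `elementCount` against all three buffers up front keeps the inner loops free of bounds surprises and makes a mismatch fail early with a clear exception.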

GetOperationMetrics(string)

Gets detailed performance metrics for a specific operation type.

public PerformanceMetricSnapshot? GetOperationMetrics(string operationType)

Parameters

operationType string

Type of operation to query.

Returns

PerformanceMetricSnapshot?

Performance metrics snapshot or null if not found.

GetPerformanceTrends()

Gets performance trends and optimization recommendations.

public PerformanceTrendAnalysis GetPerformanceTrends()

Returns

PerformanceTrendAnalysis

Performance trend analysis.

ResetStatistics()

Resets all performance counters and metrics.

public void ResetStatistics()