Class OptimizedSimdExecutor
- Namespace
- DotCompute.Backends.CPU.Kernels
- Assembly
- DotCompute.Backends.CPU.dll
Optimized SIMD kernel executor with advanced performance techniques:
- Instruction-level parallelism with multiple execution units
- Loop unrolling with optimal stride patterns
- Branch prediction optimization
- Cache-friendly memory access patterns
- Prefetch instructions for improved memory bandwidth
- Vectorized operations with fallback paths
- Runtime CPU feature detection and optimization
Target: 4-8x performance improvement over scalar code
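As an illustration of what runtime CPU feature detection can look like in .NET, the following minimal sketch queries the standard `System.Numerics` and `System.Runtime.Intrinsics` flags. The class name `SimdFeatureProbe` is hypothetical, and nothing here is taken from the executor's own implementation, which may detect features differently.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper, not part of DotCompute: reports what the runtime exposes.
public static class SimdFeatureProbe
{
    // Builds a one-line summary of the SIMD support reported by the runtime.
    public static string Describe()
    {
        return $"HardwareAccelerated={Vector.IsHardwareAccelerated}, " +
               $"FloatLanes={Vector<float>.Count}, " +
               $"Sse2={Sse2.IsSupported}, Avx2={Avx2.IsSupported}, " +
               $"AdvSimd={AdvSimd.IsSupported}";
    }
}
```

Calling `Console.WriteLine(SimdFeatureProbe.Describe());` prints the flags for the current machine. The values vary by CPU and runtime.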
public sealed class OptimizedSimdExecutor : IDisposable
- Inheritance
- object
- OptimizedSimdExecutor
- Implements
- IDisposable
Constructors
OptimizedSimdExecutor(ILogger<OptimizedSimdExecutor>, ExecutorConfiguration?)
Initializes a new optimized SIMD executor.
public OptimizedSimdExecutor(ILogger<OptimizedSimdExecutor> logger, ExecutorConfiguration? config = null)
Parameters
- logger ILogger<OptimizedSimdExecutor>
Logger for diagnostics.
- config ExecutorConfiguration?
Executor configuration. Optional; defaults to null.
Properties
Statistics
Gets executor performance statistics.
public ExecutorStatistics Statistics { get; }
Property Value
- ExecutorStatistics
Methods
AnalyzeWorkload<T>(long)
Analyzes workload characteristics for optimization planning.
public WorkloadProfile AnalyzeWorkload<T>(long elementCount) where T : unmanaged
Parameters
- elementCount long
Number of elements.
Returns
- WorkloadProfile
Workload analysis profile.
Type Parameters
- T
Element type.
Dispose()
Releases the resources used by the executor.
public void Dispose()
EstimateCacheEfficiency(long, Type)
Estimates cache efficiency for a given workload.
public double EstimateCacheEfficiency(long elementCount, Type elementType)
Parameters
- elementCount long
Number of elements.
- elementType Type
Element type.
Returns
- double
Cache efficiency estimate (0.0 to 1.0).
EstimateVectorizationPotential<T>(long)
Estimates vectorization potential for a given workload.
public double EstimateVectorizationPotential<T>(long elementCount) where T : unmanaged
Parameters
- elementCount long
Number of elements.
Returns
- double
Vectorization potential (0.0 to 1.0).
Type Parameters
- T
Element type.
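The documentation does not say how the 0.0 to 1.0 score is computed. To give a feel for what such a number can mean, here is a hypothetical heuristic: the fraction of elements that fit into full SIMD vectors, leaving the remainder for a scalar tail. The class `VectorizationSketch`, the method `CoveredFraction`, and the `laneCount` parameter are assumptions made for illustration, not the library's formula.

```csharp
using System;

// Hypothetical heuristic, not DotCompute's actual estimate.
public static class VectorizationSketch
{
    // Fraction of elements that full vectors of laneCount lanes can cover.
    public static double CoveredFraction(long elementCount, int laneCount)
    {
        if (laneCount <= 0) throw new ArgumentOutOfRangeException(nameof(laneCount));
        if (elementCount <= 0) return 0.0;

        // Elements left over after the last full vector go to a scalar tail.
        long vectorized = elementCount - (elementCount % laneCount);
        return (double)vectorized / elementCount;
    }
}
```

With 10 elements and 4 lanes, 8 elements are covered by full vectors, so the sketch returns 0.8. A real estimator would likely also weigh the element type and fixed setup cost.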
ExecuteReduction<T>(ReadOnlySpan<T>, ReductionOperation)
Executes a reduction operation with optimized SIMD reduction patterns.
public T ExecuteReduction<T>(ReadOnlySpan<T> input, ReductionOperation operation) where T : unmanaged
Parameters
- input ReadOnlySpan<T>
Input data.
- operation ReductionOperation
Reduction operation.
Returns
- T
Reduced result.
Type Parameters
- T
Element type.
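To show what a SIMD reduction pattern can look like, the sketch below sums a span of `float` values with `System.Numerics.Vector<T>`: it accumulates per lane, combines the lanes once at the end, and finishes the remainder with a scalar loop. It is a generic illustration under the assumption of a sum reduction; `ReductionSketch` is a hypothetical name, and the executor's internal implementation and the members of `ReductionOperation` are not documented here.

```csharp
using System;
using System.Numerics;

// Hypothetical illustration of a SIMD sum; not DotCompute's implementation.
public static class ReductionSketch
{
    public static float Sum(ReadOnlySpan<float> input)
    {
        int width = Vector<float>.Count;
        int i = 0;
        float total = 0f;

        if (Vector.IsHardwareAccelerated && input.Length >= width)
        {
            // Keep one partial sum per lane while walking full vectors.
            var acc = Vector<float>.Zero;
            for (; i <= input.Length - width; i += width)
            {
                acc += new Vector<float>(input.Slice(i, width));
            }

            // Horizontal step: fold the per-lane partial sums into one value.
            for (int lane = 0; lane < width; lane++)
            {
                total += acc[lane];
            }
        }

        // Scalar tail (or the whole input when SIMD is unavailable).
        for (; i < input.Length; i++)
        {
            total += input[i];
        }

        return total;
    }
}
```

Because the vector path adds elements in a different order than a plain loop, floating-point results can differ slightly from a scalar sum for inputs that are not exactly representable.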
Execute<T>(KernelDefinition, ReadOnlySpan<T>, ReadOnlySpan<T>, Span<T>, long)
Executes a vectorized kernel with optimal SIMD utilization.
public void Execute<T>(KernelDefinition definition, ReadOnlySpan<T> input1, ReadOnlySpan<T> input2, Span<T> output, long elementCount) where T : unmanaged
Parameters
- definition KernelDefinition
Kernel definition.
- input1 ReadOnlySpan<T>
First input buffer.
- input2 ReadOnlySpan<T>
Second input buffer.
- output Span<T>
Output buffer.
- elementCount long
Number of elements to process.
Type Parameters
- T
Element type (must be unmanaged).
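The two-inputs, one-output shape of this method matches a typical element-wise kernel. As a rough sketch of a vectorized path with a scalar fallback, the following adds two `float` buffers using `System.Numerics.Vector<T>`. The class `ElementwiseSketch` is hypothetical, the choice of addition is an assumption, and the count is an `int` here because spans are indexed by `int`; how the executor maps a `KernelDefinition` to an operation is not shown in this reference.

```csharp
using System;
using System.Numerics;

// Hypothetical element-wise add; not DotCompute's kernel dispatch.
public static class ElementwiseSketch
{
    public static void Add(ReadOnlySpan<float> input1, ReadOnlySpan<float> input2,
                           Span<float> output, int elementCount)
    {
        if (elementCount < 0 || elementCount > input1.Length ||
            elementCount > input2.Length || elementCount > output.Length)
        {
            throw new ArgumentOutOfRangeException(nameof(elementCount));
        }

        int i = 0;
        if (Vector.IsHardwareAccelerated)
        {
            int width = Vector<float>.Count;

            // Main loop: process one full vector of lanes per iteration.
            for (; i <= elementCount - width; i += width)
            {
                var a = new Vector<float>(input1.Slice(i, width));
                var b = new Vector<float>(input2.Slice(i, width));
                (a + b).CopyTo(output.Slice(i, width));
            }
        }

        // Fallback path: remaining elements, or everything without SIMD support.
        for (; i < elementCount; i++)
        {
            output[i] = input1[i] + input2[i];
        }
    }
}
```

Validating `elementCount` against all three buffers up front keeps both the vector loop and the scalar tail free of per-iteration bounds surprises.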
GetOperationMetrics(string)
Gets detailed performance metrics for a specific operation type.
public PerformanceMetricSnapshot? GetOperationMetrics(string operationType)
Parameters
- operationType string
Type of operation to query.
Returns
- PerformanceMetricSnapshot?
Performance metrics snapshot or null if not found.
GetPerformanceTrends()
Gets performance trends and optimization recommendations.
public PerformanceTrendAnalysis GetPerformanceTrends()
Returns
- PerformanceTrendAnalysis
Performance trend analysis.
ResetStatistics()
Resets all performance counters and metrics.
public void ResetStatistics()