Enum OptimizationType

Namespace
DotCompute.Abstractions.Pipelines.Enums
Assembly
DotCompute.Abstractions.dll

Specifies the types of optimizations that can be applied to pipelines. This is a flags enum, so multiple optimization types can be combined with the bitwise OR operator.

[Flags]
public enum OptimizationType
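As a minimal sketch of how a flags enum like this is typically combined and queried, the example below declares a local stand-in that mirrors a few of the values listed on this page, so it compiles without referencing DotCompute.Abstractions.dll. The `FlagsDemo` class and its helper methods are hypothetical names used only for illustration.

```csharp
using System;

// Local stand-in mirroring a few values listed on this page (assumption:
// declared here only so the sample is self-contained).
[Flags]
public enum OptimizationType
{
    None = 0,
    KernelFusion = 1,
    MemoryAccess = 2,
    LoopOptimization = 4,
    Vectorization = 32768
}

public static class FlagsDemo
{
    // Combine individual optimizations with bitwise OR.
    public static OptimizationType Build() =>
        OptimizationType.KernelFusion | OptimizationType.Vectorization;

    // A flag is present when masking with it yields the flag itself.
    public static bool Has(OptimizationType set, OptimizationType flag) =>
        (set & flag) == flag;

    public static void Main()
    {
        OptimizationType selected = Build();
        Console.WriteLine(selected);                                     // KernelFusion, Vectorization
        Console.WriteLine((int)selected);                                // 32769
        Console.WriteLine(Has(selected, OptimizationType.KernelFusion)); // True
        Console.WriteLine(Has(selected, OptimizationType.MemoryAccess)); // False
    }
}
```

The built-in `Enum.HasFlag` method performs the same check as `Has`; the explicit mask is shown here to make the bit arithmetic visible.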

Fields

Aggressive = KernelFusion | LoopOptimization | Inlining | MathOptimization | StageReordering

Aggressive optimizations that may require careful validation. Includes optimizations that could potentially affect correctness.
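Because Aggressive is a combination of individual flags, it can also serve as a mask. The following sketch, which assumes a local stand-in enum with values copied from this page, strips every Aggressive member from a requested set, for instance when a result cannot be validated. `AggressiveMaskDemo` and `WithoutAggressive` are hypothetical names.

```csharp
using System;

// Local stand-in mirroring values listed on this page (assumption:
// declared here only so the sample is self-contained).
[Flags]
public enum OptimizationType
{
    None = 0,
    KernelFusion = 1,
    LoopOptimization = 4,
    DeadCodeElimination = 128,
    ConstantFolding = 256,
    Inlining = 512,
    MathOptimization = 1024,
    StageReordering = 4096,
    Aggressive = KernelFusion | LoopOptimization | Inlining | MathOptimization | StageReordering
}

public static class AggressiveMaskDemo
{
    // Clear every bit that belongs to Aggressive and keep the rest.
    public static OptimizationType WithoutAggressive(OptimizationType set) =>
        set & ~OptimizationType.Aggressive;

    public static void Main()
    {
        OptimizationType requested =
            OptimizationType.KernelFusion | OptimizationType.Inlining |
            OptimizationType.ConstantFolding | OptimizationType.DeadCodeElimination;

        OptimizationType reduced = WithoutAggressive(requested);
        Console.WriteLine((int)OptimizationType.Aggressive); // 5637
        Console.WriteLine((int)reduced);                     // 384
        Console.WriteLine(reduced);                          // DeadCodeElimination, ConstantFolding
    }
}
```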

BackendSpecific = 16

Backend-specific optimizations for target compute devices. Leverages specific features of CUDA, CPU SIMD, or Metal backends.

BranchOptimization = 8192

Branch prediction and conditional optimization. Improves performance for code with conditional branches.

CacheOptimization = 16384

Cache optimization for better data locality. Includes cache-aware tiling and prefetching strategies.

Comprehensive = MemoryAccess | DataLayout | Aggressive | BranchOptimization | Conservative | PerformanceFocused | ParallelMerging

Comprehensive optimization applying all applicable optimizations. As listed, the combination sets every individual flag, including the members of Aggressive, so the validation caveats noted for Aggressive apply here as well.
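To check what the Comprehensive combination covers, the sketch below reproduces the values and combinations listed on this page in a local stand-in enum and compares Comprehensive against the union of all single-bit members. `ComprehensiveDemo` and `AllSingleFlags` are hypothetical names, and the stand-in is an assumption that holds only as long as it matches the shipped enum.

```csharp
using System;

// Local stand-in mirroring every value listed on this page (assumption:
// declared here only so the sample is self-contained).
[Flags]
public enum OptimizationType
{
    None = 0,
    KernelFusion = 1,
    MemoryAccess = 2,
    LoopOptimization = 4,
    Parallelization = 8,
    BackendSpecific = 16,
    DataLayout = 32,
    InstructionScheduling = 64,
    DeadCodeElimination = 128,
    ConstantFolding = 256,
    Inlining = 512,
    MathOptimization = 1024,
    MemoryPooling = 2048,
    StageReordering = 4096,
    BranchOptimization = 8192,
    CacheOptimization = 16384,
    Vectorization = 32768,
    ParallelMerging = 65536,
    Aggressive = KernelFusion | LoopOptimization | Inlining | MathOptimization | StageReordering,
    SizeFocused = DeadCodeElimination | ConstantFolding | MemoryPooling,
    Conservative = SizeFocused | CacheOptimization,
    PerformanceFocused = Parallelization | BackendSpecific | InstructionScheduling | Vectorization,
    MemoryFocused = MemoryAccess | DataLayout | MemoryPooling | CacheOptimization,
    Comprehensive = MemoryAccess | DataLayout | Aggressive | BranchOptimization
                  | Conservative | PerformanceFocused | ParallelMerging
}

public static class ComprehensiveDemo
{
    // OR together every member whose value is a single bit (a power of two).
    public static OptimizationType AllSingleFlags()
    {
        OptimizationType all = OptimizationType.None;
        foreach (OptimizationType value in Enum.GetValues(typeof(OptimizationType)))
        {
            int bits = (int)value;
            bool isSingleBit = bits != 0 && (bits & (bits - 1)) == 0;
            if (isSingleBit)
            {
                all |= value;
            }
        }
        return all;
    }

    public static void Main()
    {
        Console.WriteLine((int)OptimizationType.Comprehensive);                // 131071
        Console.WriteLine(AllSingleFlags() == OptimizationType.Comprehensive); // True
    }
}
```

With the values as listed, Comprehensive equals the union of all 17 individual flags.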

Conservative = SizeFocused | CacheOptimization

Conservative optimizations that are safe and unlikely to cause issues. Includes only optimizations with minimal risk of correctness problems.

ConstantFolding = 256

Constant folding and propagation optimizations. Evaluates constant expressions at compile time.

DataLayout = 32

Data layout optimization for better memory access patterns. Includes transformations between array-of-structures and structure-of-arrays layouts.
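The layout transformation named above can be illustrated by hand. The following is a plain C# sketch of converting an array-of-structures into a structure-of-arrays; it is not DotCompute's implementation, and `Particle`, `ParticlesSoA`, and `LayoutDemo` are hypothetical types introduced for the example.

```csharp
using System;

// Array-of-structures element: the fields of one particle sit together.
public struct Particle
{
    public float X;
    public float Y;
    public float Mass;
}

// Structure-of-arrays: each field is stored in its own contiguous array.
public sealed class ParticlesSoA
{
    public float[] X = Array.Empty<float>();
    public float[] Y = Array.Empty<float>();
    public float[] Mass = Array.Empty<float>();
}

public static class LayoutDemo
{
    // Copy each field of the AoS input into its own array.
    public static ParticlesSoA ToSoA(Particle[] aos)
    {
        var soa = new ParticlesSoA
        {
            X = new float[aos.Length],
            Y = new float[aos.Length],
            Mass = new float[aos.Length]
        };
        for (int i = 0; i < aos.Length; i++)
        {
            soa.X[i] = aos[i].X;
            soa.Y[i] = aos[i].Y;
            soa.Mass[i] = aos[i].Mass;
        }
        return soa;
    }

    // Reads only the Mass array, so X and Y are never pulled into cache.
    public static float TotalMass(ParticlesSoA particles)
    {
        float total = 0f;
        foreach (float m in particles.Mass)
        {
            total += m;
        }
        return total;
    }

    public static void Main()
    {
        var aos = new[]
        {
            new Particle { X = 0f, Y = 0f, Mass = 1f },
            new Particle { X = 1f, Y = 2f, Mass = 2f },
            new Particle { X = 3f, Y = 4f, Mass = 3f }
        };
        Console.WriteLine(TotalMass(ToSoA(aos))); // 6
    }
}
```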

DeadCodeElimination = 128

Dead code elimination and unused variable removal. Reduces memory usage and improves cache efficiency.

Inlining = 512

Function inlining for reduced call overhead. Eliminates function call costs for small, frequently used functions.

InstructionScheduling = 64

Instruction scheduling and register allocation optimizations. Improves instruction throughput and reduces register pressure.

KernelFusion = 1

Kernel fusion to combine multiple kernels into a single execution unit. Reduces memory transfers and improves cache locality.
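The idea behind kernel fusion can be shown with two element-wise passes on the CPU. This is only an illustrative sketch in plain C#, not how DotCompute fuses kernels; `FusionDemo` and its methods are hypothetical. The unfused version writes an intermediate buffer that the second pass reads back, while the fused version computes both steps per element in one pass.

```csharp
using System;
using System.Linq;

public static class FusionDemo
{
    // Two separate passes: scale, then offset, with an intermediate buffer.
    public static float[] Unfused(float[] input)
    {
        var scaled = new float[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            scaled[i] = input[i] * 2f;
        }

        var result = new float[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            result[i] = scaled[i] + 1f;
        }
        return result;
    }

    // One fused pass: no intermediate buffer and a single traversal.
    public static float[] Fused(float[] input)
    {
        var result = new float[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            result[i] = input[i] * 2f + 1f;
        }
        return result;
    }

    public static void Main()
    {
        float[] data = { 1f, 2f, 3f };
        Console.WriteLine(string.Join(", ", Fused(data)));        // 3, 5, 7
        Console.WriteLine(Unfused(data).SequenceEqual(Fused(data))); // True
    }
}
```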

LoopOptimization = 4

Loop optimizations including unrolling, vectorization, and blocking. Improves instruction-level parallelism and memory locality.

MathOptimization = 1024

Mathematical expression simplification and strength reduction. Replaces expensive operations with cheaper equivalents.
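Strength reduction is the classic case of replacing an expensive operation with a cheaper equivalent. The sketch below shows two textbook integer rewrites written out by hand; it makes no claim about which rewrites DotCompute actually performs, and `StrengthReductionDemo` is a hypothetical name.

```csharp
using System;

public static class StrengthReductionDemo
{
    // Original form: multiply by a power of two.
    public static int TimesEight(int x) => x * 8;

    // Reduced form: a left shift by 3 gives the same result.
    public static int TimesEightReduced(int x) => x << 3;

    // Original form: remainder by a power of two.
    public static int Mod16(int x) => x % 16;

    // Reduced form: a bit mask. Equivalent only for non-negative x,
    // because % keeps the sign of the dividend in C#.
    public static int Mod16Reduced(int x) => x & 15;

    public static void Main()
    {
        Console.WriteLine(TimesEight(7));        // 56
        Console.WriteLine(TimesEightReduced(7)); // 56
        Console.WriteLine(Mod16(37));            // 5
        Console.WriteLine(Mod16Reduced(37));     // 5
    }
}
```

The non-negativity condition on the mask rewrite is an example of why such transformations need validation: a rewrite that is valid on one input range can change results on another.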

MemoryAccess = 2

Memory access pattern optimization for better cache performance. Includes data layout transformations and memory coalescing.

MemoryFocused = MemoryAccess | DataLayout | MemoryPooling | CacheOptimization

Memory-focused optimizations for reducing memory usage and improving access patterns.
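A preset such as MemoryFocused can be expanded back into its individual flags, which is useful for logging what a preset enables. The sketch below assumes a local stand-in enum with values copied from this page; `PresetDemo` and `Expand` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in mirroring values listed on this page (assumption:
// declared here only so the sample is self-contained).
[Flags]
public enum OptimizationType
{
    None = 0,
    KernelFusion = 1,
    MemoryAccess = 2,
    DataLayout = 32,
    MemoryPooling = 2048,
    CacheOptimization = 16384,
    MemoryFocused = MemoryAccess | DataLayout | MemoryPooling | CacheOptimization
}

public static class PresetDemo
{
    // Walk the bits from lowest to highest and collect those that are set.
    public static List<OptimizationType> Expand(OptimizationType preset)
    {
        var flags = new List<OptimizationType>();
        for (int bit = 0; bit < 31; bit++)
        {
            var flag = (OptimizationType)(1 << bit);
            if ((preset & flag) != 0)
            {
                flags.Add(flag);
            }
        }
        return flags;
    }

    public static void Main()
    {
        Console.WriteLine((int)OptimizationType.MemoryFocused); // 18466
        Console.WriteLine(string.Join(", ", Expand(OptimizationType.MemoryFocused)));
        // MemoryAccess, DataLayout, MemoryPooling, CacheOptimization
    }
}
```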

MemoryPooling = 2048

Memory pooling and allocation optimization. Reduces allocation overhead and memory fragmentation.

None = 0

No optimizations applied.
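A zero-valued member needs care when testing flags: `Enum.HasFlag` returns true for a zero flag regardless of the value it is called on, so None should be detected with an equality comparison. The sketch below uses a small local stand-in enum; `NoneDemo` and `IsNone` are hypothetical names.

```csharp
using System;

// Local stand-in mirroring values listed on this page (assumption:
// declared here only so the sample is self-contained).
[Flags]
public enum OptimizationType
{
    None = 0,
    KernelFusion = 1,
    MemoryAccess = 2
}

public static class NoneDemo
{
    // Compare against None directly instead of using HasFlag.
    public static bool IsNone(OptimizationType set) =>
        set == OptimizationType.None;

    public static void Main()
    {
        OptimizationType set = OptimizationType.KernelFusion;

        Console.WriteLine(set.HasFlag(OptimizationType.None)); // True (misleading)
        Console.WriteLine(IsNone(set));                        // False
        Console.WriteLine(IsNone(OptimizationType.None));      // True
    }
}
```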

ParallelMerging = 65536

Parallel merging optimization for combining parallel execution stages. Merges independent parallel operations to improve resource utilization.

Parallelization = 8

Parallelization and concurrency optimizations. Identifies opportunities for parallel execution and load balancing.

PerformanceFocused = Parallelization | BackendSpecific | InstructionScheduling | Vectorization

Performance-focused optimizations for maximum execution speed.

SizeFocused = DeadCodeElimination | ConstantFolding | MemoryPooling

Size-focused optimizations for reducing memory footprint.

StageReordering = 4096

Pipeline stage reordering for better resource utilization. Minimizes idle time and maximizes throughput.

Vectorization = 32768

Vectorization for SIMD instruction utilization. Leverages vector processing units for data-parallel operations.
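For a sense of what SIMD utilization looks like in .NET code, the sketch below adds two arrays with `System.Numerics.Vector<T>`, processing `Vector<float>.Count` elements per step and finishing the remainder with a scalar loop. It is a hand-written illustration, not the code this optimization generates; `VectorizationDemo` and `Add` are hypothetical names.

```csharp
using System;
using System.Numerics;

public static class VectorizationDemo
{
    // Element-wise addition using hardware vectors where possible.
    public static float[] Add(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Arrays must have the same length.");
        }

        var result = new float[a.Length];
        int width = Vector<float>.Count; // lanes per vector on this machine
        int i = 0;

        // Vector loop: one add instruction covers 'width' elements.
        for (; i <= a.Length - width; i += width)
        {
            var va = new Vector<float>(a, i);
            var vb = new Vector<float>(b, i);
            (va + vb).CopyTo(result, i);
        }

        // Scalar tail for elements that do not fill a whole vector.
        for (; i < a.Length; i++)
        {
            result[i] = a[i] + b[i];
        }
        return result;
    }

    public static void Main()
    {
        float[] a = { 1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f, 9f };
        float[] b = { 10f, 20f, 30f, 40f, 50f, 60f, 70f, 80f, 90f };
        Console.WriteLine(string.Join(", ", Add(a, b)));
        // 11, 22, 33, 44, 55, 66, 77, 88, 99
    }
}
```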