Class KernelCompilationOptions
- Namespace
- Orleans.GpuBridge.Abstractions.Models.Compilation
- Assembly
- Orleans.GpuBridge.Abstractions.dll
Represents compilation options for GPU kernel compilation.
public sealed record KernelCompilationOptions : IEquatable<KernelCompilationOptions>
- Inheritance
- object
- KernelCompilationOptions
- Implements
- IEquatable<KernelCompilationOptions>
Examples
// Basic optimization options
var options = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O3,
    EnableFastMath: true);

// Debug-enabled options
var debugOptions = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O0,
    EnableDebugInfo: true,
    EnableProfiling: true);

// Architecture-specific options
var targetedOptions = new KernelCompilationOptions(
    TargetArchitecture: "sm_80",
    MaxRegisterCount: 64,
    MinBlockSize: 128);
Remarks
This record provides comprehensive control over kernel compilation behavior. Different GPU backends may interpret these options differently, and not all options may be supported by every backend. Unsupported options are typically ignored with a warning.
For optimal performance, consider the following guidelines:
- Use O2 or higher for production kernels
- Enable debug info only during development
- Set MaxRegisterCount when optimizing for high occupancy
- Specify TargetArchitecture for deployment-specific optimizations
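Because the type is a record with init-only properties, a C# `with` expression can derive a development variant from a production baseline without repeating every setting. The sketch below is illustrative only: it declares a trimmed stand-in record and enum (an assumption, mirroring part of the constructor signature documented on this page) so that it compiles on its own. Real code would reference the library types instead.

```csharp
#nullable enable
using System;

// Production baseline following the guidelines above.
var production = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O3,
    TargetArchitecture: "sm_80");

// Derive a development variant; the baseline itself is left unchanged.
var development = production with
{
    OptimizationLevel = OptimizationLevel.O0,
    EnableDebugInfo = true,
};

Console.WriteLine(production.EnableDebugInfo);      // prints "False"
Console.WriteLine(development.TargetArchitecture);  // prints "sm_80"

// Records compare by value, so an unmodified copy equals the original.
Console.WriteLine(production == (production with { }));  // prints "True"

// Stand-in types (assumption): only the enum members named on this page
// and the non-dictionary parameters of the documented constructor.
public enum OptimizationLevel { O0, O2, O3 }

public sealed record KernelCompilationOptions(
    OptimizationLevel OptimizationLevel = OptimizationLevel.O2,
    bool EnableDebugInfo = false,
    bool EnableProfiling = false,
    bool EnableFastMath = true,
    int MaxRegisterCount = 0,
    int MinBlockSize = 0,
    string? TargetArchitecture = null);
```

Keeping one baseline and deriving variants from it makes it harder for development-only settings such as debug info to leak into production builds.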
Constructors
KernelCompilationOptions(OptimizationLevel, bool, bool, bool, int, int, string?, IReadOnlyDictionary<string, string>?, IReadOnlyDictionary<string, object>?, object?)
Initializes a new instance of the KernelCompilationOptions record with the specified compilation options.
public KernelCompilationOptions(
    OptimizationLevel OptimizationLevel = OptimizationLevel.O2,
    bool EnableDebugInfo = false,
    bool EnableProfiling = false,
    bool EnableFastMath = true,
    int MaxRegisterCount = 0,
    int MinBlockSize = 0,
    string? TargetArchitecture = null,
    IReadOnlyDictionary<string, string>? Defines = null,
    IReadOnlyDictionary<string, object>? CustomOptions = null,
    object? TargetDevice = null)
Parameters
- OptimizationLevel (OptimizationLevel): The optimization level to apply during compilation. Higher levels produce more optimized code but may increase compilation time. Default is O2.
- EnableDebugInfo (bool): Indicates whether to include debug information in the compiled kernel. When enabled, allows for better debugging and profiling capabilities but increases binary size. Default is false.
- EnableProfiling (bool): Indicates whether to enable profiling support in the compiled kernel. When enabled, the kernel can provide detailed execution metrics and timing information. May impact performance. Default is false.
- EnableFastMath (bool): Indicates whether to enable fast math optimizations that may sacrifice numerical precision for improved performance. Useful for applications where approximate results are acceptable. Default is true.
- MaxRegisterCount (int): The maximum number of registers the kernel is allowed to use. A value of 0 indicates no explicit limit, allowing the compiler to use as many registers as needed. Non-zero values can improve occupancy by limiting register usage per thread. Default is 0.
- MinBlockSize (int): The minimum block size (number of threads per block) to optimize for. A value of 0 indicates no minimum, allowing the compiler to choose optimal block sizes. Specifying a minimum helps optimize for specific execution patterns. Default is 0.
- TargetArchitecture (string): The target GPU architecture to compile for (e.g., "sm_75", "gfx906"). A null value indicates compilation for the default or detected architecture. Specifying an architecture enables architecture-specific optimizations. Default is null.
- Defines (IReadOnlyDictionary<string, string>): Preprocessor definitions to pass to the compiler as key-value pairs. These definitions are equivalent to #define directives and can be used to conditionally compile code sections. Keys are definition names, values are definition values (empty string for flag-only definitions). Default is null.
- CustomOptions (IReadOnlyDictionary<string, object>): Additional custom compilation options specific to the backend compiler. This dictionary allows passing backend-specific flags and options that are not covered by the standard options. Keys should be option names, values should be option values. Default is null.
- TargetDevice (object): The specific target device for compilation optimization. When specified, the compiler can optimize for device-specific features and characteristics. A null value uses default device selection. Default is null.
Properties
CustomOptions
Additional custom compilation options specific to the backend compiler.
This dictionary allows passing backend-specific flags and options that
are not covered by the standard options. Keys should be option names,
values should be option values. Default is null.
public IReadOnlyDictionary<string, object>? CustomOptions { get; init; }
Property Value
IReadOnlyDictionary<string, object>
Defines
Preprocessor definitions to pass to the compiler as key-value pairs.
These definitions are equivalent to #define directives and can be used
to conditionally compile code sections. Keys are definition names,
values are definition values (empty string for flag-only definitions).
Default is null.
public IReadOnlyDictionary<string, string>? Defines { get; init; }
Property Value
IReadOnlyDictionary<string, string>
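To illustrate the key/value convention, the sketch below builds a Defines dictionary with one valued definition and one flag-only definition, then renders each entry in the `-DNAME` or `-DNAME=VALUE` form that C-family compilers commonly accept. The `ToDefineArgs` helper and the macro names are hypothetical; this page does not describe how any particular backend consumes Defines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keys are macro names; an empty value marks a flag-only definition.
IReadOnlyDictionary<string, string> defines = new Dictionary<string, string>
{
    ["BLOCK_SIZE"] = "256",
    ["USE_SHARED_MEMORY"] = "",
};

// Hypothetical helper: renders one compiler argument per definition.
static IEnumerable<string> ToDefineArgs(IReadOnlyDictionary<string, string> source) =>
    source
        // Sort by name so the output does not depend on dictionary enumeration order.
        .OrderBy(d => d.Key, StringComparer.Ordinal)
        .Select(d => d.Value.Length == 0 ? $"-D{d.Key}" : $"-D{d.Key}={d.Value}");

Console.WriteLine(string.Join(" ", ToDefineArgs(defines)));
// prints "-DBLOCK_SIZE=256 -DUSE_SHARED_MEMORY"
```

The same dictionary would be supplied to the record through the Defines constructor parameter or init accessor.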
EnableDebugInfo
Indicates whether to include debug information in the compiled kernel.
When enabled, allows for better debugging and profiling capabilities but
increases binary size. Default is false.
public bool EnableDebugInfo { get; init; }
Property Value
bool
EnableFastMath
Indicates whether to enable fast math optimizations that may sacrifice
numerical precision for improved performance. Useful for applications
where approximate results are acceptable. Default is true.
public bool EnableFastMath { get; init; }
Property Value
bool
EnableProfiling
Indicates whether to enable profiling support in the compiled kernel.
When enabled, the kernel can provide detailed execution metrics and timing
information. May impact performance. Default is false.
public bool EnableProfiling { get; init; }
Property Value
bool
MaxRegisterCount
The maximum number of registers the kernel is allowed to use. A value of 0 indicates no explicit limit, allowing the compiler to use as many registers as needed. Non-zero values can improve occupancy by limiting register usage per thread. Default is 0.
public int MaxRegisterCount { get; init; }
Property Value
int
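To make the occupancy effect concrete, the sketch below estimates the fraction of a multiprocessor's thread slots that can be filled for a given register cap. The hardware figures (65,536 registers and 2,048 resident threads per multiprocessor) are assumptions matching NVIDIA's published limits for compute capability 8.0, not values taken from this library. Other devices differ, and the estimate ignores register allocation granularity and shared-memory limits.

```csharp
using System;

// Upper bound on occupancy imposed by register usage alone.
static double RegisterLimitedOccupancy(int registersPerThread)
{
    // Assumed hardware limits (NVIDIA compute capability 8.0).
    const int RegistersPerMultiprocessor = 65_536;
    const int MaxThreadsPerMultiprocessor = 2_048;

    if (registersPerThread <= 0)
        throw new ArgumentOutOfRangeException(nameof(registersPerThread));

    // Threads that fit in the register file, capped by the thread-slot limit.
    int residentThreads = Math.Min(
        RegistersPerMultiprocessor / registersPerThread,
        MaxThreadsPerMultiprocessor);

    return (double)residentThreads / MaxThreadsPerMultiprocessor;
}

foreach (int cap in new[] { 32, 64, 128 })
{
    double percent = RegisterLimitedOccupancy(cap) * 100;
    Console.WriteLine($"{cap} registers/thread -> {percent:F0}%");
}
// prints:
// 32 registers/thread -> 100%
// 64 registers/thread -> 50%
// 128 registers/thread -> 25%
```

Under these assumptions, a kernel that would otherwise use 128 registers per thread could at most quadruple its register-limited occupancy with a cap of 32, at the cost of possible register spilling, so a cap is worth measuring rather than assuming.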
MinBlockSize
The minimum block size (number of threads per block) to optimize for. A value of 0 indicates no minimum, allowing the compiler to choose optimal block sizes. Specifying a minimum helps optimize for specific execution patterns. Default is 0.
public int MinBlockSize { get; init; }
Property Value
int
OptimizationLevel
The optimization level to apply during compilation. Higher levels produce more optimized code but may increase compilation time. Default is O2.
public OptimizationLevel OptimizationLevel { get; init; }
Property Value
OptimizationLevel
TargetArchitecture
The target GPU architecture to compile for (e.g., "sm_75", "gfx906").
A null value indicates compilation for the default or detected architecture.
Specifying an architecture enables architecture-specific optimizations.
Default is null.
public string? TargetArchitecture { get; init; }
Property Value
string
TargetDevice
The specific target device for compilation optimization.
When specified, the compiler can optimize for device-specific features
and characteristics. A null value uses default device selection.
Default is null.
public object? TargetDevice { get; init; }