Class KernelCompilationOptions

Namespace
Orleans.GpuBridge.Abstractions.Models.Compilation
Assembly
Orleans.GpuBridge.Abstractions.dll

Represents compilation options for GPU kernel compilation.

public sealed record KernelCompilationOptions : IEquatable<KernelCompilationOptions>
Inheritance
object
KernelCompilationOptions
Implements
IEquatable<KernelCompilationOptions>
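
Because the type is declared as a record that implements IEquatable<KernelCompilationOptions>, instances presumably compare by value. The sketch below illustrates what that means in practice. It uses a local stand-in record with a subset of the documented constructor parameters (not the real type from Orleans.GpuBridge.Abstractions.dll) and assumes the compiler-generated equality is used, which this page does not confirm. The stand-in enum lists only the members named on this page.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Two instances built from the same scalar values compare equal.
var a = new KernelCompilationOptions(OptimizationLevel.O3, MaxRegisterCount: 64);
var b = new KernelCompilationOptions(OptimizationLevel.O3, MaxRegisterCount: 64);
Console.WriteLine(a == b);   // True

// Dictionary-typed members are compared by reference, so two separate
// dictionaries with identical contents make the records unequal.
var c = a with { Defines = new Dictionary<string, string> { ["N"] = "1" } };
var d = a with { Defines = new Dictionary<string, string> { ["N"] = "1" } };
Console.WriteLine(c == d);   // False

// Stand-in types for illustration only (assumption: they mirror a subset
// of the documented signature; the real types live in the library).
enum OptimizationLevel { O0, O2, O3 }

sealed record KernelCompilationOptions(
    OptimizationLevel OptimizationLevel = OptimizationLevel.O2,
    int MaxRegisterCount = 0,
    IReadOnlyDictionary<string, string>? Defines = null);
```

If options objects are used as cache keys, sharing one dictionary instance across them (or leaving Defines and CustomOptions null) keeps equality predictable under this assumption.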

Examples

// Basic optimization options
var options = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O3,
    EnableFastMath: true);

// Debug-enabled options
var debugOptions = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O0,
    EnableDebugInfo: true,
    EnableProfiling: true);

// Architecture-specific options
var targetedOptions = new KernelCompilationOptions(
    TargetArchitecture: "sm_80",
    MaxRegisterCount: 64,
    MinBlockSize: 128);

Remarks

This record controls how a GPU kernel is compiled. GPU backends may interpret these options differently, and a given backend may not support every option. Unsupported options are typically ignored with a warning.

For optimal performance, consider the following guidelines:

- Use O2 or higher for production kernels
- Enable debug info only during development
- Set MaxRegisterCount when optimizing for high occupancy
- Specify TargetArchitecture for deployment-specific optimizations
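
One way to follow these guidelines is to keep a production preset and derive a development variant from it with a `with` expression, so the two differ only where intended. This is a minimal sketch: the record and enum below are local stand-ins covering a subset of the documented parameters, and the preset names are hypothetical.

```csharp
#nullable enable
using System;

// Production preset: O2 or higher, no debug info, explicit target architecture.
var production = new KernelCompilationOptions(
    OptimizationLevel: OptimizationLevel.O3,
    TargetArchitecture: "sm_80");

// Development preset: same target, but unoptimized and with debug info.
var development = production with
{
    OptimizationLevel = OptimizationLevel.O0,
    EnableDebugInfo = true,
};

Console.WriteLine(production.EnableDebugInfo);      // False
Console.WriteLine(development.TargetArchitecture);  // sm_80

// Stand-in types for illustration only (assumption: they mirror a subset
// of the documented signature; the real types live in the library).
enum OptimizationLevel { O0, O2, O3 }

sealed record KernelCompilationOptions(
    OptimizationLevel OptimizationLevel = OptimizationLevel.O2,
    bool EnableDebugInfo = false,
    int MaxRegisterCount = 0,
    string? TargetArchitecture = null);
```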

Constructors

KernelCompilationOptions(OptimizationLevel, bool, bool, bool, int, int, string?, IReadOnlyDictionary<string, string>?, IReadOnlyDictionary<string, object>?, object?)

Initializes a new instance of the KernelCompilationOptions record with the specified compilation options.

public KernelCompilationOptions(
    OptimizationLevel OptimizationLevel = OptimizationLevel.O2,
    bool EnableDebugInfo = false,
    bool EnableProfiling = false,
    bool EnableFastMath = true,
    int MaxRegisterCount = 0,
    int MinBlockSize = 0,
    string? TargetArchitecture = null,
    IReadOnlyDictionary<string, string>? Defines = null,
    IReadOnlyDictionary<string, object>? CustomOptions = null,
    object? TargetDevice = null)

Parameters

OptimizationLevel OptimizationLevel

The optimization level to apply during compilation. Higher levels produce more optimized code but may increase compilation time. Default is O2.

EnableDebugInfo bool

Indicates whether to include debug information in the compiled kernel. Enabling it makes the kernel easier to debug and profile but increases binary size. Default is false.

EnableProfiling bool

Indicates whether to enable profiling support in the compiled kernel. When enabled, the kernel can provide detailed execution metrics and timing information, which may reduce runtime performance. Default is false.

EnableFastMath bool

Indicates whether to enable fast math optimizations that may sacrifice numerical precision for improved performance. Useful for applications where approximate results are acceptable. Default is true.

MaxRegisterCount int

The maximum number of registers the kernel is allowed to use. A value of 0 indicates no explicit limit, allowing the compiler to use as many registers as needed. Non-zero values can improve occupancy by limiting register usage per thread. Default is 0.

MinBlockSize int

The minimum block size (number of threads per block) to optimize for. A value of 0 indicates no minimum, allowing the compiler to choose optimal block sizes. Specifying a minimum helps optimize for specific execution patterns. Default is 0.

TargetArchitecture string

The target GPU architecture to compile for (e.g., "sm_75", "gfx906"). A null value indicates compilation for the default or detected architecture. Specifying an architecture enables architecture-specific optimizations. Default is null.

Defines IReadOnlyDictionary<string, string>

Preprocessor definitions to pass to the compiler as key-value pairs. These definitions are equivalent to #define directives and can be used to conditionally compile code sections. Keys are definition names, values are definition values (empty string for flag-only definitions). Default is null.

CustomOptions IReadOnlyDictionary<string, object>

Additional custom compilation options specific to the backend compiler. This dictionary allows passing backend-specific flags and options that are not covered by the standard options. Keys should be option names, values should be option values. Default is null.

TargetDevice object

The specific target device for compilation optimization. When specified, the compiler can optimize for device-specific features and characteristics. A null value uses default device selection. Default is null.

Properties

CustomOptions

Additional custom compilation options specific to the backend compiler. This dictionary allows passing backend-specific flags and options that are not covered by the standard options. Keys should be option names, values should be option values. Default is null.

public IReadOnlyDictionary<string, object>? CustomOptions { get; init; }

Property Value

IReadOnlyDictionary<string, object>
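
Since the values are typed as object, a consumer has to check both presence and type before using an entry. The following sketch shows one defensive way to read such a dictionary. The option names ("example.unrollLimit" and so on) and the GetOrDefault helper are hypothetical; this page does not list the keys any backend actually recognizes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical backend-specific options; real keys depend on the backend.
IReadOnlyDictionary<string, object>? customOptions = new Dictionary<string, object>
{
    ["example.unrollLimit"] = 8,
    ["example.verbose"] = true,
};

int unrollLimit = GetOrDefault(customOptions, "example.unrollLimit", 4);
bool verbose = GetOrDefault(customOptions, "example.verbose", false);
string mode = GetOrDefault(customOptions, "example.mode", "default");

Console.WriteLine($"{unrollLimit} {verbose} {mode}");   // 8 True default

// Returns the entry when it exists and has the expected type; otherwise the
// fallback. A null dictionary (the documented default) also yields the fallback.
static T GetOrDefault<T>(IReadOnlyDictionary<string, object>? options, string key, T fallback) =>
    options is not null && options.TryGetValue(key, out var value) && value is T typed
        ? typed
        : fallback;
```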

Defines

Preprocessor definitions to pass to the compiler as key-value pairs. These definitions are equivalent to #define directives and can be used to conditionally compile code sections. Keys are definition names, values are definition values (empty string for flag-only definitions). Default is null.

public IReadOnlyDictionary<string, string>? Defines { get; init; }

Property Value

IReadOnlyDictionary<string, string>
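
To make the #define equivalence concrete, the sketch below renders a Defines dictionary as the preprocessor lines it corresponds to, including the empty-string convention for flag-only definitions. The ToPreamble helper and the definition names are hypothetical; how a given backend actually passes definitions to its compiler is not specified on this page.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var defines = new Dictionary<string, string>
{
    ["TILE_SIZE"] = "16",
    ["USE_SHARED_MEMORY"] = "",   // empty string: flag-only definition
};

string preamble = ToPreamble(defines);
Console.Write(preamble);
// #define TILE_SIZE 16
// #define USE_SHARED_MEMORY

// Renders each entry as a #define line, sorted by name so the output is stable.
static string ToPreamble(IReadOnlyDictionary<string, string>? defines)
{
    if (defines is null || defines.Count == 0) return string.Empty;

    var lines = defines
        .OrderBy(kv => kv.Key, StringComparer.Ordinal)
        .Select(kv => kv.Value.Length == 0
            ? $"#define {kv.Key}"
            : $"#define {kv.Key} {kv.Value}");

    return string.Join("\n", lines) + "\n";
}
```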

EnableDebugInfo

Indicates whether to include debug information in the compiled kernel. Enabling it makes the kernel easier to debug and profile but increases binary size. Default is false.

public bool EnableDebugInfo { get; init; }

Property Value

bool

EnableFastMath

Indicates whether to enable fast math optimizations that may sacrifice numerical precision for improved performance. Useful for applications where approximate results are acceptable. Default is true.

public bool EnableFastMath { get; init; }

Property Value

bool

EnableProfiling

Indicates whether to enable profiling support in the compiled kernel. When enabled, the kernel can provide detailed execution metrics and timing information, which may reduce runtime performance. Default is false.

public bool EnableProfiling { get; init; }

Property Value

bool

MaxRegisterCount

The maximum number of registers the kernel is allowed to use. A value of 0 indicates no explicit limit, allowing the compiler to use as many registers as needed. Non-zero values can improve occupancy by limiting register usage per thread. Default is 0.

public int MaxRegisterCount { get; init; }

Property Value

int
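
The link between a register limit and occupancy can be seen with rough arithmetic: the fewer registers each thread uses, the more thread blocks fit into a multiprocessor's register file at once. The sketch below assumes a register file of 65,536 registers per multiprocessor and 128 threads per block, both chosen for illustration. It ignores register allocation granularity, shared memory, and hardware caps on resident threads and blocks, so treat the numbers as an upper bound rather than a prediction.

```csharp
using System;

// Assumed figures for illustration only; real values depend on the device.
const int RegistersPerMultiprocessor = 65536;
const int ThreadsPerBlock = 128;

foreach (int regsPerThread in new[] { 128, 64, 32 })
{
    int blocks = ResidentBlocks(RegistersPerMultiprocessor, regsPerThread, ThreadsPerBlock);
    Console.WriteLine(
        $"{regsPerThread} regs/thread -> {blocks} blocks, {blocks * ThreadsPerBlock} threads");
}
// 128 regs/thread -> 4 blocks, 512 threads
// 64 regs/thread -> 8 blocks, 1024 threads
// 32 regs/thread -> 16 blocks, 2048 threads

// Upper bound on resident blocks when registers are the only limiting resource.
static int ResidentBlocks(int registerFileSize, int regsPerThread, int threadsPerBlock) =>
    registerFileSize / (regsPerThread * threadsPerBlock);
```

A lower limit is not automatically better: if the kernel needs more registers than the limit allows, the compiler may spill values to slower memory, which can cost more than the extra occupancy gains.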

MinBlockSize

The minimum block size (number of threads per block) to optimize for. A value of 0 indicates no minimum, allowing the compiler to choose optimal block sizes. Specifying a minimum helps optimize for specific execution patterns. Default is 0.

public int MinBlockSize { get; init; }

Property Value

int

OptimizationLevel

The optimization level to apply during compilation. Higher levels produce more optimized code but may increase compilation time. Default is O2.

public OptimizationLevel OptimizationLevel { get; init; }

Property Value

OptimizationLevel

TargetArchitecture

The target GPU architecture to compile for (e.g., "sm_75", "gfx906"). A null value indicates compilation for the default or detected architecture. Specifying an architecture enables architecture-specific optimizations. Default is null.

public string? TargetArchitecture { get; init; }

Property Value

string

TargetDevice

The specific target device for compilation optimization. When specified, the compiler can optimize for device-specific features and characteristics. A null value uses default device selection. Default is null.

public object? TargetDevice { get; init; }

Property Value

object