Class RingKernelLaunchOptions
- Namespace
- DotCompute.Abstractions.RingKernels
- Assembly
- DotCompute.Abstractions.dll
Configuration options for launching a ring kernel.
public sealed class RingKernelLaunchOptions
- Inheritance
- object
- RingKernelLaunchOptions
Remarks
Ring kernels are persistent GPU kernels that process messages from input queues and produce results in output queues. This class configures queue sizing, deduplication, backpressure, message ordering, and stream priority for a ring kernel launch.
Default Values
- QueueCapacity: 4096 messages (optimized for high throughput)
- DeduplicationWindowSize: 1024 messages (maximum validated size)
- BackpressureStrategy: Block (wait for space)
- EnablePriorityQueue: false (FIFO ordering)
Fields
DefaultDeduplicationWindowSize
Default deduplication window size (1024 messages).
public const int DefaultDeduplicationWindowSize = 1024
Field Value
- int
Remarks
1024 is the maximum validated size that balances:
- Memory usage: ~32KB deduplication cache per queue
- Coverage: Detects duplicates within last 1024 messages
- Performance: O(1) lookup via hash table
DefaultQueueCapacity
Default queue capacity for ring kernel message queues (4096 messages).
public const int DefaultQueueCapacity = 4096
Field Value
- int
Remarks
4096 provides a good balance between memory usage and throughput:
- Memory per queue: ~128KB for IRingKernelMessage types
- Supports 2M+ messages/s throughput with 100-500ns latency
- Power-of-2 for optimal modulo operations
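The memory figure and the power-of-2 remark above can be checked with a little arithmetic. The following is a minimal sketch, not the DotCompute implementation: `RingIndexSketch` and all of its members are hypothetical names, and the 32-byte slot size is an assumption taken from the "~32 bytes × capacity" estimate given for QueueCapacity.

```csharp
using System;

// Illustrative sketch only; not the DotCompute implementation.
public static class RingIndexSketch
{
    // Assumed per-message slot size (from the "~32 bytes x capacity" estimate).
    public const int AssumedBytesPerSlot = 32;

    // True when capacity is a positive power of two.
    public static bool IsPowerOfTwo(int capacity) =>
        capacity > 0 && (capacity & (capacity - 1)) == 0;

    // Maps a non-negative, monotonically increasing sequence number to a slot.
    // For power-of-two capacities the mask gives the same result as
    // sequence % capacity, without a division.
    public static int Slot(long sequence, int capacity)
    {
        if (!IsPowerOfTwo(capacity))
            throw new ArgumentOutOfRangeException(nameof(capacity));
        return (int)(sequence & (capacity - 1));
    }

    // Estimated queue footprint in bytes under the assumed slot size.
    public static long EstimatedBytes(int capacity) =>
        (long)capacity * AssumedBytesPerSlot;
}
```

Under that assumption, `EstimatedBytes(4096)` is 131072 bytes, which matches the ~128KB quoted above.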
Properties
BackpressureStrategy
Gets or sets the backpressure strategy when queues are full.
public BackpressureStrategy BackpressureStrategy { get; set; }
Property Value
- BackpressureStrategy
The backpressure strategy. Default is Block.
Remarks
Strategy Comparison
| Strategy | Behavior |
|---|---|
| Block | Wait for space (best for guaranteed delivery) |
| Reject | Return false immediately (best for latency-sensitive) |
| DropOldest | Overwrite oldest message (best for real-time streams) |
| DropNew | Discard new message (best for preserving historical data) |
Production Recommendation: Use Block for Orleans.GpuBridge to ensure actor requests are not lost during GPU computation.
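To make the four strategies concrete, here is a minimal host-side sketch of a bounded queue that applies each one when full. It is an illustration under stated assumptions, not the DotCompute queue: `SketchBackpressure` and `BoundedQueueSketch<T>` are hypothetical stand-ins, and how the real DropNew strategy reports the discard to the caller is not stated here, so the sketch simply returns false.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Stand-in enum for illustration; member names mirror the table above.
public enum SketchBackpressure { Block, Reject, DropOldest, DropNew }

// Illustrative bounded queue; not the DotCompute implementation.
public sealed class BoundedQueueSketch<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private readonly object _gate = new object();
    private readonly int _capacity;
    private readonly SketchBackpressure _strategy;

    public BoundedQueueSketch(int capacity, SketchBackpressure strategy)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
        _strategy = strategy;
    }

    // Returns true if the item was stored, false if it was refused or discarded.
    public bool TryEnqueue(T item)
    {
        lock (_gate)
        {
            if (_items.Count >= _capacity)
            {
                switch (_strategy)
                {
                    case SketchBackpressure.Block:
                        // Wait until a consumer frees a slot.
                        while (_items.Count >= _capacity) Monitor.Wait(_gate);
                        break;
                    case SketchBackpressure.Reject:
                        return false; // caller is told immediately
                    case SketchBackpressure.DropOldest:
                        _items.Dequeue(); // make room by discarding the oldest item
                        break;
                    case SketchBackpressure.DropNew:
                        return false; // existing items are preserved (assumed return value)
                }
            }
            _items.Enqueue(item);
            return true;
        }
    }

    // Removes the oldest item; throws InvalidOperationException when empty.
    public T Dequeue()
    {
        lock (_gate)
        {
            T item = _items.Dequeue();
            Monitor.PulseAll(_gate); // wake producers blocked in TryEnqueue
            return item;
        }
    }

    public int Count { get { lock (_gate) { return _items.Count; } } }
}
```

With a capacity of 2 and DropOldest, enqueueing 1, 2, 3 leaves 2 and 3 in the queue; with Reject or DropNew the third call returns false and 1 and 2 remain.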
DeduplicationWindowSize
Gets or sets the number of recent messages to check for duplicates.
public int DeduplicationWindowSize { get; set; }
Property Value
- int
The deduplication window size in messages. Default is 1024. Valid range: 16-1024 (enforced by MessageQueueOptions.Validate()).
Remarks
Deduplication Behavior
- Messages with duplicate MessageId within window are rejected
- Implemented via circular buffer hash table (O(1) lookup)
- Window size affects memory: ~32 bytes × window size per queue
Sizing Trade-offs
- Smaller window (16-256): Lower memory and faster, but duplicates older than the window may pass undetected
- Larger window (512-1024): Higher memory, better duplicate detection
Note: The deduplication window size is clamped to QueueCapacity when QueueCapacity is smaller than the configured window. For high-capacity queues (>1024), deduplication covers at most the most recent 1024 messages.
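The window behavior and the clamping rule can be sketched as follows. This is a simplified host-side model, not the DotCompute implementation: `DedupWindowSketch` is a hypothetical name, `Guid` is an assumed type for MessageId, and a `HashSet` plus a circular array stands in for the circular buffer hash table mentioned above.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sliding-window deduplicator; not the DotCompute implementation.
public sealed class DedupWindowSketch
{
    private readonly Guid[] _ring;                              // recent IDs in arrival order
    private readonly HashSet<Guid> _seen = new HashSet<Guid>(); // O(1) membership test
    private long _count;                                        // total IDs accepted so far

    public DedupWindowSketch(int windowSize, int queueCapacity)
    {
        // Clamp the window to the queue capacity, as the note above describes.
        int effective = Math.Min(windowSize, queueCapacity);
        if (effective <= 0) throw new ArgumentOutOfRangeException(nameof(windowSize));
        _ring = new Guid[effective];
    }

    public int WindowSize => _ring.Length;

    // Returns false when messageId was already accepted within the window.
    public bool TryAccept(Guid messageId)
    {
        if (_seen.Contains(messageId)) return false;

        int slot = (int)(_count % _ring.Length);
        if (_count >= _ring.Length)
            _seen.Remove(_ring[slot]); // evict the oldest ID in the window

        _ring[slot] = messageId;
        _seen.Add(messageId);
        _count++;
        return true;
    }
}
```

With a window of 2, an ID is rejected while it is among the last two accepted IDs and accepted again once two newer IDs have pushed it out, which is the "duplicates may pass" case for small windows.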
EnablePriorityQueue
Gets or sets whether to use priority-based message ordering.
public bool EnablePriorityQueue { get; set; }
Property Value
- bool
true to dequeue messages in priority order; false for FIFO ordering. Default is false.
Remarks
Priority Queue Behavior
- Messages dequeued in priority order (0 = highest, 255 = lowest)
- Same-priority messages dequeued in FIFO order
- Performance overhead: roughly 10-20% compared with FIFO ordering
Use Cases
- Enable: Critical actor requests need priority over batch operations
- Disable: Uniform priority, maximize throughput
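The ordering rules above (lower number first, FIFO within a priority) can be illustrated on the host with the .NET `PriorityQueue<TElement, TPriority>` type. This is a sketch of the ordering only, not the DotCompute queue: `PriorityOrderSketch<T>` is a hypothetical name, and the sequence counter is an assumed way to break ties, since `PriorityQueue` by itself does not guarantee FIFO order for equal priorities.

```csharp
using System;
using System.Collections.Generic;

// Illustrative priority ordering with FIFO tie-breaking; not the DotCompute implementation.
// Requires .NET 6 or later for PriorityQueue<TElement, TPriority>.
public sealed class PriorityOrderSketch<T>
{
    // The composite key compares by priority first, then by arrival sequence.
    private readonly PriorityQueue<T, (byte Priority, long Sequence)> _queue =
        new PriorityQueue<T, (byte Priority, long Sequence)>();
    private long _nextSequence;

    // priority: 0 = highest, 255 = lowest.
    public void Enqueue(T item, byte priority) =>
        _queue.Enqueue(item, (priority, _nextSequence++));

    // Removes the item with the lowest (priority, sequence) key.
    public T Dequeue() => _queue.Dequeue();

    public int Count => _queue.Count;
}
```

Enqueueing "batch-1" (200), "critical-1" (0), "batch-2" (200), "critical-2" (0) and then draining the queue yields critical-1, critical-2, batch-1, batch-2.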
QueueCapacity
Gets or sets the maximum number of messages each queue can hold.
public int QueueCapacity { get; set; }
Property Value
- int
The queue capacity in messages. Default is 4096. Must be a power of 2 between 16 and 1048576 (16, 32, 64, ..., 1048576); Validate() throws for any other value.
Remarks
Sizing Guidelines
- Low latency (sub-microsecond): 256-1024
- Balanced (production default): 4096
- High throughput (batch processing): 16384-65536
Larger queues consume more memory but provide better burst handling. Memory usage: ~32 bytes × capacity for IRingKernelMessage types.
StreamPriority
Gets or sets the CUDA stream priority for ring kernel execution.
public RingKernelStreamPriority StreamPriority { get; set; }
Property Value
- RingKernelStreamPriority
The stream priority level. Default is Normal.
Remarks
Stream Priority Behavior
- High: GPU scheduler prioritizes this kernel for low-latency responses (use for critical operations)
- Normal: Default priority for typical workloads
- Low: Deprioritized for background processing that can tolerate higher latency
Use Cases
- High: Actor request processing, real-time data streams, latency-sensitive operations
- Normal: General purpose computation, balanced workloads
- Low: Batch processing, background analytics, non-critical tasks
Note: Stream priority affects GPU scheduling but does not guarantee execution order. Higher priority streams get preferential access to GPU resources when multiple streams compete.
Methods
HighThroughputDefaults()
Creates a new instance optimized for high-throughput batch processing.
public static RingKernelLaunchOptions HighThroughputDefaults()
Returns
- RingKernelLaunchOptions
A new RingKernelLaunchOptions with high-throughput defaults.
Remarks
High-Throughput Defaults
- QueueCapacity: 16384 (large burst buffer)
- DeduplicationWindowSize: 1024 (maximum window)
- BackpressureStrategy: Block (no loss)
- EnablePriorityQueue: false (maximize throughput)
Use for batch data processing where high memory usage is acceptable for throughput gains.
LowLatencyDefaults()
Creates a new instance optimized for low-latency scenarios (sub-microsecond).
public static RingKernelLaunchOptions LowLatencyDefaults()
Returns
- RingKernelLaunchOptions
A new RingKernelLaunchOptions with low-latency defaults.
Remarks
Low-Latency Defaults
- QueueCapacity: 256 (minimal memory footprint)
- DeduplicationWindowSize: 256 (proportional to capacity)
- BackpressureStrategy: Reject (fail-fast)
- EnablePriorityQueue: false (FIFO is fastest)
Use for latency-critical applications where a full queue can be handled with a temporary backoff on the caller's side.
ProductionDefaults()
Creates a new instance with default values optimized for Orleans.GpuBridge production use.
public static RingKernelLaunchOptions ProductionDefaults()
Returns
- RingKernelLaunchOptions
A new RingKernelLaunchOptions with production defaults.
Remarks
Production Defaults
- QueueCapacity: 4096 (handles burst traffic, 2M+ msg/s)
- DeduplicationWindowSize: 1024 (covers recent messages)
- BackpressureStrategy: Block (no message loss)
- EnablePriorityQueue: false (maximize throughput)
These defaults are validated for:
- 100-500ns latency targets
- 2M+ messages/s throughput
- Sub-10ms startup times
- RTX 2000 Ada GPU (CC 8.9)
ToMessageQueueOptions()
Creates a MessageQueueOptions instance from these launch options.
public MessageQueueOptions ToMessageQueueOptions()
Returns
- MessageQueueOptions
A new MessageQueueOptions with values from this instance.
Remarks
This method is used internally by ring kernel runtimes to create message queues with the configured options. It ensures consistent translation from launch options to queue options.
Validate()
Validates the launch options and throws if any values are invalid.
public void Validate()
Remarks
This method performs comprehensive validation before kernel launch:
- Queue Capacity: 16 ≤ capacity ≤ 1M, power-of-2
- Deduplication Window: 16 ≤ window ≤ 1024
- Consistency: Window ≤ capacity (auto-clamped)
Note: DeduplicationWindowSize is automatically clamped to QueueCapacity if QueueCapacity < DeduplicationWindowSize. This ensures smaller queues have proportional deduplication windows.
Exceptions
- ArgumentOutOfRangeException
Thrown if:
- QueueCapacity is less than 16 or greater than 1048576 (1M)
- QueueCapacity is not a power of 2
- DeduplicationWindowSize is less than 16 or greater than 1024
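The documented rules can be restated as a small standalone check. This is a sketch derived only from the limits listed above, not the DotCompute source: `LaunchOptionsValidationSketch`, `ValidateAndClamp`, and the constant names are hypothetical, and the order in which the real Validate() evaluates its checks is an assumption.

```csharp
using System;

// Illustrative restatement of the documented rules; not the DotCompute source.
public static class LaunchOptionsValidationSketch
{
    public const int MinQueueCapacity = 16;
    public const int MaxQueueCapacity = 1_048_576; // 1M
    public const int MinDedupWindow = 16;
    public const int MaxDedupWindow = 1024;

    // Throws for out-of-range values; otherwise returns the effective
    // deduplication window after clamping it to the queue capacity.
    public static int ValidateAndClamp(int queueCapacity, int dedupWindowSize)
    {
        if (queueCapacity < MinQueueCapacity || queueCapacity > MaxQueueCapacity)
            throw new ArgumentOutOfRangeException(nameof(queueCapacity), queueCapacity,
                "Queue capacity must be between 16 and 1048576.");

        if ((queueCapacity & (queueCapacity - 1)) != 0)
            throw new ArgumentOutOfRangeException(nameof(queueCapacity), queueCapacity,
                "Queue capacity must be a power of 2.");

        if (dedupWindowSize < MinDedupWindow || dedupWindowSize > MaxDedupWindow)
            throw new ArgumentOutOfRangeException(nameof(dedupWindowSize), dedupWindowSize,
                "Deduplication window must be between 16 and 1024.");

        // Consistency rule: window <= capacity (auto-clamped).
        return Math.Min(dedupWindowSize, queueCapacity);
    }
}
```

For example, a capacity of 256 with a window of 1024 passes validation with an effective window of 256, while a capacity of 4000 is rejected because it is not a power of 2.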