Table of Contents

Struct TaskQueue

Namespace
DotCompute.Backends.CUDA.RingKernels
Assembly
DotCompute.Backends.CUDA.dll

Lock-free task queue with work-stealing support for dynamic load balancing.

public struct TaskQueue : IEquatable<TaskQueue>
Implements
Inherited Members

Remarks

Implements the Chase-Lev work-stealing deque algorithm for efficient task distribution. Owner kernel operates on head (push/pop), thief kernels steal from tail.

Memory Layout (32 bytes, 8-byte aligned): - Head: 8 bytes (atomic, owner-side push/pop) - Tail: 8 bytes (atomic, thief-side steal) - Capacity: 4 bytes (power of 2) - TasksPtr: 8 bytes (device pointer to task array) - OwnerId: 4 bytes (owner kernel ID) - Flags: 4 bytes (queue state flags)

Work-Stealing Protocol: 1. Idle kernel selects random victim 2. Reads victim's tail and head atomically 3. Calculates queue size (head - tail) 4. Steals up to half of victim's tasks 5. Atomically increments victim's tail 6. Copies stolen tasks to own queue 7. On race condition: returns stolen slots and retries

Performance Characteristics: - Owner operations (push/pop): O(1), ~50ns - Steal operations: O(k) where k = stolen count, ~500ns for 100 tasks - Steal success rate: 70-90% under typical load - Throughput: 10M+ operations/sec on modern GPUs

Fields

Capacity

Queue capacity in tasks (must be power of 2).

public int Capacity

Field Value

int

Remarks

Valid range: 16-65536 tasks. Power of 2 enables efficient modulo via bitwise AND. Recommended: 256-4096 tasks for typical workloads.

FlagActive

Flag: Queue is active and accepting tasks.

public const uint FlagActive = 1

Field Value

uint

FlagFull

Flag: Queue is full (head - tail >= capacity).

public const uint FlagFull = 4

Field Value

uint

FlagStealingEnabled

Flag: Work-stealing is enabled for this queue.

public const uint FlagStealingEnabled = 2

Field Value

uint

Flags

Queue state flags.

public uint Flags

Field Value

uint

Remarks

  • Bit 0: Active flag (queue accepting tasks)
  • Bit 1: Stealing enabled flag
  • Bit 2: Full flag (queue at capacity)
  • Bits 3-31: Reserved

Head

Atomic head pointer (owner-side push/pop index).

public long Head

Field Value

long

Remarks

Incremented on push, decremented on pop. Modified only by owner kernel. 64-bit to prevent overflow (wraps at long.MaxValue).

OwnerId

Owner kernel ID (16-bit hash).

public uint OwnerId

Field Value

uint

Remarks

Identifies kernel that owns this queue. Used for affinity hints and debugging.

Tail

Atomic tail pointer (thief-side steal index).

public long Tail

Field Value

long

Remarks

Incremented by thieves during steal operations. 64-bit to prevent overflow and ABA problems.

TasksPtr

Device pointer to task descriptor array.

public long TasksPtr

Field Value

long

Remarks

Array size: Capacity * sizeof(TaskDescriptor) bytes. Must be allocated in GPU memory before queue use.

Properties

Size

Gets the current queue size (number of tasks).

public readonly long Size { get; }

Property Value

long

Queue size (0 to Capacity).

Methods

Create(uint, int, long, bool)

Creates a task queue configured for specified owner and capacity.

public static TaskQueue Create(uint ownerId, int capacity, long tasksPtr, bool enableStealing = true)

Parameters

ownerId uint

Owner kernel ID (16-bit hash).

capacity int

Queue capacity (must be power of 2, 16-65536).

tasksPtr long

Device pointer to pre-allocated task array.

enableStealing bool

Enable work-stealing for this queue.

Returns

TaskQueue

Initialized queue ready for GPU use.

Exceptions

ArgumentOutOfRangeException

Thrown if capacity is not power of 2 or out of range.

ArgumentException

Thrown if tasksPtr is zero.

CreateEmpty()

Creates an uninitialized task queue (all fields zero).

public static TaskQueue CreateEmpty()

Returns

TaskQueue

Empty queue suitable for GPU allocation.

Equals(TaskQueue)

Indicates whether the current object is equal to another object of the same type.

public readonly bool Equals(TaskQueue other)

Parameters

other TaskQueue

An object to compare with this object.

Returns

bool

true if the current object is equal to the other parameter; otherwise, false.

Equals(object?)

Indicates whether this instance and a specified object are equal.

public override readonly bool Equals(object? obj)

Parameters

obj object

The object to compare with the current instance.

Returns

bool

true if obj and this instance are the same type and represent the same value; otherwise, false.

GetHashCode()

Returns the hash code for this instance.

public override readonly int GetHashCode()

Returns

int

A 32-bit signed integer that is the hash code for this instance.

IsEmpty()

Checks if queue is empty.

public readonly bool IsEmpty()

Returns

bool

True if queue is empty (head == tail).

IsFull()

Checks if queue is full.

public readonly bool IsFull()

Returns

bool

True if queue size >= capacity.

Validate()

Validates the task queue structure for correctness.

public readonly bool Validate()

Returns

bool

True if valid, false if any invariant is violated.

Remarks

Checks:

  • Capacity is power of 2 in valid range
  • TasksPtr is non-zero (if capacity > 0)
  • Head >= Tail (invariant of Chase-Lev deque)

Operators

operator ==(TaskQueue, TaskQueue)

Equality operator.

public static bool operator ==(TaskQueue left, TaskQueue right)

Parameters

left TaskQueue
right TaskQueue

Returns

bool

operator !=(TaskQueue, TaskQueue)

Inequality operator.

public static bool operator !=(TaskQueue left, TaskQueue right)

Parameters

left TaskQueue
right TaskQueue

Returns

bool