Struct MultiKernelBarrier
- Namespace
- DotCompute.Backends.CUDA.RingKernels
- Assembly
- DotCompute.Backends.CUDA.dll
Multi-kernel barrier for synchronizing persistent Ring Kernels across the GPU.
public struct MultiKernelBarrier : IEquatable<MultiKernelBarrier>
- Implements
- IEquatable<MultiKernelBarrier>
Remarks
Enables coordination of multiple Ring Kernels through a generation-based barrier protocol. Kernels wait at the barrier until all participants arrive, then proceed to the next generation.
Memory Layout (16 bytes, 4-byte aligned):
- ParticipantCount: 4 bytes (number of kernels)
- ArrivedCount: 4 bytes (atomic counter)
- Generation: 4 bytes (barrier generation number)
- Flags: 4 bytes (barrier state flags)
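The layout above can be checked from managed code. The following is a minimal sketch, assuming the fields are laid out in the order listed; `BarrierLayoutMirror` and `BarrierLayoutCheck` are hypothetical names introduced for illustration and are not part of DotCompute.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mirror of the documented 16-byte layout (not the DotCompute type).
// Assumption: fields appear in the order listed in the Memory Layout remarks.
[StructLayout(LayoutKind.Sequential, Pack = 4, Size = 16)]
public struct BarrierLayoutMirror
{
    public int ParticipantCount; // expected offset 0
    public int ArrivedCount;     // expected offset 4
    public int Generation;       // expected offset 8
    public int Flags;            // expected offset 12
}

public static class BarrierLayoutCheck
{
    // Unmanaged size of the mirror struct in bytes.
    public static int Size() => Marshal.SizeOf<BarrierLayoutMirror>();

    // Unmanaged byte offset of the named field.
    public static int OffsetOf(string fieldName) =>
        (int)Marshal.OffsetOf<BarrierLayoutMirror>(fieldName);
}
```

A check of this kind is useful when a device-side struct must match the managed one byte for byte, since a mismatch in field order or padding would corrupt the counters silently.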
Protocol Phases:
1. Arrival: Each kernel atomically increments the arrived counter.
2. Wait: Kernels spin until the generation changes (all arrived).
3. Departure: The last kernel resets the counter and increments the generation.
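The three phases can be illustrated with a host-side analogy. This is a sketch only: it uses CPU threads and `Interlocked` in place of GPU kernels and `atomicAdd()`, it omits timeout and failure handling, and `HostBarrierModel` and `BarrierProtocolDemo` are hypothetical names, not DotCompute types.

```csharp
using System;
using System.Threading;

// Hypothetical host-side model of the generation-based protocol.
public sealed class HostBarrierModel
{
    private readonly int _participantCount;
    private int _arrivedCount;
    private int _generation;

    public HostBarrierModel(int participantCount) => _participantCount = participantCount;

    public int Generation => Volatile.Read(ref _generation);

    public void ArriveAndWait()
    {
        // Capture the generation before arriving, so completion is detected as a change.
        int observed = Volatile.Read(ref _generation);

        // Phase 1 (arrival): atomically increment the arrived counter.
        int arrived = Interlocked.Increment(ref _arrivedCount);

        if (arrived == _participantCount)
        {
            // Phase 3 (departure): the last arriver resets the counter first,
            // then publishes the new generation to release the waiters.
            Volatile.Write(ref _arrivedCount, 0);
            Interlocked.Increment(ref _generation);
        }
        else
        {
            // Phase 2 (wait): spin until the generation changes.
            var spinner = new SpinWait();
            while (Volatile.Read(ref _generation) == observed)
            {
                spinner.SpinOnce();
            }
        }
    }
}

public static class BarrierProtocolDemo
{
    // Runs `participants` threads through `rounds` barriers and returns the final generation.
    public static int Run(int participants, int rounds)
    {
        var barrier = new HostBarrierModel(participants);
        var threads = new Thread[participants];
        for (int i = 0; i < participants; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int r = 0; r < rounds; r++) barrier.ArriveAndWait();
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return barrier.Generation;
    }
}
```

Resetting the counter before incrementing the generation matters: a released participant may re-arrive for the next round immediately, and it must find the counter already at zero.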
Barrier Scopes:
- Thread-Block: `__syncthreads()` for all threads in block (~10ns)
- Grid-Wide: Cooperative groups for all blocks in kernel (~1-10μs)
- Multi-Kernel: This struct for all participating kernels (~10-100μs)
Fields
ArrivedCount
Atomic counter for arrived kernels (0 to ParticipantCount).
public int ArrivedCount
Field Value
- int
Remarks
Atomically incremented by each arriving kernel.
Reset to 0 by last arriving kernel.
Modified using atomicAdd() in CUDA device code.
FlagActive
Flag: Barrier is active and in use.
public const int FlagActive = 1
Field Value
- int
FlagFailed
Flag: Barrier operation failed (participant crashed or disconnected).
public const int FlagFailed = 4
Field Value
- int
FlagTimeout
Flag: Barrier operation timed out.
public const int FlagTimeout = 2
Field Value
- int
Flags
Barrier state flags.
public int Flags
Field Value
- int
Remarks
- Bit 0: Active flag (barrier is in use)
- Bit 1: Timeout flag (barrier timed out)
- Bit 2: Failed flag (barrier failed)
- Bits 3-31: Reserved for future use
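The bit assignments above can be decoded with plain bitwise tests. The sketch below redefines the three documented constant values locally so that it is self-contained; `BarrierFlagDecoder` is a hypothetical helper, not part of the library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical decoder for the documented flag bits.
public static class BarrierFlagDecoder
{
    // Values copied from the documented constants.
    public const int FlagActive = 1;   // bit 0
    public const int FlagTimeout = 2;  // bit 1
    public const int FlagFailed = 4;   // bit 2

    // Bits 3-31 are documented as reserved.
    private const int ReservedMask = ~(FlagActive | FlagTimeout | FlagFailed);

    // Returns a comma-separated list of the flags that are set, or "None".
    public static string Describe(int flags)
    {
        var names = new List<string>();
        if ((flags & FlagActive) != 0) names.Add("Active");
        if ((flags & FlagTimeout) != 0) names.Add("Timeout");
        if ((flags & FlagFailed) != 0) names.Add("Failed");
        if ((flags & ReservedMask) != 0) names.Add("Reserved");
        return names.Count == 0 ? "None" : string.Join(", ", names);
    }
}
```

Because the flags occupy separate bits, more than one can be set at once, for example an active barrier that has also timed out.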
Generation
Barrier generation number (incremented after each barrier completion).
public int Generation
Field Value
- int
Remarks
Kernels wait for generation change to detect barrier completion.
Prevents ABA problem in wait loops.
Wraps around after int.MaxValue (about 2.1 billion barrier completions).
Modified using atomicAdd() in CUDA device code.
ParticipantCount
Number of kernels that must arrive at barrier.
public int ParticipantCount
Field Value
- int
Remarks
Valid range: 1-65535 kernels. Must be set before barrier use and remain constant during barrier lifetime.
Methods
Create(int)
Creates a multi-kernel barrier configured for the specified participant count.
public static MultiKernelBarrier Create(int participantCount)
Parameters
- participantCount int
Number of kernels that must arrive at barrier (1-65535).
Returns
- MultiKernelBarrier
Initialized barrier ready for GPU use.
Exceptions
- ArgumentOutOfRangeException
Thrown if participantCount is out of valid range.
CreateEmpty()
Creates an uninitialized multi-kernel barrier (all fields zero).
public static MultiKernelBarrier CreateEmpty()
Returns
- MultiKernelBarrier
Empty barrier suitable for GPU allocation.
Equals(MultiKernelBarrier)
Indicates whether the current object is equal to another object of the same type.
public readonly bool Equals(MultiKernelBarrier other)
Parameters
- other MultiKernelBarrier
An object to compare with this object.
Returns
- bool
true if the current object is equal to the other parameter; otherwise, false.
Equals(object?)
Indicates whether this instance and a specified object are equal.
public override readonly bool Equals(object? obj)
Parameters
- obj object
The object to compare with the current instance.
Returns
- bool
true if obj and this instance are the same type and represent the same value; otherwise, false.
GetHashCode()
Returns the hash code for this instance.
public override readonly int GetHashCode()
Returns
- int
A 32-bit signed integer that is the hash code for this instance.
IsActive()
Checks if barrier is active.
public readonly bool IsActive()
Returns
- bool
True if active flag is set, false otherwise.
IsFailed()
Checks if barrier has failed.
public readonly bool IsFailed()
Returns
- bool
True if failed flag is set, false otherwise.
IsTimedOut()
Checks if barrier has timed out.
public readonly bool IsTimedOut()
Returns
- bool
True if timeout flag is set, false otherwise.
Validate()
Validates the barrier structure for correctness.
public readonly bool Validate()
Returns
- bool
True if valid, false if any invariant is violated.
Remarks
Checks:
- ParticipantCount in valid range [1, 65535]
- ArrivedCount in valid range [0, ParticipantCount]
- Generation non-negative
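The three documented checks can be restated as a single predicate. This is a sketch of the invariants as listed, not the library's implementation; `BarrierInvariants` and its members are hypothetical names.

```csharp
using System;

// Hypothetical restatement of the checks listed for Validate().
public static class BarrierInvariants
{
    // Upper bound taken from the documented valid range [1, 65535].
    public const int MaxParticipants = 65535;

    public static bool AreSatisfied(int participantCount, int arrivedCount, int generation)
    {
        bool participantsInRange = participantCount >= 1 && participantCount <= MaxParticipants;
        bool arrivedInRange = arrivedCount >= 0 && arrivedCount <= participantCount;
        bool generationNonNegative = generation >= 0;
        return participantsInRange && arrivedInRange && generationNonNegative;
    }
}
```

For example, a barrier with 4 participants, 2 arrivals, and generation 7 satisfies all three checks, while an all-zero barrier fails the first one because its participant count is below 1.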
Operators
operator ==(MultiKernelBarrier, MultiKernelBarrier)
Equality operator.
public static bool operator ==(MultiKernelBarrier left, MultiKernelBarrier right)
Parameters
- left MultiKernelBarrier
- right MultiKernelBarrier
Returns
- bool
operator !=(MultiKernelBarrier, MultiKernelBarrier)
Inequality operator.
public static bool operator !=(MultiKernelBarrier left, MultiKernelBarrier right)
Parameters
- left MultiKernelBarrier
- right MultiKernelBarrier