
Struct MultiKernelBarrier

Namespace
DotCompute.Backends.CUDA.RingKernels
Assembly
DotCompute.Backends.CUDA.dll

Multi-kernel barrier for synchronizing persistent Ring Kernels across the GPU.

public struct MultiKernelBarrier : IEquatable<MultiKernelBarrier>
Implements

IEquatable<MultiKernelBarrier>

Remarks

Enables coordination of multiple Ring Kernels through a generation-based barrier protocol. Kernels wait at the barrier until all participants arrive, then proceed to the next generation.

Memory Layout (16 bytes, 4-byte aligned):

  • ParticipantCount: 4 bytes (number of kernels)
  • ArrivedCount: 4 bytes (atomic counter)
  • Generation: 4 bytes (barrier generation number)
  • Flags: 4 bytes (barrier state flags)
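As an illustration of the layout above, the following is a minimal host-side C++ mirror of the four fields with compile-time checks on size and alignment. The name `BarrierLayout` is hypothetical, and the field offsets assume the fields are stored in the order they are listed; neither is confirmed by this page.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical host-side mirror of the documented 16-byte layout.
// ASSUMPTION: fields are stored in the order the documentation lists them.
struct BarrierLayout {
    std::int32_t participant_count;  // number of kernels
    std::int32_t arrived_count;      // atomic arrival counter
    std::int32_t generation;         // barrier generation number
    std::int32_t flags;              // barrier state flags
};

// Compile-time checks of the documented size and alignment.
static_assert(sizeof(BarrierLayout) == 16, "layout should be 16 bytes");
static_assert(alignof(BarrierLayout) == 4, "layout should be 4-byte aligned");

// Offsets implied by the assumed field order.
static_assert(offsetof(BarrierLayout, participant_count) == 0, "offset 0");
static_assert(offsetof(BarrierLayout, arrived_count) == 4, "offset 4");
static_assert(offsetof(BarrierLayout, generation) == 8, "offset 8");
static_assert(offsetof(BarrierLayout, flags) == 12, "offset 12");
```

Checks of this kind can catch a mismatch between host and device definitions at build time rather than at run time.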

Protocol Phases:

  1. Arrival: Each kernel atomically increments arrived counter
  2. Wait: Kernels spin until generation changes (all arrived)
  3. Departure: Last kernel resets counter and increments generation
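The three phases above can be sketched on the host with `std::atomic` standing in for CUDA atomics such as `atomicAdd()`. This is a simplified model under stated assumptions, not the library's device code: `HostBarrier`, `arrive_and_wait`, and `run_rounds` are hypothetical names, CPU threads play the role of kernels, and timeout and failure handling are omitted.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical host-side model of the barrier state (not the DotCompute type).
struct HostBarrier {
    int participant_count;
    std::atomic<int> arrived_count{0};
    std::atomic<int> generation{0};
    explicit HostBarrier(int n) : participant_count(n) { assert(n >= 1); }
};

// One pass through the protocol for a single participant.
void arrive_and_wait(HostBarrier& b) {
    // Record the generation before arriving so a later change is detectable.
    const int my_generation = b.generation.load(std::memory_order_acquire);

    // Phase 1 (arrival): atomically increment the arrived counter.
    const int arrived =
        b.arrived_count.fetch_add(1, std::memory_order_acq_rel) + 1;

    if (arrived == b.participant_count) {
        // Phase 3 (departure): the last arrival resets the counter first,
        // then increments the generation, which releases the waiters.
        b.arrived_count.store(0, std::memory_order_relaxed);
        b.generation.fetch_add(1, std::memory_order_release);
    } else {
        // Phase 2 (wait): spin until the generation changes.
        while (b.generation.load(std::memory_order_acquire) == my_generation) {
            std::this_thread::yield();
        }
    }
}

// Runs `threads` participants through `rounds` barriers and returns the
// final generation, which equals `rounds` when every barrier completed.
int run_rounds(int threads, int rounds) {
    HostBarrier barrier(threads);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&barrier, rounds] {
            for (int r = 0; r < rounds; ++r) arrive_and_wait(barrier);
        });
    }
    for (auto& th : pool) th.join();
    return barrier.generation.load();
}
```

In this sketch the counter is reset before the generation is incremented, so a participant released by the generation change cannot re-arrive and have its increment overwritten by the reset.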

Barrier Scopes:

  • Thread-Block: `__syncthreads()` for all threads in block (~10ns)
  • Grid-Wide: Cooperative groups for all blocks in kernel (~1-10μs)
  • Multi-Kernel: This struct for all participating kernels (~10-100μs)

Fields

ArrivedCount

Atomic counter for arrived kernels (0 to ParticipantCount).

public int ArrivedCount

Field Value

int

Remarks

Atomically incremented by each arriving kernel. Reset to 0 by the last arriving kernel. Modified using `atomicAdd()` in CUDA device code.

FlagActive

Flag: Barrier is active and in use.

public const int FlagActive = 1

Field Value

int

FlagFailed

Flag: Barrier operation failed (participant crashed or disconnected).

public const int FlagFailed = 4

Field Value

int

FlagTimeout

Flag: Barrier operation timed out.

public const int FlagTimeout = 2

Field Value

int

Flags

Barrier state flags.

public int Flags

Field Value

int

Remarks

  • Bit 0: Active flag (barrier is in use)
  • Bit 1: Timeout flag (barrier timed out)
  • Bit 2: Failed flag (barrier failed)
  • Bits 3-31: Reserved for future use

Modified using `atomicOr()` in CUDA device code.
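The bit assignments above can be exercised with a small host-side sketch. The constant values come from FlagActive, FlagTimeout, and FlagFailed; the helper names `set_flag` and `has_flag` are hypothetical, and `fetch_or` is used here as a host analogue of `atomicOr()`.

```cpp
#include <atomic>
#include <cassert>

// Values match the documented FlagActive, FlagTimeout, and FlagFailed constants.
constexpr int kFlagActive  = 1;  // bit 0
constexpr int kFlagTimeout = 2;  // bit 1
constexpr int kFlagFailed  = 4;  // bit 2

// Sets one or more flag bits without disturbing the others.
// Returns the flags value as it was before the update.
inline int set_flag(std::atomic<int>& flags, int flag) {
    // Bits 3-31 are reserved, so this sketch rejects them.
    assert((flag & ~(kFlagActive | kFlagTimeout | kFlagFailed)) == 0);
    return flags.fetch_or(flag);
}

// Tests a single flag bit, in the spirit of IsActive(), IsTimedOut(),
// and IsFailed().
inline bool has_flag(int flags, int flag) {
    return (flags & flag) != 0;
}
```

Because each flag occupies its own bit, a barrier can be active and timed out at the same time, for example, and the atomic OR means concurrent writers do not lose each other's bits.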

Generation

Barrier generation number (incremented after each barrier completion).

public int Generation

Field Value

int

Remarks

Kernels wait for the generation to change to detect barrier completion, which prevents the ABA problem in wait loops. Wraps around at int.MaxValue (about 2.1 billion barriers). Modified using `atomicAdd()` in CUDA device code.
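One consequence of wraparound is that a wait loop is safer comparing generations for inequality than with a greater-than test. The host-side sketch below illustrates this, assuming the device-side add wraps in two's complement the way `std::atomic<int>::fetch_add` is defined to. The helper names are hypothetical, and how the library itself treats a wrapped (negative) generation is not stated on this page.

```cpp
#include <atomic>
#include <cassert>
#include <climits>

// Advances the generation by one. For std::atomic<int>, fetch_add is
// defined to wrap in two's complement, so INT_MAX advances to INT_MIN.
inline void advance_generation(std::atomic<int>& generation) {
    generation.fetch_add(1);
}

// Wrap-safe completion test: any change counts, not only an increase.
inline bool generation_changed(int observed, const std::atomic<int>& generation) {
    return generation.load() != observed;
}

// Exercises the wrap case: the change is detected by inequality even
// though the new value is numerically smaller than the observed one.
inline bool wrap_is_detected() {
    std::atomic<int> generation{INT_MAX};
    const int observed = generation.load();
    advance_generation(generation);
    assert(generation.load() == INT_MIN);  // wrapped, so `>` would miss it
    return generation_changed(observed, generation);
}
```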

ParticipantCount

Number of kernels that must arrive at barrier.

public int ParticipantCount

Field Value

int

Remarks

Valid range: 1-65535 kernels. Must be set before barrier use and remain constant during barrier lifetime.

Methods

Create(int)

Creates a multi-kernel barrier configured for the specified participant count.

public static MultiKernelBarrier Create(int participantCount)

Parameters

participantCount int

Number of kernels that must arrive at barrier (1-65535).

Returns

MultiKernelBarrier

Initialized barrier ready for GPU use.

Exceptions

ArgumentOutOfRangeException

Thrown if participantCount is outside the valid range (1-65535).
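The range check described above might look like the following host-side sketch. `BarrierState` and `create_barrier` are hypothetical names, `std::out_of_range` stands in for ArgumentOutOfRangeException, and zeroing the remaining fields is an assumption; this page does not say which flags, if any, Create sets initially.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical plain-data model of the barrier fields.
struct BarrierState {
    std::int32_t participant_count;
    std::int32_t arrived_count;
    std::int32_t generation;
    std::int32_t flags;
};

constexpr int kMaxParticipants = 65535;  // documented upper bound

// Builds a barrier for `participant_count` kernels, rejecting values
// outside the documented range of 1-65535.
inline BarrierState create_barrier(int participant_count) {
    if (participant_count < 1 || participant_count > kMaxParticipants) {
        throw std::out_of_range("participantCount must be in [1, 65535]");
    }
    // ASSUMPTION: counter, generation, and flags all start at zero.
    return BarrierState{participant_count, 0, 0, 0};
}
```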

CreateEmpty()

Creates an uninitialized multi-kernel barrier (all fields zero).

public static MultiKernelBarrier CreateEmpty()

Returns

MultiKernelBarrier

Empty barrier suitable for GPU allocation.

Equals(MultiKernelBarrier)

Indicates whether the current object is equal to another object of the same type.

public readonly bool Equals(MultiKernelBarrier other)

Parameters

other MultiKernelBarrier

An object to compare with this object.

Returns

bool

true if the current object is equal to the other parameter; otherwise, false.

Equals(object?)

Indicates whether this instance and a specified object are equal.

public override readonly bool Equals(object? obj)

Parameters

obj object

The object to compare with the current instance.

Returns

bool

true if obj and this instance are the same type and represent the same value; otherwise, false.

GetHashCode()

Returns the hash code for this instance.

public override readonly int GetHashCode()

Returns

int

A 32-bit signed integer that is the hash code for this instance.

IsActive()

Checks if barrier is active.

public readonly bool IsActive()

Returns

bool

True if active flag is set, false otherwise.

IsFailed()

Checks if barrier has failed.

public readonly bool IsFailed()

Returns

bool

True if failed flag is set, false otherwise.

IsTimedOut()

Checks if barrier has timed out.

public readonly bool IsTimedOut()

Returns

bool

True if timeout flag is set, false otherwise.

Validate()

Validates the barrier structure for correctness.

public readonly bool Validate()

Returns

bool

True if valid, false if any invariant is violated.

Remarks

Checks:

  • ParticipantCount in valid range [1, 65535]
  • ArrivedCount in valid range [0, ParticipantCount]
  • Generation non-negative
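
The three checks listed above can be expressed as a short predicate. This is a host-side sketch with hypothetical names (`BarrierSnapshot`, `validate`); it covers only the invariants documented here and may not match everything the library's Validate() examines.

```cpp
#include <cstdint>

// Hypothetical snapshot of the barrier fields copied back from the GPU.
struct BarrierSnapshot {
    std::int32_t participant_count;
    std::int32_t arrived_count;
    std::int32_t generation;
    std::int32_t flags;
};

// Returns true only if all three documented invariants hold.
inline bool validate(const BarrierSnapshot& b) {
    // ParticipantCount in valid range [1, 65535]
    if (b.participant_count < 1 || b.participant_count > 65535) return false;
    // ArrivedCount in valid range [0, ParticipantCount]
    if (b.arrived_count < 0 || b.arrived_count > b.participant_count) return false;
    // Generation non-negative
    if (b.generation < 0) return false;
    return true;
}
```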

Operators

operator ==(MultiKernelBarrier, MultiKernelBarrier)

Equality operator.

public static bool operator ==(MultiKernelBarrier left, MultiKernelBarrier right)

Parameters

left MultiKernelBarrier
right MultiKernelBarrier

Returns

bool

operator !=(MultiKernelBarrier, MultiKernelBarrier)

Inequality operator.

public static bool operator !=(MultiKernelBarrier left, MultiKernelBarrier right)

Parameters

left MultiKernelBarrier
right MultiKernelBarrier

Returns

bool