Interface IBarrierHandle
- Namespace: DotCompute.Abstractions.Barriers
- Assembly: DotCompute.Abstractions.dll
Represents a GPU barrier synchronization primitive for coordinating thread execution.
public interface IBarrierHandle : IDisposable
Remarks
A barrier handle provides a typed reference to a GPU synchronization barrier, enabling thread groups to coordinate execution and ensure memory consistency. Barriers are critical for algorithms requiring phased computation where all threads must complete one phase before any thread can proceed to the next.
Thread Safety: Barrier handles are thread-safe and can be used by multiple threads simultaneously. However, the barrier itself has specific synchronization semantics that must be respected (see Sync() method).
Lifetime: Barriers remain valid until explicitly disposed. For grid-level barriers, all participating threads must complete before disposal.
Usage Example:
var barrier = provider.CreateBarrier(BarrierScope.ThreadBlock, capacity: 256);
// In kernel: barrier.Sync(); // All 256 threads wait here
barrier.Dispose(); // Cleanup when done
Properties
BarrierId
Gets the unique identifier for this barrier.
int BarrierId { get; }
Property Value
- int
Remarks
Barrier IDs are unique within the lifetime of the barrier provider. Multiple barriers can exist simultaneously, each with a distinct ID.
ID Assignment:
- ThreadBlock barriers: ID corresponds to barrier register index (0-15)
- Grid barriers: ID is runtime-generated unique identifier
- Named barriers: ID matches user-specified name hash
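As a sketch (reusing the `provider` and `CreateBarrier` call from the usage example above; the assertion is purely illustrative), two barriers created from the same provider always carry distinct IDs:

```csharp
// Sketch: two block-level barriers created from the same provider.
// ThreadBlock barriers map to distinct barrier registers (0-15),
// so each handle reports its own ID.
IBarrierHandle phaseA = provider.CreateBarrier(BarrierScope.ThreadBlock, capacity: 256);
IBarrierHandle phaseB = provider.CreateBarrier(BarrierScope.ThreadBlock, capacity: 256);

System.Diagnostics.Debug.Assert(phaseA.BarrierId != phaseB.BarrierId);
```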
Capacity
Gets the maximum number of threads that can synchronize on this barrier.
int Capacity { get; }
Property Value
- int
Remarks
Capacity constraints vary by scope:
| Scope | Capacity constraint |
| --- | --- |
| ThreadBlock | ≤ block size (typically 1024 max) |
| Grid | ≤ grid size (all threads in kernel) |
| Warp | Fixed at 32 (warp size) |
| Tile | ≤ block size, user-specified |
Important: Exactly Capacity threads must call Sync() for the barrier to proceed. Calling Sync() with fewer than Capacity threads causes the arriving threads to block indefinitely (deadlock).
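For illustration (a sketch reusing the `CreateBarrier` signature from the usage example above; the 64-thread tile size is an assumption), the capacity must match the exact number of participating threads:

```csharp
// Sketch: a Tile barrier sized for one 64-thread tile of a 256-thread block.
var tileBarrier = provider.CreateBarrier(BarrierScope.Tile, capacity: 64);

// Kernel side: all 64 tile threads must reach this call.
// If even one thread skips it, the remaining 63 block forever.
tileBarrier.Sync();
```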
IsActive
Gets a value indicating whether this barrier is currently active (has waiting threads).
bool IsActive { get; }
Property Value
- bool
Remarks
A barrier is active when 0 < ThreadsWaiting < Capacity. Active barriers should not be disposed or reconfigured.
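A minimal host-side sketch of the guard described above:

```csharp
// Sketch: never dispose a barrier while threads are waiting at it.
if (!barrier.IsActive)
{
    barrier.Dispose();   // safe: no threads are blocked on this barrier
}
// else: let the current synchronization cycle finish, then dispose.
```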
Scope
Gets the synchronization scope of this barrier.
BarrierScope Scope { get; }
Property Value
- BarrierScope
Remarks
The scope determines which threads participate in synchronization: ThreadBlock, Grid, Warp, or Tile.
ThreadsWaiting
Gets the current number of threads waiting at the barrier.
int ThreadsWaiting { get; }
Property Value
- int
Remarks
This property enables monitoring barrier utilization and detecting deadlocks. Reading this value requires synchronization with the GPU and may introduce overhead (~1μs).
States:
- 0: No threads waiting (idle barrier)
- 0 < n < Capacity: Some threads blocked, waiting for others
- n = Capacity: All threads arrived, barrier about to release
Debugging: If ThreadsWaiting remains below Capacity indefinitely, a thread may have missed the barrier call (deadlock condition).
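The debugging pattern above can be sketched as a host-side watchdog (the 5-second timeout and 10 ms poll interval are illustrative choices, not part of the API):

```csharp
// Sketch: detect a stuck barrier by polling ThreadsWaiting.
// Each read synchronizes with the GPU (~1 microsecond of overhead).
var deadline = DateTime.UtcNow + TimeSpan.FromSeconds(5);
while (barrier.IsActive)
{
    if (DateTime.UtcNow > deadline)
    {
        Console.Error.WriteLine(
            $"Possible deadlock: {barrier.ThreadsWaiting}/{barrier.Capacity} " +
            $"threads waiting at barrier {barrier.BarrierId}");
        break;
    }
    System.Threading.Thread.Sleep(10);   // poll every 10 ms
}
```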
Methods
Reset()
Resets the barrier to its initial state, clearing waiting threads.
void Reset()
Remarks
⚠️ WARNING: Resetting an active barrier (with waiting threads) can cause deadlock or data races. Only reset when IsActive is false.
Reset is typically used between kernel launches or when reinitializing a barrier for a new computation phase.
Exceptions
- InvalidOperationException
Thrown when barrier is currently active (IsActive is true).
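A sketch of reusing one barrier across two launches (the `LaunchPhase` helpers and `device.Synchronize()` call are hypothetical; only IsActive and Reset() come from this interface):

```csharp
// Sketch: reset between kernel launches, honoring the IsActive guard.
LaunchPhase1(barrier);    // hypothetical kernel launch using the barrier
device.Synchronize();     // hypothetical: wait until phase 1 fully completes

if (!barrier.IsActive)
{
    barrier.Reset();      // arrival counter back to 0 for the next phase
}

LaunchPhase2(barrier);    // hypothetical second launch reusing the barrier
```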
Sync()
Synchronizes threads at the barrier, blocking until all threads arrive.
void Sync()
Examples
// Shared-memory synchronization pattern (CUDA-style kernel pseudocode)
__shared__ float sharedData[256];
sharedData[threadIdx.x] = input[threadIdx.x]; // Write phase
barrier.Sync();                               // Ensure all writes complete
float result = sharedData[neighbor];          // Read phase (safe)
Remarks
This method is called from GPU kernel code to synchronize threads. All threads in the barrier's scope must call Sync() for any thread to proceed.
Semantics:
- Thread arrives at barrier and increments arrival counter
- Thread blocks until counter reaches Capacity
- All threads released simultaneously when capacity reached
- Counter resets to 0 for next barrier cycle
Memory Ordering: Barriers provide acquire-release semantics:
- All memory operations before Sync() are visible to all threads after Sync()
- Shared memory reads after Sync() see all writes issued before Sync()
- Equivalent to a memory fence combined with synchronization
Deadlock Prevention:
- Ensure all threads execute barrier call (no conditional skipping)
- Match barrier calls across all divergent control paths
- Use same BarrierId for all threads in scope
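The first rule (no conditional skipping) can be sketched in kernel-style pseudocode (`DoWork` and `threadId` are illustrative names, not part of this interface):

```csharp
// WRONG: only threads with threadId < 128 reach Sync(); the rest of the
// scope never arrives, so the arriving threads deadlock.
//   if (threadId < 128) { DoWork(); barrier.Sync(); }

// RIGHT: make only the work conditional; every thread in scope calls Sync().
if (threadId < 128)
{
    DoWork();        // illustrative per-thread work
}
barrier.Sync();      // all threads arrive regardless of the branch
```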
Performance:
- ThreadBlock: ~10ns latency
- Grid: ~1-10μs latency
- Warp: ~1ns latency (lockstep execution)
Exceptions
- InvalidOperationException
Thrown when:
- Barrier has been disposed
- Thread is not part of the barrier's scope
- Grid barrier used without cooperative launch