Interface ICrossGpuBarrier
- Namespace
- DotCompute.Abstractions.Barriers
- Assembly
- DotCompute.Abstractions.dll
Cross-GPU barrier for synchronizing multiple devices in distributed Ring Kernel systems. Targets sub-10μs multi-device coordination in the P2P and event modes, with temporal ordering based on hybrid logical clock (HLC) timestamps.
public interface ICrossGpuBarrier : IDisposable
Remarks
Cross-GPU barriers provide three synchronization modes:
- P2P Memory Mode: GPU-GPU direct signaling via peer-to-peer memory writes (fastest)
- CUDA Event Mode: Event-based synchronization using cudaEventWaitExternal
- CPU Fallback Mode: Host-mediated synchronization for non-P2P capable systems
Performance Targets:
- P2P Mode: <2μs (direct GPU memory writes)
- Event Mode: <5μs (CUDA event synchronization)
- CPU Mode: <50μs (host-mediated roundtrip)
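The modes and targets above can be sketched as a small selection helper. This is only an illustration: `BarrierModeSketch`, `BarrierModeSelector`, and the two capability flags are hypothetical names, not part of DotCompute, and the member names of the real `CrossGpuBarrierMode` enum may differ.

```csharp
using System;

// Demo: a system without peer-to-peer access but with event support.
var mode = BarrierModeSelector.SelectMode(allPeersAccessible: false, eventsSupported: true);
Console.WriteLine($"{mode}: target < {BarrierModeSelector.TargetLatencyMicroseconds(mode)} μs");

// Hypothetical stand-in for CrossGpuBarrierMode; the real member names may differ.
enum BarrierModeSketch { P2PMemory, CudaEvent, CpuFallback }

static class BarrierModeSelector
{
    // Prefer the fastest mode the hardware supports, falling back to the host.
    public static BarrierModeSketch SelectMode(bool allPeersAccessible, bool eventsSupported) =>
        allPeersAccessible ? BarrierModeSketch.P2PMemory
        : eventsSupported ? BarrierModeSketch.CudaEvent
        : BarrierModeSketch.CpuFallback;

    // Latency targets from the list above (upper bounds, not measurements).
    public static double TargetLatencyMicroseconds(BarrierModeSketch mode) => mode switch
    {
        BarrierModeSketch.P2PMemory => 2,
        BarrierModeSketch.CudaEvent => 5,
        BarrierModeSketch.CpuFallback => 50,
        _ => throw new ArgumentOutOfRangeException(nameof(mode)),
    };
}
```

The order of preference here simply follows the latency targets; an actual implementation may weigh other factors when choosing a mode.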
Integration with HLC:
Each barrier arrival is timestamped with HLC to enable:
- Causal analysis of synchronization patterns
- Distributed debugging of barrier deadlocks
- Timeout detection with happened-before relationships
Properties
BarrierId
Gets the unique identifier for this barrier.
string BarrierId { get; }
Property Value
- string
Mode
Gets the synchronization mode used by this barrier.
CrossGpuBarrierMode Mode { get; }
Property Value
- CrossGpuBarrierMode
ParticipantCount
Gets the number of participating GPUs in this barrier.
int ParticipantCount { get; }
Property Value
- int
Methods
ArriveAndWaitAsync(int, HlcTimestamp, TimeSpan, CancellationToken)
Arrives at the barrier from the specified GPU and waits for all participants.
Task<CrossGpuBarrierResult> ArriveAndWaitAsync(int gpuId, HlcTimestamp arrivalTimestamp, TimeSpan timeout, CancellationToken cancellationToken = default)
Parameters
- gpuId (int): GPU device ID (0-based index).
- arrivalTimestamp (HlcTimestamp): HLC timestamp of arrival for causality tracking.
- timeout (TimeSpan): Maximum wait time before timing out.
- cancellationToken (CancellationToken): Cancellation token for aborting the wait.
Returns
- Task<CrossGpuBarrierResult>
Barrier result containing:
- Success/timeout/failure status
- Release timestamp (max HLC of all arrivals)
- Arrival timestamps from all participants
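The "max HLC of all arrivals" rule can be illustrated with a stand-in timestamp type. `HlcSketch` and `ReleaseTimestamp` are hypothetical; the field names of the real `HlcTimestamp` are not shown on this page, so the sketch assumes the usual HLC ordering of physical time first, then logical counter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Demo: arrival timestamps keyed by GPU ID.
var arrivals = new Dictionary<int, HlcSketch>
{
    [0] = new(1_000, 0),
    [1] = new(1_000, 3),
    [2] = new(990, 7),
};
Console.WriteLine(ReleaseTimestamp.Compute(arrivals.Values));

// Hypothetical stand-in for HlcTimestamp (assumed fields, not the real layout).
readonly record struct HlcSketch(long PhysicalTime, int LogicalCounter) : IComparable<HlcSketch>
{
    // Order by physical time, breaking ties with the logical counter.
    public int CompareTo(HlcSketch other)
    {
        int byPhysical = PhysicalTime.CompareTo(other.PhysicalTime);
        return byPhysical != 0 ? byPhysical : LogicalCounter.CompareTo(other.LogicalCounter);
    }
}

static class ReleaseTimestamp
{
    // The release timestamp is the latest arrival under the HLC ordering.
    // Throws InvalidOperationException for an empty sequence.
    public static HlcSketch Compute(IEnumerable<HlcSketch> arrivals) => arrivals.Max();
}
```

In the demo, the arrival from GPU 1 is the maximum: it ties with GPU 0 on physical time and wins on the logical counter, while GPU 2 has an earlier physical time despite its larger counter.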
Exceptions
- ArgumentOutOfRangeException
Thrown if gpuId is invalid.
- BarrierTimeoutException
Thrown if barrier times out.
- ObjectDisposedException
Thrown if barrier has been disposed.
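The arrive-and-wait pattern described for ArriveAndWaitAsync can be tried out with a host-only stand-in. `InProcessBarrierSketch` is hypothetical and much simpler than a real implementation: it uses plain `long` timestamps instead of `HlcTimestamp`, returns only the release timestamp instead of a `CrossGpuBarrierResult`, and surfaces timeouts as the `TimeoutException` thrown by `Task.WaitAsync` (.NET 6 or later) rather than `BarrierTimeoutException`.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Demo: three simulated devices arrive concurrently and are released together.
var barrier = new InProcessBarrierSketch(participantCount: 3);
long[] releases = await Task.WhenAll(Enumerable.Range(0, 3).Select(gpuId =>
    barrier.ArriveAndWaitAsync(gpuId, arrivalTime: 100 + gpuId, timeout: TimeSpan.FromSeconds(1))));
Console.WriteLine(string.Join(", ", releases));

// Hypothetical single-use, host-side barrier; not the DotCompute implementation.
sealed class InProcessBarrierSketch
{
    private readonly object _gate = new();
    private readonly long[] _arrivals;
    private readonly bool[] _hasArrived;
    private readonly TaskCompletionSource<long> _released =
        new(TaskCreationOptions.RunContinuationsAsynchronously);
    private int _arrivedCount;

    public InProcessBarrierSketch(int participantCount)
    {
        _arrivals = new long[participantCount];
        _hasArrived = new bool[participantCount];
    }

    public Task<long> ArriveAndWaitAsync(int gpuId, long arrivalTime, TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        if (gpuId < 0 || gpuId >= _arrivals.Length)
            throw new ArgumentOutOfRangeException(nameof(gpuId));
        lock (_gate)
        {
            if (_hasArrived[gpuId])
                throw new InvalidOperationException($"GPU {gpuId} has already arrived.");
            _hasArrived[gpuId] = true;
            _arrivals[gpuId] = arrivalTime;
            // The last arrival releases everyone with the latest arrival time.
            if (++_arrivedCount == _arrivals.Length)
                _released.TrySetResult(_arrivals.Max());
        }
        // Completes on release; throws TimeoutException if the timeout elapses first.
        return _released.Task.WaitAsync(timeout, cancellationToken);
    }
}
```

Every caller observes the same release value, which mirrors the release timestamp reported in the barrier result.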
ArriveAsync(int, HlcTimestamp)
Arrives at the barrier without waiting (split-phase barrier).
Task<CrossGpuBarrierPhase> ArriveAsync(int gpuId, HlcTimestamp arrivalTimestamp)
Parameters
- gpuId (int): GPU device ID.
- arrivalTimestamp (HlcTimestamp): HLC timestamp of arrival.
Returns
- Task<CrossGpuBarrierPhase>
Barrier phase token for later wait operation.
Exceptions
- ArgumentOutOfRangeException
Thrown if gpuId is invalid.
- ObjectDisposedException
Thrown if barrier has been disposed.
GetStatus()
Gets the current barrier status without blocking.
CrossGpuBarrierStatus GetStatus()
Returns
- CrossGpuBarrierStatus
Status including arrival count and HLC timestamps.
ResetAsync()
Resets the barrier for reuse (increments generation counter).
Task ResetAsync()
Returns
- Task
Remarks
All participants must have completed the current barrier phase before resetting. Reset increments the generation counter to prevent ABA problems in wait loops.
Exceptions
- InvalidOperationException
Thrown if barrier is still in use.
- ObjectDisposedException
Thrown if barrier has been disposed.
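The role of the generation counter in ResetAsync can be shown with a minimal host-side sketch. `GenerationBarrierSketch` is hypothetical and lock-based for clarity; a GPU-side wait loop would use atomics instead. Without the generation check, a waiter that polls only the arrival count could see it drop back to zero after a reset and conclude that its phase never completed.

```csharp
using System;

// Demo: the barrier completes and is reset before participant 0 polls.
var barrier = new GenerationBarrierSketch(participants: 2);
int phase = barrier.Arrive();   // participant 0 records its generation
barrier.Arrive();               // participant 1 completes the phase
barrier.Reset();                // arrival count returns to 0, generation advances
Console.WriteLine(barrier.IsComplete(phase)); // True

// Hypothetical sketch; not the DotCompute implementation.
sealed class GenerationBarrierSketch
{
    private readonly object _gate = new();
    private readonly int _participants;
    private int _arrived;
    private int _generation;

    public GenerationBarrierSketch(int participants) => _participants = participants;

    // Registers an arrival and returns the generation it belongs to.
    public int Arrive()
    {
        lock (_gate)
        {
            if (_arrived == _participants)
                throw new InvalidOperationException("Barrier is full; reset it first.");
            _arrived++;
            return _generation;
        }
    }

    // A phase is complete if all arrived, or if the barrier has since moved on.
    public bool IsComplete(int phaseGeneration)
    {
        lock (_gate)
            return _generation != phaseGeneration || _arrived == _participants;
    }

    // Mirrors the documented contract: reset only after the phase completes.
    public void Reset()
    {
        lock (_gate)
        {
            if (_arrived != _participants)
                throw new InvalidOperationException("Barrier is still in use.");
            _arrived = 0;
            _generation++;
        }
    }
}
```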
WaitAsync(CrossGpuBarrierPhase, TimeSpan, CancellationToken)
Waits for barrier completion using a phase token from prior arrival.
Task<CrossGpuBarrierResult> WaitAsync(CrossGpuBarrierPhase phase, TimeSpan timeout, CancellationToken cancellationToken = default)
Parameters
- phase (CrossGpuBarrierPhase): Phase token from ArriveAsync(int, HlcTimestamp).
- timeout (TimeSpan): Maximum wait time.
- cancellationToken (CancellationToken): Cancellation token.
Returns
- Task<CrossGpuBarrierResult>
Barrier result with release timestamp and participant arrivals.
Exceptions
- BarrierTimeoutException
Thrown if barrier times out.
- ObjectDisposedException
Thrown if barrier has been disposed.
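The split-phase pattern (arrive first, wait later) lets a participant overlap independent work with the wait. The sketch below is a host-only illustration: `SplitPhaseBarrierSketch` and `PhaseSketch` are hypothetical stand-ins for the barrier and for `CrossGpuBarrierPhase`, arrival is synchronous, HLC timestamps are omitted, and timeouts surface as the `TimeoutException` from `Task.WaitAsync` (.NET 6 or later).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var barrier = new SplitPhaseBarrierSketch(participantCount: 2);

// Phase 1: signal arrival without blocking and keep the token.
PhaseSketch phase0 = barrier.Arrive(gpuId: 0);
// ... device 0 could run independent work here while device 1 catches up ...
PhaseSketch phase1 = barrier.Arrive(gpuId: 1);

// Phase 2: wait for completion using the token from the arrival.
await barrier.WaitAsync(phase0, TimeSpan.FromSeconds(1));
await barrier.WaitAsync(phase1, TimeSpan.FromSeconds(1));
Console.WriteLine("both devices released");

// Hypothetical phase token; the real CrossGpuBarrierPhase may carry other data.
readonly record struct PhaseSketch(int GpuId, int Generation);

// Hypothetical single-generation barrier; not the DotCompute implementation.
sealed class SplitPhaseBarrierSketch
{
    private readonly int _participantCount;
    private readonly TaskCompletionSource _released =
        new(TaskCreationOptions.RunContinuationsAsynchronously);
    private int _arrived;

    public SplitPhaseBarrierSketch(int participantCount) => _participantCount = participantCount;

    public PhaseSketch Arrive(int gpuId)
    {
        if (gpuId < 0 || gpuId >= _participantCount)
            throw new ArgumentOutOfRangeException(nameof(gpuId));
        // The last arrival releases all current and future waiters.
        if (Interlocked.Increment(ref _arrived) == _participantCount)
            _released.TrySetResult();
        return new PhaseSketch(gpuId, Generation: 0);
    }

    public Task WaitAsync(PhaseSketch phase, TimeSpan timeout,
        CancellationToken cancellationToken = default)
    {
        if (phase.Generation != 0)
            throw new ArgumentException("Token belongs to another generation.", nameof(phase));
        return _released.Task.WaitAsync(timeout, cancellationToken);
    }
}
```

Compared with ArriveAndWaitAsync, the caller decides how much work to place between the two calls, which is the point of a split-phase barrier.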