Enum MemoryConsistencyModel

Namespace: DotCompute.Abstractions.Memory

Assembly: DotCompute.Abstractions.dll

Defines the memory consistency model for GPU kernel execution.

public enum MemoryConsistencyModel

Fields

Relaxed = 0

Relaxed memory consistency: no ordering guarantees between threads.

In the relaxed model, threads may observe each other's memory operations in any order unless they are explicitly synchronized with fences or atomic operations. This is the ordering GPU hardware provides by default.

Example:

// Initially A = 0, B = 0
Thread 1: A = 1; B = 2;
Thread 2: r1 = B; r2 = A;  // May see r1=2, r2=0 (reordering)

Performance: 1.0× baseline (no overhead).

Use When: Data-parallel algorithms with independent operations, no inter-thread communication, or manual fence management.
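As a rough CPU-side analogy (standard C++ `std::atomic`, not DotCompute's API or its generated GPU code), the sketch below shows a case where relaxed ordering is enough: workers only increment a shared counter and never pass data to each other. The function name `count_relaxed` is hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Each worker increments a shared counter with relaxed ordering.
// Relaxed atomics keep every increment atomic but promise nothing about
// the order in which other memory operations become visible.
std::uint64_t count_relaxed(unsigned workers, unsigned iterations) {
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> pool;
    pool.reserve(workers);
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&counter, iterations] {
            for (unsigned i = 0; i < iterations; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with each worker's completion, so the final
    // read below observes every increment.
    for (auto& t : pool) t.join();
    return counter.load(std::memory_order_relaxed);
}
```

Calling `count_relaxed(4, 10000)` yields 40000 because atomicity alone guarantees no lost updates; no ordering between the workers is required.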

ReleaseAcquire = 1

Release-Acquire memory consistency: causal ordering for synchronized operations.

Release-Acquire semantics ensure that:

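Release Store: All writes that precede the store in program order become visible to other threads no later than the store itself
Acquire Load: All reads and writes that follow the load in program order are performed after it, so they observe the data published by the matching release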
Causality: If Thread A releases X and Thread B acquires X, all of A's prior writes are visible to B

Example:

Thread 1: A = 1; B = 2; release_store(&flag, 1);  // Release
Thread 2: if (acquire_load(&flag)) r1 = A;       // Acquire, sees A=1

Implementation:

Release: Fence before atomic store
Acquire: Fence after atomic load
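The fence placement above can be sketched with standard C++ atomics as a CPU-side analogy. This is an illustration under stated assumptions, not DotCompute's generated code: `Mailbox`, `publish`, `consume`, and `send_and_receive` are hypothetical names, and the atomic accesses themselves are relaxed so that the explicit fences carry all of the ordering.

```cpp
#include <atomic>
#include <thread>

struct Mailbox {
    int payload = 0;            // plain, non-atomic data
    std::atomic<int> flag{0};   // publication flag
};

// Release side: fence first, then the atomic store.
void publish(Mailbox& box, int value) {
    box.payload = value;
    std::atomic_thread_fence(std::memory_order_release);
    box.flag.store(1, std::memory_order_relaxed);
}

// Acquire side: atomic load first, then the fence.
int consume(Mailbox& box) {
    while (box.flag.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();  // wait until the flag is published
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return box.payload;             // guaranteed to see the published value
}

// Runs one producer and one consumer and returns what the consumer read.
int send_and_receive(int value) {
    Mailbox box;
    int received = 0;
    std::thread consumer([&] { received = consume(box); });
    std::thread producer([&] { publish(box, value); });
    producer.join();
    consumer.join();
    return received;
}
```

Because the release fence precedes the flag store and the acquire fence follows the flag load, the write to `payload` happens before the read in `consume`, so `send_and_receive(42)` returns 42.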

Performance: 0.85× baseline (15% overhead from fences).

Use When: Producer-consumer patterns, message passing, distributed data structures, actor systems (Orleans.GpuBridge.Core).

Sequential = 2

Sequential consistency: total order visible to all threads.

Sequential consistency (SC) provides the strongest guarantee: all threads observe memory operations in the same global order, as if operations were interleaved on a single processor.

Example:

// Initially A = 0, B = 0
Thread 1: A = 1; B = 2;
Thread 2: r1 = B; r2 = A;
// SC guarantees: if r1=2, then r2=1 (never r2=0)

Implementation: Fence before and after every memory operation.

Performance: 0.60× baseline (40% overhead from pervasive fencing).

Use When: Algorithm correctness requires total order visibility, performance is secondary, or you are debugging relaxed-model race conditions.
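A pattern that needs a total order, and that release-acquire alone does not cover, is the store-buffering litmus test: each thread writes one variable and then reads the other. The sketch below is a CPU-side analogy using C++ `std::memory_order_seq_cst`, not DotCompute code, and `count_both_zero` is a hypothetical name. Under sequential consistency the two threads can never both read 0.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering litmus test `rounds` times and counts how often
// both threads read 0. With seq_cst operations that outcome is forbidden,
// because one of the two stores must come first in the single global order.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread t1([&] {
            x.store(1, std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_seq_cst);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        t1.join();
        t2.join();
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

With weaker orderings on the same accesses, both reads returning 0 becomes a permitted result, which is the kind of bug this model is meant to rule out.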

⚠️ Warning: Sequential consistency significantly impacts performance. Use it only when absolutely necessary, and consider Release-Acquire first.

Remarks

Memory consistency models specify the ordering guarantees for memory operations performed by different threads. Stronger models provide more intuitive semantics but impose higher performance costs.

Model Comparison:

Model	Guarantees and performance
Relaxed	No ordering guarantees. Threads may observe operations in any order. Maximum performance (1.0× baseline), minimal synchronization.
ReleaseAcquire	Causal ordering: release stores become visible to acquire loads. Good balance (0.85× baseline, 15% overhead), suitable for most algorithms.
Sequential	Total order: all threads observe operations in the same order. Strongest guarantees (0.60× baseline, 40% overhead), simplest reasoning.

Choosing a Model:

Relaxed: Data-parallel algorithms with no inter-thread dependencies
ReleaseAcquire: Producer-consumer patterns, message passing (recommended default)
Sequential: Complex algorithms requiring total order visibility
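The selection guidance can be written down as a small decision function. The sketch below is illustrative only: the enum mirrors the documented values but is not the DotCompute type, and `Workload`, `choose_model`, and `relative_performance` are hypothetical helpers. The performance factors are the figures quoted in the comparison above.

```cpp
// Mirrors the documented enum values; not the DotCompute type itself.
enum class MemoryConsistencyModel { Relaxed = 0, ReleaseAcquire = 1, Sequential = 2 };

// Hypothetical workload description used only for this sketch.
struct Workload {
    bool threads_communicate;  // e.g. producer-consumer, message passing
    bool needs_total_order;    // every thread must agree on one global order
};

// Picks the weakest model that still satisfies the workload's needs.
inline MemoryConsistencyModel choose_model(Workload w) {
    if (w.needs_total_order) return MemoryConsistencyModel::Sequential;
    if (w.threads_communicate) return MemoryConsistencyModel::ReleaseAcquire;
    return MemoryConsistencyModel::Relaxed;
}

// Relative performance factor for each model, as quoted in the comparison.
inline double relative_performance(MemoryConsistencyModel m) {
    switch (m) {
        case MemoryConsistencyModel::Relaxed:        return 1.0;
        case MemoryConsistencyModel::ReleaseAcquire: return 0.85;
        case MemoryConsistencyModel::Sequential:     return 0.60;
    }
    return 1.0;  // not reached for valid enum values
}
```

For example, `choose_model(Workload{true, false})` selects `ReleaseAcquire`, for which `relative_performance` returns 0.85.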
