Class PinnedStagingBuffer
- Namespace
- DotCompute.Core.Messaging
- Assembly
- DotCompute.Core.dll
Lock-free pinned memory buffer for staging messages before GPU transfer.
public sealed class PinnedStagingBuffer : IDisposable
- Inheritance
- object → PinnedStagingBuffer
- Implements
- IDisposable
Remarks
This buffer uses pinned (non-movable) memory to enable zero-copy DMA transfers to GPU via CUDA/OpenCL/Metal. It implements a lock-free multi-producer/single-consumer ring buffer for maximum throughput.
Performance Characteristics:
- Enqueue: O(1) amortized, lock-free CAS operations
- Dequeue: O(1) per message, single-consumer (pump thread)
- Memory: Pinned, non-GC heap (be careful with large capacities)
- Latency: Sub-microsecond for cache-resident operations
Usage Pattern:
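const int messageSize = 256;
const int batchSize = 16;
using var buffer = new PinnedStagingBuffer(capacity: 4096, messageSize: messageSize);

// Producer threads (lock-free); messageBytes must be exactly messageSize bytes
if (buffer.TryEnqueue(messageBytes))
{
    // Message staged successfully
}

// Consumer thread (pump service)
// 16 * 256 = 4 KB fits on the stack; rent a pooled array for larger batches
Span<byte> batch = stackalloc byte[batchSize * messageSize];
int count = buffer.DequeueBatch(batch, batchSize);
// Transfer the first count * messageSize bytes of 'batch' to the GPU
// via cuMemcpyHtoD/clEnqueueWriteBuffer/MTLBuffer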
Constructors
PinnedStagingBuffer(int, int)
Initializes a new instance of the PinnedStagingBuffer class.
public PinnedStagingBuffer(int capacity, int messageSize)
Parameters
- capacity int
Maximum number of messages the buffer can hold (must be a power of 2 and at least 2).
- messageSize int
Fixed size of each message in bytes (must be at least 1).
Exceptions
- ArgumentException
Thrown if capacity is not a power of 2 or is less than 2.
- ArgumentOutOfRangeException
Thrown if messageSize is less than 1.
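The power-of-2 requirement on capacity is the usual ring-buffer convention: it lets a monotonically increasing position be mapped to a slot with a bit mask instead of a division. The following is a minimal sketch of that check and mapping; RingIndexSketch and its members are hypothetical names for illustration and are not part of DotCompute.

```csharp
using System;

// Hypothetical helper for illustration only; not the DotCompute implementation.
public static class RingIndexSketch
{
    // True when value is a power of 2 and at least 2.
    // A power of 2 has exactly one bit set, so value & (value - 1) clears it to 0.
    public static bool IsValidCapacity(int value) =>
        value >= 2 && (value & (value - 1)) == 0;

    // Maps an ever-increasing position onto a slot index.
    // Only correct when capacity is a power of 2.
    public static int SlotFor(long position, int capacity) =>
        (int)(position & (capacity - 1));
}
```

For example, with a capacity of 4096, position 4097 wraps around to slot 1.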
Properties
BufferPointer
Gets a pointer to the pinned buffer for direct GPU access.
public nint BufferPointer { get; }
Property Value
- nint
Remarks
This pointer remains valid for the lifetime of the PinnedStagingBuffer. Use this for zero-copy DMA transfers to GPU via:
- CUDA: cuMemcpyHtoD(devicePtr, BufferPointer, size)
- OpenCL: clEnqueueWriteBuffer(queue, buffer, CL_TRUE, 0, size, BufferPointer, ...)
- Metal: copy from BufferPointer into [mtlBuffer contents] (calling didModifyRange: for managed buffers), or wrap BufferPointer with newBufferWithBytesNoCopy:length:options:deallocator: if the memory is page-aligned
Safety: Do not dereference this pointer after Dispose() is called.
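The lifetime rule above can be illustrated with a small unmanaged allocation whose address is stable until it is freed. This is only a sketch under assumptions: NativeBlockSketch is a hypothetical type, and using Marshal.AllocHGlobal is an assumption about how a non-GC buffer might be obtained, not a statement of how PinnedStagingBuffer allocates its memory. Memory obtained this way is not moved by the garbage collector, but it is not automatically page-locked from the GPU driver's point of view.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration only; not the DotCompute implementation.
public sealed class NativeBlockSketch : IDisposable
{
    private nint _pointer;

    public int Length { get; }

    // Address of the unmanaged block; throws once the block has been freed.
    public nint Pointer =>
        _pointer != 0 ? _pointer : throw new ObjectDisposedException(nameof(NativeBlockSketch));

    public NativeBlockSketch(int length)
    {
        _pointer = Marshal.AllocHGlobal(length); // unmanaged heap, never relocated by the GC
        Length = length;
    }

    // Copies source into the block at the given byte offset.
    public void Write(byte[] source, int offset)
    {
        if (offset < 0 || offset + source.Length > Length)
            throw new ArgumentOutOfRangeException(nameof(offset));
        Marshal.Copy(source, 0, Pointer + offset, source.Length);
    }

    // Not thread-safe; a sketch of the release step only.
    public void Dispose()
    {
        if (_pointer != 0)
        {
            Marshal.FreeHGlobal(_pointer);
            _pointer = 0; // later Pointer reads throw instead of returning a dangling address
        }
    }
}
```

Guarding the pointer getter after disposal turns a potential use-after-free into an ObjectDisposedException, which is the failure mode the safety note warns about.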
Capacity
Gets the maximum number of messages the buffer can hold.
public int Capacity { get; }
Property Value
- int
Count
Gets the current number of messages in the buffer.
public int Count { get; }
Property Value
- int
Remarks
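The count is approximate because producers and the consumer update the buffer concurrently without locks, so the value may already be stale when it is read. Use it for monitoring only.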
IsEmpty
Gets a value indicating whether the buffer is empty.
public bool IsEmpty { get; }
Property Value
- bool
IsFull
Gets a value indicating whether the buffer is full.
public bool IsFull { get; }
Property Value
- bool
MessageSize
Gets the fixed size of each message in bytes.
public int MessageSize { get; }
Property Value
- int
Methods
Clear()
Clears all messages from the buffer.
public void Clear()
Remarks
This operation is NOT thread-safe with concurrent enqueue/dequeue operations. Use only when the buffer is idle (e.g., during shutdown or reset).
DequeueBatch(Span<byte>, int)
Dequeues a batch of messages from the staging buffer.
public int DequeueBatch(Span<byte> destination, int maxMessages)
Parameters
- destination Span<byte>
The destination buffer to write messages to (must be at least maxMessages * MessageSize bytes).
- maxMessages int
Maximum number of messages to dequeue.
Returns
- int
The actual number of messages dequeued (0 if buffer is empty).
Remarks
This method is designed for single-consumer use (the pump thread). It is NOT thread-safe for multiple concurrent consumers.
Batching reduces per-message overhead for GPU transfers. Typical batch sizes:
- Low-latency: 1-8 messages (minimize latency)
- Balanced: 16-64 messages (balance latency vs throughput)
- High-throughput: 128-512 messages (maximize PCIe bandwidth)
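One way a pump thread might apply these batch sizes is sketched below. PumpSketch, DrainOnce, and the two delegate types are hypothetical names introduced for illustration; the dequeue delegate has the same shape as DequeueBatch(Span<byte>, int), and the upload delegate stands in for whatever GPU copy call the application uses. The scratch array is rented from ArrayPool because larger batches (for example 512 messages of 256 bytes) are too big to place comfortably on the stack.

```csharp
using System;
using System.Buffers;

// Hypothetical illustration only; not part of DotCompute.
public static class PumpSketch
{
    // Same shape as PinnedStagingBuffer.DequeueBatch(Span<byte>, int).
    public delegate int DequeueBatch(Span<byte> destination, int maxMessages);

    // Stand-in for the application's GPU transfer call.
    public delegate void Upload(ReadOnlySpan<byte> payload);

    // Dequeues up to batchSize messages and uploads only the bytes actually filled.
    // Returns the number of messages moved (0 if nothing was available).
    public static int DrainOnce(DequeueBatch dequeue, Upload upload, int messageSize, int batchSize)
    {
        int bytes = batchSize * messageSize;
        byte[] rented = ArrayPool<byte>.Shared.Rent(bytes); // may be larger than requested
        try
        {
            Span<byte> batch = rented.AsSpan(0, bytes);
            int count = dequeue(batch, batchSize);
            if (count > 0)
            {
                upload(batch.Slice(0, count * messageSize));
            }
            return count;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

With a real buffer, the call might look like PumpSketch.DrainOnce(buffer.DequeueBatch, payload => { /* GPU copy */ }, buffer.MessageSize, 64), invoked only from the single consumer thread.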
Exceptions
- ArgumentException
Thrown if destination is too small.
- ObjectDisposedException
Thrown if the buffer has been disposed.
Dispose()
Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources.
public void Dispose()
~PinnedStagingBuffer()
Finalizer to ensure pinned memory is released.
~PinnedStagingBuffer()
GetBuffer()
Gets a read-only span of the pinned buffer for direct GPU read access.
public ReadOnlySpan<byte> GetBuffer()
Returns
- ReadOnlySpan<byte>
A read-only span over the pinned buffer.
Remarks
Use this for zero-copy reads when the GPU can directly access host pinned memory. The span remains valid until Dispose() is called.
TryEnqueue(ReadOnlySpan<byte>)
Attempts to enqueue a message into the staging buffer.
public bool TryEnqueue(ReadOnlySpan<byte> message)
Parameters
- message ReadOnlySpan<byte>
The message bytes to enqueue (must be exactly MessageSize bytes).
Returns
- bool
true if the message was staged; otherwise false (for example, when the buffer is full).
Remarks
This method is lock-free and thread-safe for multiple concurrent producers. Uses Compare-And-Swap (CAS) to ensure only one producer claims each slot.
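The slot-claiming idea can be sketched with a bounded ring that uses per-slot sequence numbers, a common design for multi-producer queues. This is an illustrative assumption, not the actual PinnedStagingBuffer code: MpscRingSketch and all of its members are hypothetical, it stores data in a managed array rather than pinned memory, and it assumes the capacity passed in is a power of 2.

```csharp
using System;
using System.Threading;

// Hypothetical illustration only; not the DotCompute implementation.
public sealed class MpscRingSketch
{
    private readonly byte[] _data;
    private readonly long[] _sequence; // per-slot publication counter
    private readonly int _mask;
    private readonly int _messageSize;
    private long _tail; // next position producers will try to claim
    private long _head; // next position the single consumer will read

    // Assumes capacity is a power of 2 (not validated in this sketch).
    public MpscRingSketch(int capacity, int messageSize)
    {
        _data = new byte[capacity * messageSize];
        _sequence = new long[capacity];
        for (int i = 0; i < capacity; i++) _sequence[i] = i; // slot i is free for position i
        _mask = capacity - 1;
        _messageSize = messageSize;
    }

    // Safe for many concurrent producers.
    public bool TryEnqueue(ReadOnlySpan<byte> message)
    {
        if (message.Length != _messageSize)
            throw new ArgumentException("Message length must equal the message size.", nameof(message));
        while (true)
        {
            long pos = Volatile.Read(ref _tail);
            int slot = (int)(pos & _mask);
            long diff = Volatile.Read(ref _sequence[slot]) - pos;
            if (diff < 0) return false; // slot still holds an unconsumed message: buffer is full
            if (diff == 0 && Interlocked.CompareExchange(ref _tail, pos + 1, pos) == pos)
            {
                // The CAS succeeded, so this producer alone owns the slot.
                message.CopyTo(_data.AsSpan(slot * _messageSize, _messageSize));
                Volatile.Write(ref _sequence[slot], pos + 1); // publish to the consumer
                return true;
            }
            // Another producer claimed this position first; reload and retry.
        }
    }

    // Single consumer only.
    public bool TryDequeue(Span<byte> destination)
    {
        long pos = _head;
        int slot = (int)(pos & _mask);
        if (Volatile.Read(ref _sequence[slot]) != pos + 1) return false; // empty or not yet published
        _data.AsSpan(slot * _messageSize, _messageSize).CopyTo(destination);
        Volatile.Write(ref _sequence[slot], pos + _mask + 1); // free the slot for the next lap
        _head = pos + 1;
        return true;
    }
}
```

The compare-and-swap on the tail position decides which producer owns a slot, and the separate sequence write keeps the consumer from reading a slot before its bytes have been copied in.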
Exceptions
- ArgumentException
Thrown if the length of message does not match MessageSize.
- ObjectDisposedException
Thrown if the buffer has been disposed.