Class MemoryConfiguration
- Namespace
- DotCompute.Backends.OpenCL.Configuration
- Assembly
- DotCompute.Backends.OpenCL.dll
Configuration for OpenCL memory management, buffer pooling, and peer-to-peer transfers.
public sealed class MemoryConfiguration
- Inheritance
- object
MemoryConfiguration
Remarks
This configuration controls memory allocation strategies, buffer pooling behavior, and advanced features like pinned memory and peer-to-peer (P2P) transfers across multiple OpenCL devices.
Memory management is critical for GPU compute performance:
- Buffer pooling reduces allocation overhead (90%+ reduction possible)
- Proper alignment ensures coalesced memory access
- Pinned memory accelerates CPU-GPU transfers
- P2P transfers eliminate host memory copies between GPUs
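As a rough illustration of how these settings fit together, the sketch below picks one of the documented presets and also builds a custom configuration. It assumes that MemoryConfiguration has a public parameterless constructor, which this page does not confirm; the property names and the LowLatency preset are taken from the reference below, and the values are arbitrary examples chosen to satisfy the documented constraints.

```csharp
using System;
using DotCompute.Backends.OpenCL.Configuration;

// Documented preset: a configuration tuned for low-latency workloads.
MemoryConfiguration preset = MemoryConfiguration.LowLatency;

// Assumption: a public parameterless constructor exists, so the init-only
// properties can be set with an object initializer.
var custom = new MemoryConfiguration
{
    EnableBufferPooling = true,
    SmallBufferThreshold = 8 * 1024,              // 8 KB
    MediumBufferThreshold = 128 * 1024,           // 128 KB, must exceed the small threshold
    LargeBufferThreshold = 4 * 1024 * 1024,       // 4 MB, must exceed the medium threshold
    MaximumPoolSize = 512L * 1024 * 1024,         // 512 MB
    BufferAlignment = 256,                        // must be a power of 2
    BufferIdleTimeout = TimeSpan.FromMinutes(5),  // must be greater than TimeSpan.Zero
    EnableStatistics = true
};

Console.WriteLine(preset.EnableBufferPooling);
Console.WriteLine(custom.MaximumPoolSize);
```

Because the properties are init-only, a configuration cannot be modified after construction; a different setting requires a new instance.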
Properties
AsyncTransferThreshold
Gets the minimum size at which memory transfers are performed asynchronously (in bytes).
public int AsyncTransferThreshold { get; init; }
Property Value
- int
The async transfer threshold. Must be greater than 0. Default is 64 KB (65536 bytes).
Remarks
Transfers smaller than this size are performed synchronously to avoid async overhead. Larger transfers use asynchronous operations with events for better pipelining and CPU-GPU overlap. Typical range: 32 KB - 256 KB.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than 1.
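The threshold rule described in the remarks can be sketched as a simple size comparison. TransferPolicy is a hypothetical helper written for illustration, not a DotCompute type, and treating a transfer exactly at the threshold as asynchronous is an assumption, since the remarks only cover "smaller" and "larger" sizes.

```csharp
using System;

// Hypothetical helper for illustration; not part of DotCompute.
public static class TransferPolicy
{
    // Documented default: 64 KB.
    public const int DefaultAsyncTransferThreshold = 64 * 1024;

    // Returns true when a transfer of the given size would take the asynchronous path.
    // Assumption: a transfer exactly at the threshold counts as asynchronous.
    public static bool UseAsync(long sizeInBytes, int asyncTransferThreshold = DefaultAsyncTransferThreshold)
    {
        if (asyncTransferThreshold < 1)
            throw new ArgumentOutOfRangeException(nameof(asyncTransferThreshold));
        if (sizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        return sizeInBytes >= asyncTransferThreshold;
    }
}
```

With the default threshold, `TransferPolicy.UseAsync(4096)` selects the synchronous path and `TransferPolicy.UseAsync(1 << 20)` selects the asynchronous one.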
BufferAlignment
Gets the memory alignment for buffer allocations (in bytes).
public int BufferAlignment { get; init; }
Property Value
- int
The alignment. Must be a power of 2 and at least 16. Default is 128 bytes.
Remarks
Proper alignment ensures coalesced memory access on GPUs:
- NVIDIA: 128-byte alignment for coalescing
- AMD: 256-byte alignment for optimal access
- Intel: 64-byte alignment (cache line size)
This value affects the padding added to buffer allocations.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is not a power of 2 or less than 16.
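To make the padding effect concrete, the sketch below validates an alignment against the documented rule (a power of 2, at least 16) and rounds a requested size up to the next multiple of it. AlignmentHelper is a hypothetical helper, and rounding up is an assumption about how the padding is computed; the library's actual allocation logic is not shown on this page.

```csharp
using System;

// Hypothetical helper for illustration; not part of DotCompute.
public static class AlignmentHelper
{
    // Mirrors the documented constraint: a power of 2 that is at least 16.
    public static bool IsValidAlignment(int alignment) =>
        alignment >= 16 && (alignment & (alignment - 1)) == 0;

    // Rounds sizeInBytes up to the next multiple of alignment.
    // The bit mask trick is only valid because alignment is a power of 2.
    public static long AlignUp(long sizeInBytes, int alignment)
    {
        if (!IsValidAlignment(alignment))
            throw new ArgumentOutOfRangeException(nameof(alignment));
        if (sizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        long mask = alignment - 1;
        return (sizeInBytes + mask) & ~mask;
    }
}
```

Under this assumption, a 1000-byte request with the default 128-byte alignment occupies 1024 bytes, so larger alignments trade some wasted memory for better access patterns.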
BufferIdleTimeout
Gets the idle timeout for pooled buffers.
public TimeSpan BufferIdleTimeout { get; init; }
Property Value
- TimeSpan
The idle timeout. Must be greater than TimeSpan.Zero. Default is 10 minutes.
Remarks
Buffers that have been idle in the pool longer than this duration may be released to free device memory. This prevents memory leaks from pooling while maintaining performance for active workloads.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than or equal to TimeSpan.Zero.
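One way to picture idle trimming is a periodic pass that selects every pooled buffer whose last use is older than the timeout. The IdleEntry record and IdleTrimmer class below are hypothetical names used only for this sketch; how and when DotCompute actually scans the pool is not documented here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool entry for illustration: buffer size and time of last use.
public sealed record IdleEntry(long SizeInBytes, DateTime LastUsedUtc);

// Hypothetical helper for illustration; not part of DotCompute.
public static class IdleTrimmer
{
    // Returns the entries that have been idle for longer than idleTimeout.
    public static List<IdleEntry> SelectExpired(
        IEnumerable<IdleEntry> pool, DateTime nowUtc, TimeSpan idleTimeout)
    {
        if (idleTimeout <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(idleTimeout));

        var expired = new List<IdleEntry>();
        foreach (var entry in pool)
        {
            if (nowUtc - entry.LastUsedUtc > idleTimeout)
                expired.Add(entry);
        }
        return expired;
    }
}
```

With the default 10-minute timeout, a buffer last used 15 minutes ago would be selected for release, while one used 2 minutes ago would stay in the pool.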
Default
Creates a default memory configuration instance.
public static MemoryConfiguration Default { get; }
Property Value
- MemoryConfiguration
A new MemoryConfiguration instance with default settings.
Development
Creates a memory configuration optimized for development and debugging.
public static MemoryConfiguration Development { get; }
Property Value
- MemoryConfiguration
A new MemoryConfiguration instance optimized for development.
Remarks
This configuration has the following characteristics:
- Smaller pool sizes to track allocations
- Statistics enabled for monitoring
- Shorter idle timeouts to detect leaks quickly
- Disabled optimizations for easier debugging
- More conservative settings overall
EnableBufferPooling
Gets a value indicating whether to enable memory buffer pooling.
public bool EnableBufferPooling { get; init; }
Property Value
- bool
true to enable pooling; otherwise, false. Default is true.
Remarks
Buffer pooling reuses allocated memory to avoid clCreateBuffer overhead. This can reduce allocation latency by 90%+ and minimize memory fragmentation. Disable only for debugging or when precise memory tracking is required.
EnablePeerToPeerTransfers
Gets a value indicating whether to enable peer-to-peer (P2P) transfers.
public bool EnablePeerToPeerTransfers { get; init; }
Property Value
- bool
true to enable P2P; otherwise, false. Default is true.
Remarks
P2P transfers allow direct GPU-to-GPU memory copies without going through host memory. This requires:
- Multiple GPUs in the same system
- Driver support (NVIDIA: NVLink/PCIe P2P, AMD: Infinity Fabric/PCIe P2P)
- OpenCL extension: cl_khr_p2p_buffer
When unavailable, transfers automatically fall back to host-mediated copies.
EnableStatistics
Gets a value indicating whether to enable memory usage statistics collection.
public bool EnableStatistics { get; init; }
Property Value
- bool
true to collect statistics; otherwise, false. Default is false.
Remarks
When enabled, tracks:
- Total allocated memory
- Pool hit/miss rates
- Buffer utilization
- Memory fragmentation
Statistics collection adds minimal overhead (~1%) but provides valuable monitoring data. Enable it for development and performance tuning; it can be disabled in production.
EnableZeroCopy
Gets a value indicating whether to enable zero-copy buffers where supported.
public bool EnableZeroCopy { get; init; }
Property Value
- bool
true to enable zero-copy; otherwise, false. Default is true.
Remarks
Zero-copy buffers (CL_MEM_USE_HOST_PTR) allow the device to directly access host memory without explicit copies. This is beneficial for:
- Integrated GPUs with shared memory (AMD APUs, Intel integrated graphics)
- Buffers that are infrequently accessed by the device
- Write-only or read-only patterns
Not recommended for discrete GPUs where PCIe bandwidth limits performance.
LargeBufferThreshold
Gets the size threshold for large memory buffers (in bytes).
public int LargeBufferThreshold { get; init; }
Property Value
- int
The large buffer threshold. Must be greater than MediumBufferThreshold. Default is 1 MB (1048576 bytes).
Remarks
Buffers between medium and large thresholds use the large buffer pool. Buffers larger than this threshold are not pooled to avoid excessive memory consumption. Typical range: 512 KB - 16 MB.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than or equal to MediumBufferThreshold.
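The three size thresholds together partition allocations into pools. The sketch below classifies a requested size using the documented defaults (4 KB, 64 KB, 1 MB). BufferPoolKind and PoolSelector are hypothetical names, and the handling of sizes exactly equal to a threshold is an assumption, since the remarks only say "smaller than", "between", and "larger than".

```csharp
using System;

// Hypothetical classification for illustration; not part of DotCompute.
public enum BufferPoolKind { Small, Medium, Large, Unpooled }

public static class PoolSelector
{
    // Picks a pool for a buffer of the given size.
    // Assumption: a size equal to the medium or large threshold stays in that pool.
    public static BufferPoolKind Classify(
        long sizeInBytes,
        int smallThreshold = 4 * 1024,
        int mediumThreshold = 64 * 1024,
        int largeThreshold = 1024 * 1024)
    {
        // Mirrors the documented ordering: small < medium < large.
        if (smallThreshold <= 0 || mediumThreshold <= smallThreshold || largeThreshold <= mediumThreshold)
            throw new ArgumentOutOfRangeException(nameof(smallThreshold), "Thresholds must be positive and strictly increasing.");
        if (sizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        if (sizeInBytes < smallThreshold) return BufferPoolKind.Small;
        if (sizeInBytes <= mediumThreshold) return BufferPoolKind.Medium;
        if (sizeInBytes <= largeThreshold) return BufferPoolKind.Large;
        return BufferPoolKind.Unpooled; // larger than the large threshold: allocated directly
    }
}
```

Under these assumptions, a 1 KB buffer lands in the small pool, a 16 KB buffer in the medium pool, a 512 KB buffer in the large pool, and a 4 MB buffer bypasses pooling.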
LargeBuffers
Creates a memory configuration optimized for large buffer workloads.
public static MemoryConfiguration LargeBuffers { get; }
Property Value
- MemoryConfiguration
A new MemoryConfiguration instance optimized for large buffers.
Remarks
This configuration has the following characteristics:
- Higher buffer size thresholds
- Larger maximum pool size (2 GB)
- Longer idle timeouts to keep buffers warm
- Aggressive async transfer threshold
- Suitable for workloads with many large allocations (ML training, simulation)
LowLatency
Creates a memory configuration optimized for low-latency workloads.
public static MemoryConfiguration LowLatency { get; }
Property Value
- MemoryConfiguration
A new MemoryConfiguration instance optimized for low latency.
Remarks
This configuration has the following characteristics:
- Aggressive pooling with warm caches
- Pinned memory for fast transfers
- Small async threshold for maximum overlap
- Zero-copy for integrated GPUs
- Suitable for interactive applications and streaming workloads
MaximumPoolSize
Gets the maximum total size of pooled memory (in bytes).
public long MaximumPoolSize { get; init; }
Property Value
- long
The maximum pool size. Must be at least 1 MB (1048576 bytes). Default is 1 GB (1073741824 bytes).
Remarks
This limits the total amount of memory that can be pooled to prevent over-commitment of device memory. When this limit is reached, older buffers are released to make room for new allocations. Typical values: 25-50% of available device memory.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than 1 MB.
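The cap can be pictured as a byte budget: when returning a buffer would exceed it, older entries are released first. BoundedPoolSketch below is a hypothetical, simplified model that tracks sizes only and evicts in insertion order; the remarks say "older buffers are released" but do not specify the exact eviction policy, so the strict oldest-first order is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model for illustration; not part of DotCompute.
public sealed class BoundedPoolSketch
{
    private readonly long _maximumPoolSize;
    private readonly Queue<long> _sizes = new(); // buffer sizes, oldest first

    public long PooledBytes { get; private set; }
    public int Count => _sizes.Count;
    public int ReleasedCount { get; private set; }

    public BoundedPoolSketch(long maximumPoolSize)
    {
        // Mirrors the documented lower bound of 1 MB.
        if (maximumPoolSize < 1024 * 1024)
            throw new ArgumentOutOfRangeException(nameof(maximumPoolSize));
        _maximumPoolSize = maximumPoolSize;
    }

    // Adds a buffer to the pool, releasing the oldest entries until it fits.
    // Returns false when the buffer is larger than the entire budget.
    public bool TryAdd(long sizeInBytes)
    {
        if (sizeInBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));
        if (sizeInBytes > _maximumPoolSize)
            return false;

        while (PooledBytes + sizeInBytes > _maximumPoolSize)
        {
            PooledBytes -= _sizes.Dequeue(); // release the oldest pooled buffer
            ReleasedCount++;
        }

        _sizes.Enqueue(sizeInBytes);
        PooledBytes += sizeInBytes;
        return true;
    }
}
```

For example, with a 4 MB budget that already holds two 2 MB buffers, adding a 1 MB buffer releases the oldest 2 MB entry and leaves 3 MB pooled.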
MediumBufferThreshold
Gets the size threshold for medium memory buffers (in bytes).
public int MediumBufferThreshold { get; init; }
Property Value
- int
The medium buffer threshold. Must be greater than SmallBufferThreshold. Default is 64 KB (65536 bytes).
Remarks
Buffers between small and medium thresholds use the medium buffer pool. Common for intermediate computations and temporary storage. Typical range: 32 KB - 1 MB.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than or equal to SmallBufferThreshold.
SmallBufferThreshold
Gets the size threshold for small memory buffers (in bytes).
public int SmallBufferThreshold { get; init; }
Property Value
- int
The small buffer threshold. Must be at least 256 bytes. Default is 4 KB (4096 bytes).
Remarks
Buffers smaller than this size are allocated from the small buffer pool. Common for kernel parameters, indices, and small working sets. Typical range: 1 KB - 64 KB.
Exceptions
- ArgumentOutOfRangeException
Thrown when the value is less than 256.
UsePinnedMemory
Gets a value indicating whether to use pinned (page-locked) host memory.
public bool UsePinnedMemory { get; init; }
Property Value
- bool
true to use pinned memory; otherwise, false. Default is true.
Remarks
Pinned memory (CL_MEM_ALLOC_HOST_PTR) is page-locked and can be transferred to/from the device faster than regular memory. However, it's a limited resource and excessive pinning can degrade system performance. Use for frequently transferred buffers in performance-critical paths.