Class MemoryConfiguration

Namespace
DotCompute.Backends.OpenCL.Configuration
Assembly
DotCompute.Backends.OpenCL.dll

Configuration for OpenCL memory management, buffer pooling, and peer-to-peer transfers.

public sealed class MemoryConfiguration
Inheritance
MemoryConfiguration

Remarks

This configuration controls memory allocation strategies, buffer pooling behavior, and advanced features like pinned memory and peer-to-peer (P2P) transfers across multiple OpenCL devices.

Memory management is critical for GPU compute performance:

  • Buffer pooling reduces allocation overhead (90%+ reduction possible)
  • Proper alignment ensures coalesced memory access
  • Pinned memory accelerates CPU-GPU transfers
  • P2P transfers eliminate host memory copies between GPUs
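As a rough illustration of how these settings fit together, the following sketch builds a custom configuration from the init-only properties documented on this page. It assumes that MemoryConfiguration has a public parameterless constructor, which this page does not confirm. The specific values are illustrative only.

```csharp
using System;
using DotCompute.Backends.OpenCL.Configuration;

// Assumption: MemoryConfiguration exposes a public parameterless constructor,
// so its init-only properties can be set with an object initializer.
var config = new MemoryConfiguration
{
    // Thresholds are assigned in ascending order (small < medium < large)
    // so that each value satisfies its documented constraint when it is set.
    SmallBufferThreshold = 8 * 1024,            // 8 KB
    MediumBufferThreshold = 128 * 1024,         // 128 KB
    LargeBufferThreshold = 4 * 1024 * 1024,     // 4 MB
    MaximumPoolSize = 2L * 1024 * 1024 * 1024,  // 2 GB
    BufferIdleTimeout = TimeSpan.FromMinutes(5),
    EnableStatistics = true                     // illustrative; default is false
};
```

If the constructor is not available, the static presets (Default, Development, LargeBuffers, LowLatency) remain the documented way to obtain an instance.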

Properties

AsyncTransferThreshold

Gets the minimum size (in bytes) at which memory transfers are performed asynchronously.

public int AsyncTransferThreshold { get; init; }

Property Value

int

The async transfer threshold. Must be greater than 0. Default is 64 KB (65536 bytes).

Remarks

Transfers smaller than this size are performed synchronously to avoid async overhead. Larger transfers use asynchronous operations with events for better pipelining and CPU-GPU overlap. Typical range: 32 KB - 256 KB.
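The rule described above can be sketched as a simple size comparison. ShouldTransferAsync is a hypothetical helper and not part of DotCompute. Treating a transfer exactly at the threshold as asynchronous is an assumption, since the remarks only cover the smaller and larger cases.

```csharp
using System;

// Hypothetical helper, not part of DotCompute: picks the transfer mode
// for a payload of sizeInBytes given an async transfer threshold.
static bool ShouldTransferAsync(long sizeInBytes, int asyncTransferThreshold)
{
    // Assumption: a transfer exactly at the threshold is treated as asynchronous.
    return sizeInBytes >= asyncTransferThreshold;
}

const int threshold = 64 * 1024; // documented default: 65536 bytes

Console.WriteLine(ShouldTransferAsync(16 * 1024, threshold));   // False (synchronous)
Console.WriteLine(ShouldTransferAsync(1024 * 1024, threshold)); // True (asynchronous)
```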

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than 1.

BufferAlignment

Gets the memory alignment for buffer allocations (in bytes).

public int BufferAlignment { get; init; }

Property Value

int

The alignment. Must be a power of 2 and at least 16. Default is 128 bytes.

Remarks

Proper alignment ensures coalesced memory access on GPUs:

  • NVIDIA: 128-byte alignment for coalescing
  • AMD: 256-byte alignment for optimal access
  • Intel: 64-byte alignment (cache line size)

This value affects the padding added to buffer allocations.
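To make the padding effect concrete, the sketch below rounds a requested size up to the next multiple of a power-of-2 alignment. IsPowerOfTwo and AlignUp are hypothetical helpers. How DotCompute actually pads allocations is not shown on this page, so this only illustrates the arithmetic.

```csharp
using System;

// Hypothetical helpers, not part of DotCompute.
static bool IsPowerOfTwo(int value) => value > 0 && (value & (value - 1)) == 0;

static long AlignUp(long sizeInBytes, int alignment)
{
    if (!IsPowerOfTwo(alignment))
        throw new ArgumentOutOfRangeException(nameof(alignment), "Alignment must be a power of 2.");

    // For a power-of-2 alignment, adding (alignment - 1) and clearing the
    // low bits rounds up to the next multiple of the alignment.
    long mask = alignment - 1L;
    return (sizeInBytes + mask) & ~mask;
}

Console.WriteLine(AlignUp(1100, 64));  // 1152
Console.WriteLine(AlignUp(1100, 128)); // 1152
Console.WriteLine(AlignUp(1100, 256)); // 1280
```

A larger alignment can waste more memory per buffer (here 180 bytes of padding at 256 versus 52 bytes at 64), which matters most for pools of many small buffers.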

Exceptions

ArgumentOutOfRangeException

Thrown when the value is not a power of 2 or less than 16.

BufferIdleTimeout

Gets the idle timeout for pooled buffers.

public TimeSpan BufferIdleTimeout { get; init; }

Property Value

TimeSpan

The idle timeout. Must be greater than TimeSpan.Zero. Default is 10 minutes.

Remarks

Buffers that have been idle in the pool longer than this duration may be released to free device memory. This prevents memory leaks from pooling while maintaining performance for active workloads.
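A minimal sketch of the idle check, assuming each pooled buffer records the time it was last used. FindIdleBuffers is a hypothetical helper. The pool's real bookkeeping and release schedule are not documented here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of DotCompute: returns the indices of pooled
// buffers that have been idle for longer than the timeout.
static List<int> FindIdleBuffers(IReadOnlyList<DateTime> lastUsedUtc, DateTime nowUtc, TimeSpan idleTimeout)
{
    var idle = new List<int>();
    for (int i = 0; i < lastUsedUtc.Count; i++)
    {
        // Strictly "longer than" the timeout, matching the wording above.
        if (nowUtc - lastUsedUtc[i] > idleTimeout)
            idle.Add(i);
    }
    return idle;
}

var now = new DateTime(2024, 1, 1, 12, 0, 0, DateTimeKind.Utc);
var lastUsed = new[] { now.AddMinutes(-2), now.AddMinutes(-15), now.AddMinutes(-10) };

// With the documented default of 10 minutes, only the buffer idle for 15 minutes qualifies.
var expired = FindIdleBuffers(lastUsed, now, TimeSpan.FromMinutes(10));
Console.WriteLine(string.Join(", ", expired)); // 1
```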

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than or equal to TimeSpan.Zero.

Default

Creates a default memory configuration instance.

public static MemoryConfiguration Default { get; }

Property Value

MemoryConfiguration

A new MemoryConfiguration instance with default settings.

Development

Creates a memory configuration optimized for development and debugging.

public static MemoryConfiguration Development { get; }

Property Value

MemoryConfiguration

A new MemoryConfiguration instance optimized for development.

Remarks

This configuration:

  • Smaller pool sizes to track allocations
  • Statistics enabled for monitoring
  • Shorter idle timeouts to detect leaks quickly
  • Disabled optimizations for easier debugging
  • More conservative settings overall

EnableBufferPooling

Gets a value indicating whether to enable memory buffer pooling.

public bool EnableBufferPooling { get; init; }

Property Value

bool

true to enable pooling; otherwise, false. Default is true.

Remarks

Buffer pooling reuses allocated memory to avoid clCreateBuffer overhead. This can reduce allocation latency by 90%+ and minimize memory fragmentation. Disable only for debugging or when precise memory tracking is required.

EnablePeerToPeerTransfers

Gets a value indicating whether to enable peer-to-peer (P2P) transfers.

public bool EnablePeerToPeerTransfers { get; init; }

Property Value

bool

true to enable P2P; otherwise, false. Default is true.

Remarks

P2P transfers allow direct GPU-to-GPU memory copies without going through host memory. This requires:

  • Multiple GPUs in the same system
  • Driver support (NVIDIA: NVLink/PCIe P2P, AMD: Infinity Fabric/PCIe P2P)
  • OpenCL extension: cl_khr_p2p_buffer

When unavailable, transfers automatically fall back to host-mediated copies.
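The extension requirement can be checked against a device's space-separated extension string (the value OpenCL reports for CL_DEVICE_EXTENSIONS). The sketch below uses hypothetical helpers and sample strings. Requiring the extension on both devices is an assumption, and the backend's actual detection logic may differ.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of DotCompute: exact token match against a
// space-separated OpenCL extension string.
static bool HasExtension(string deviceExtensions, string extensionName) =>
    deviceExtensions
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .Contains(extensionName, StringComparer.Ordinal);

// Assumption: both devices must report the extension named above.
static bool CanUsePeerToPeer(bool enablePeerToPeerTransfers, string sourceExtensions, string targetExtensions)
{
    const string p2pExtension = "cl_khr_p2p_buffer";
    return enablePeerToPeerTransfers
        && HasExtension(sourceExtensions, p2pExtension)
        && HasExtension(targetExtensions, p2pExtension);
}

// Sample extension strings for illustration only.
string gpu0 = "cl_khr_fp64 cl_khr_p2p_buffer";
string gpu1 = "cl_khr_fp64";

Console.WriteLine(CanUsePeerToPeer(true, gpu0, gpu0)); // True
Console.WriteLine(CanUsePeerToPeer(true, gpu0, gpu1)); // False (host-mediated fallback)
```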

EnableStatistics

Gets a value indicating whether to enable memory usage statistics collection.

public bool EnableStatistics { get; init; }

Property Value

bool

true to collect statistics; otherwise, false. Default is false.

Remarks

When enabled, tracks:

  • Total allocated memory
  • Pool hit/miss rates
  • Buffer utilization
  • Memory fragmentation

This adds minimal overhead (~1%) but provides valuable monitoring data. Enable it for development and performance tuning; consider disabling it in production.

EnableZeroCopy

Gets a value indicating whether to enable zero-copy buffers where supported.

public bool EnableZeroCopy { get; init; }

Property Value

bool

true to enable zero-copy; otherwise, false. Default is true.

Remarks

Zero-copy buffers (CL_MEM_USE_HOST_PTR) allow the device to directly access host memory without explicit copies. This is beneficial for:

  • Integrated GPUs with shared memory (AMD APUs, Intel integrated graphics)
  • Buffers that are infrequently accessed by the device
  • Write-only or read-only patterns

Not recommended for discrete GPUs, where PCIe bandwidth limits performance.

LargeBufferThreshold

Gets the size threshold for large memory buffers (in bytes).

public int LargeBufferThreshold { get; init; }

Property Value

int

The large buffer threshold. Must be greater than MediumBufferThreshold. Default is 1 MB (1048576 bytes).

Remarks

Buffers between medium and large thresholds use the large buffer pool. Buffers larger than this threshold are not pooled to avoid excessive memory consumption. Typical range: 512 KB - 16 MB.
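Taken together, the three size thresholds partition allocations into pool tiers. The sketch below shows one way to express that mapping, using the documented defaults (4 KB, 64 KB, 1 MB). SelectPool is a hypothetical helper, and the handling of sizes exactly equal to a threshold is an assumption.

```csharp
using System;

// Hypothetical helper, not part of DotCompute: maps a requested size to a pool tier.
// Assumption: a size equal to the small threshold counts as medium, and a size
// equal to the medium or large threshold stays in the lower of the two tiers.
static string SelectPool(long sizeInBytes, int smallThreshold, int mediumThreshold, int largeThreshold)
{
    if (sizeInBytes < smallThreshold) return "small";
    if (sizeInBytes <= mediumThreshold) return "medium";
    if (sizeInBytes <= largeThreshold) return "large";
    return "unpooled"; // larger than the large threshold: allocated directly
}

const int small = 4 * 1024;     // documented default: 4 KB
const int medium = 64 * 1024;   // documented default: 64 KB
const int large = 1024 * 1024;  // documented default: 1 MB

Console.WriteLine(SelectPool(1024, small, medium, large));            // small
Console.WriteLine(SelectPool(16 * 1024, small, medium, large));       // medium
Console.WriteLine(SelectPool(256 * 1024, small, medium, large));      // large
Console.WriteLine(SelectPool(4 * 1024 * 1024, small, medium, large)); // unpooled
```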

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than or equal to MediumBufferThreshold.

LargeBuffers

Creates a memory configuration optimized for large buffer workloads.

public static MemoryConfiguration LargeBuffers { get; }

Property Value

MemoryConfiguration

A new MemoryConfiguration instance optimized for large buffers.

Remarks

This configuration:

  • Uses higher buffer size thresholds
  • Larger maximum pool size (2 GB)
  • Longer idle timeouts to keep buffers warm
  • Aggressive async transfer threshold
  • Suitable for workloads with many large allocations (ML training, simulation)

LowLatency

Creates a memory configuration optimized for low-latency workloads.

public static MemoryConfiguration LowLatency { get; }

Property Value

MemoryConfiguration

A new MemoryConfiguration instance optimized for low latency.

Remarks

This configuration:

  • Aggressive pooling with warm caches
  • Pinned memory for fast transfers
  • Small async threshold for maximum overlap
  • Zero-copy for integrated GPUs
  • Suitable for interactive applications and streaming workloads

MaximumPoolSize

Gets the maximum total size of pooled memory (in bytes).

public long MaximumPoolSize { get; init; }

Property Value

long

The maximum pool size. Must be at least 1 MB (1048576 bytes). Default is 1 GB (1073741824 bytes).

Remarks

This limits the total amount of memory that can be pooled to prevent over-commitment of device memory. When this limit is reached, older buffers are released to make room for new allocations. Typical values: 25-50% of available device memory.
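The 25-50% guideline translates into a simple calculation once the device memory size is known. In this sketch, SuggestPoolSize is a hypothetical helper and the 8 GB device is an example value. Querying the real device memory size is outside the scope of this page.

```csharp
using System;

// Hypothetical helper, not part of DotCompute: derives a pool size from the
// device memory size and a fraction in the range (0, 1].
static long SuggestPoolSize(long deviceMemoryBytes, double fraction)
{
    if (deviceMemoryBytes <= 0)
        throw new ArgumentOutOfRangeException(nameof(deviceMemoryBytes));
    if (fraction <= 0.0 || fraction > 1.0)
        throw new ArgumentOutOfRangeException(nameof(fraction));

    // Never go below 1 MB, the smallest value this property accepts.
    const long oneMegabyte = 1024L * 1024L;
    long budget = (long)(deviceMemoryBytes * fraction);
    return Math.Max(budget, oneMegabyte);
}

long deviceMemory = 8L * 1024 * 1024 * 1024; // example: 8 GB device

Console.WriteLine(SuggestPoolSize(deviceMemory, 0.25)); // 2147483648 (2 GB)
Console.WriteLine(SuggestPoolSize(deviceMemory, 0.50)); // 4294967296 (4 GB)
```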

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than 1 MB.

MediumBufferThreshold

Gets the size threshold for medium memory buffers (in bytes).

public int MediumBufferThreshold { get; init; }

Property Value

int

The medium buffer threshold. Must be greater than SmallBufferThreshold. Default is 64 KB (65536 bytes).

Remarks

Buffers between small and medium thresholds use the medium buffer pool. Common for intermediate computations and temporary storage. Typical range: 32 KB - 1 MB.

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than or equal to SmallBufferThreshold.

SmallBufferThreshold

Gets the size threshold (in bytes) below which buffers are treated as small.

public int SmallBufferThreshold { get; init; }

Property Value

int

The small buffer threshold. Must be at least 256 bytes. Default is 4 KB (4096 bytes).

Remarks

Buffers smaller than this size are allocated from the small buffer pool. Common for kernel parameters, indices, and small working sets. Typical range: 1 KB - 64 KB.

Exceptions

ArgumentOutOfRangeException

Thrown when the value is less than 256.

UsePinnedMemory

Gets a value indicating whether to use pinned (page-locked) host memory.

public bool UsePinnedMemory { get; init; }

Property Value

bool

true to use pinned memory; otherwise, false. Default is true.

Remarks

Pinned memory (CL_MEM_ALLOC_HOST_PTR) is page-locked and can be transferred to/from the device faster than regular memory. However, it's a limited resource and excessive pinning can degrade system performance. Use for frequently transferred buffers in performance-critical paths.