Namespace DotCompute.Backends.CUDA.Memory

Classes

CudaAsyncMemoryManagerAdapter: Adapter that wraps CudaMemoryManager for async operations. Bridges the CUDA memory manager with the unified memory interface.

CudaContextMemoryManager: CUDA context-specific memory manager that wraps CudaMemoryManager.

CudaMemoryBuffer: Represents a CUDA memory buffer allocated on the GPU device.

CudaMemoryBuffer<T>: Represents a generic CUDA memory buffer allocated on the GPU device.

CudaMemoryManager: High-performance CUDA device memory manager with automatic pooling and unified memory support.

CudaMemoryOrderingProvider: CUDA-specific implementation of memory ordering primitives.

CudaMemoryPoolManager: Manages memory pools for efficient allocation and reuse of CUDA memory. Reduces allocation overhead and memory fragmentation.
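The pooling idea described for CudaMemoryPoolManager can be illustrated with a minimal size-class pool. This is a sketch under stated assumptions, not the DotCompute implementation: the class name SizeClassPool, the power-of-two size classes, and the 256-byte minimum are all hypothetical, and std::malloc stands in for a device allocator such as cudaMalloc so the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical size-class pool (illustration only, not the DotCompute API).
// std::malloc/std::free stand in for device allocation calls.
class SizeClassPool {
public:
    struct Block {
        void* ptr;
        std::size_t capacity;  // rounded-up size class, in bytes
    };

    // Frees only the blocks currently cached in the pool; blocks still held
    // by callers are not tracked in this sketch.
    ~SizeClassPool() {
        for (auto& entry : free_lists_)
            for (void* p : entry.second) std::free(p);
    }

    // Round a request up to the next power of two (assumed minimum: 256 bytes).
    static std::size_t size_class(std::size_t bytes) {
        std::size_t cls = 256;
        while (cls < bytes) cls <<= 1;
        return cls;
    }

    // Reuse a cached block of the matching size class, or allocate a new one.
    Block acquire(std::size_t bytes) {
        const std::size_t cls = size_class(bytes);
        auto& free_list = free_lists_[cls];
        if (!free_list.empty()) {
            void* p = free_list.back();
            free_list.pop_back();
            ++hits_;
            return {p, cls};
        }
        ++misses_;
        void* p = std::malloc(cls);
        assert(p != nullptr);
        return {p, cls};
    }

    // Return a block to its free list instead of releasing it to the system.
    void release(Block block) {
        free_lists_[block.capacity].push_back(block.ptr);
    }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::map<std::size_t, std::vector<void*>> free_lists_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
};
```

Rounding requests to a small set of size classes is what lets a released block satisfy a later request of a similar size, which is how a pool of this kind trades some internal waste for fewer allocator calls and less fragmentation.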

CudaMemoryPrefetcher: Manages memory prefetching for unified memory to optimize data movement. Uses cudaMemPrefetchAsync to proactively move data between host and device.

CudaPinnedMemoryAllocator: Manages pinned (page-locked) host memory for high-bandwidth transfers. Pinned memory provides up to 10x bandwidth improvement (20 GB/s vs 2 GB/s).

CudaRawMemoryBuffer: Raw untyped CUDA memory buffer for byte-level operations.

MemoryPoolStatistics: Statistics for memory pool usage.

OptimizedCudaMemoryPrefetcher: Advanced CUDA memory prefetcher with intelligent pattern recognition:

Predictive prefetching based on access patterns
Multi-level prefetch strategies (L1, L2, global memory)
Adaptive prefetch distance based on bandwidth utilization
NUMA-aware prefetching for multi-GPU systems
Asynchronous prefetch operations with minimal overhead
Cache pollution avoidance with smart eviction policies

Target: 30-50% improvement in memory-bound kernel performance.
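As an illustration of the "predictive prefetching based on access patterns" item above, the sketch below detects a constant stride between accessed offsets and proposes the next few offsets to prefetch. It is one simple way such prediction could work, not a description of how OptimizedCudaMemoryPrefetcher is implemented: the StridePredictor class, its observe method, and the fixed prefetch distance are hypothetical, and the returned offsets would in practice be handed to a call such as cudaMemPrefetchAsync.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stride detector (illustration only, not the DotCompute API).
class StridePredictor {
public:
    // distance: how many future offsets to propose once a stride is confirmed.
    explicit StridePredictor(std::size_t distance) : distance_(distance) {}

    // Record an accessed byte offset and return the offsets worth prefetching.
    // A prediction is made only when the current non-zero stride equals the
    // stride seen on the previous access, i.e. the pattern repeated.
    std::vector<std::int64_t> observe(std::int64_t offset) {
        std::vector<std::int64_t> targets;
        if (has_last_) {
            const std::int64_t stride = offset - last_;
            if (stride != 0 && has_stride_ && stride == stride_) {
                for (std::size_t i = 1; i <= distance_; ++i) {
                    targets.push_back(offset + stride * static_cast<std::int64_t>(i));
                }
            }
            stride_ = stride;
            has_stride_ = true;
        }
        last_ = offset;
        has_last_ = true;
        return targets;
    }

private:
    std::size_t distance_;
    std::int64_t last_ = 0;
    std::int64_t stride_ = 0;
    bool has_last_ = false;
    bool has_stride_ = false;
};
```

Requiring the stride to repeat before predicting keeps irregular accesses from triggering prefetches, which relates to the cache pollution concern in the list. An adaptive variant would adjust the distance at run time rather than fixing it in the constructor.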

PinnedMemoryStatistics: Statistics for pinned memory usage.

PoolSizeStatistics: Statistics for a specific pool size.

PrefetchRequest: Request for batch prefetch operation.

PrefetchStatistics: Statistics for prefetch operations.

PrefetcherConfiguration: Configuration for the memory prefetcher.

SimpleCudaUnifiedMemoryBuffer<T>: Simple CUDA unified memory buffer implementation for the memory adapter. This is a lightweight version that doesn't depend on CudaUnifiedMemoryManagerProduction.

Structs

PrefetcherStatistics: Performance statistics for the prefetcher.

Interfaces

IPinnedMemoryBuffer<T>: Interface for pinned memory buffers.

IPinnedMemoryRegistration: Interface for pinned memory registration.

IPooledMemoryBuffer: Interface for pooled memory buffers.

Enums

CacheLevel: A cache level enumeration.

CudaHostAllocFlags: Flags for pinned memory allocation.

CudaHostRegisterFlags: Flags for host memory registration.

MemoryAccessHint: A memory access hint enumeration.

MemoryAccessType: A memory access type enumeration.

PrefetchPriority: A prefetch priority enumeration.

PrefetchStrategy: A prefetch strategy enumeration.

PrefetchTarget: Target location for prefetch operation.