Enum AcceleratorFeature
- Namespace: DotCompute.Abstractions.Accelerators
- Assembly: DotCompute.Abstractions.dll
Defines hardware and software features that may be supported by compute accelerators.
[Flags]
public enum AcceleratorFeature
Fields
AtomicOperations = 128
Support for atomic operations on global and shared memory.
Essential for implementing thread-safe data structures and algorithms that require synchronization between threads.
Bfloat16 = 256
Support for Brain Floating Point 16-bit format (bfloat16).
A 16-bit format that maintains the same exponent range as float32, popular in machine learning for its balance of range and precision.
CooperativeGroups = 32
Support for cooperative groups and grid synchronization.
Enables synchronization across multiple thread blocks, allowing more complex parallel algorithms to be implemented.
DoublePrecision = 2
Support for 64-bit floating-point (double-precision) operations.
Essential for scientific computing applications requiring high numerical precision.
DynamicParallelism = 64
Support for dynamic parallelism (nested kernel launches).
Allows kernels to launch other kernels directly from device code, enabling recursive and adaptive algorithms.
Float16 = 1
Support for 16-bit floating-point (half-precision) operations.
Enables faster computation for workloads that don't require full precision, such as certain machine learning inference tasks.
LongInteger = 4
Support for 64-bit integer operations.
Required for applications working with large integer values or pointers on 64-bit systems.
MixedPrecision = 1024
Support for mixed-precision operations within a single kernel.
Allows combining different precision levels in a single computation for optimal performance and accuracy trade-offs.
None = 0
No special features are supported.
SignedByte = 512
Support for signed 8-bit integer operations.
Enables efficient quantized integer operations, commonly used in optimized neural network inference.
TensorCores = 8
Support for Tensor Core operations (NVIDIA) or equivalent matrix acceleration units.
Provides significant acceleration for matrix multiplication and convolution operations, particularly beneficial for deep learning workloads.
UnifiedMemory = 16
Support for unified memory between host and device.
Allows automatic memory migration between CPU and GPU, simplifying memory management at the potential cost of performance.
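Each field above is a distinct power of two, so a combined value can be split back into its individual members. The following is a minimal sketch of that decomposition. The enum is a local stand-in that copies the documented values so the example compiles without DotCompute.Abstractions.dll, and `FeatureListing` and `Expand` are hypothetical helper names, not part of the library.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in mirroring the documented values (assumption: the real type
// lives in DotCompute.Abstractions.Accelerators and is not referenced here).
[Flags]
public enum AcceleratorFeature
{
    None = 0,
    Float16 = 1,
    DoublePrecision = 2,
    LongInteger = 4,
    TensorCores = 8,
    UnifiedMemory = 16,
    CooperativeGroups = 32,
    DynamicParallelism = 64,
    AtomicOperations = 128,
    Bfloat16 = 256,
    SignedByte = 512,
    MixedPrecision = 1024,
}

// Hypothetical helper, for illustration only.
public static class FeatureListing
{
    // Yields every single-bit member that is set in 'features'.
    public static IEnumerable<AcceleratorFeature> Expand(AcceleratorFeature features)
    {
        foreach (AcceleratorFeature flag in Enum.GetValues(typeof(AcceleratorFeature)))
        {
            // Skip None: it has no bits and would match every value.
            if (flag != AcceleratorFeature.None && (features & flag) == flag)
            {
                yield return flag;
            }
        }
    }

    public static void Main()
    {
        // 128 + 2 is AtomicOperations | DoublePrecision.
        var features = (AcceleratorFeature)(128 + 2);
        foreach (var flag in Expand(features))
        {
            Console.WriteLine($"{flag} = {(int)flag}");
        }
    }
}
```

Skipping `None` matters because a zero value passes any `(features & flag) == flag` test.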
Remarks
This enumeration is marked with FlagsAttribute, so multiple features can be combined into a single value. Use bitwise operations (or Enum.HasFlag) to test whether one or more features are supported.
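The checks described in the remarks can be sketched as follows. This is an illustration under stated assumptions, not the library's own API. The enum is a local stand-in copying the documented values so the example is self-contained, `FeatureChecks`, `SupportsAll`, and `SupportsAny` are hypothetical helper names, and the `supported` value is made up rather than queried from a real accelerator.

```csharp
using System;

// Local stand-in mirroring the documented values (assumption: the real type
// lives in DotCompute.Abstractions.Accelerators and is not referenced here).
[Flags]
public enum AcceleratorFeature
{
    None = 0,
    Float16 = 1,
    DoublePrecision = 2,
    LongInteger = 4,
    TensorCores = 8,
    UnifiedMemory = 16,
    CooperativeGroups = 32,
    DynamicParallelism = 64,
    AtomicOperations = 128,
    Bfloat16 = 256,
    SignedByte = 512,
    MixedPrecision = 1024,
}

// Hypothetical helpers, for illustration only.
public static class FeatureChecks
{
    // True only when every bit in 'required' is also set in 'supported'.
    public static bool SupportsAll(AcceleratorFeature supported, AcceleratorFeature required)
        => (supported & required) == required;

    // True when at least one bit in 'candidates' is set in 'supported'.
    public static bool SupportsAny(AcceleratorFeature supported, AcceleratorFeature candidates)
        => (supported & candidates) != AcceleratorFeature.None;

    public static void Main()
    {
        // Made-up feature set; a real value would come from the accelerator.
        var supported = AcceleratorFeature.Float16
                      | AcceleratorFeature.DoublePrecision
                      | AcceleratorFeature.AtomicOperations;

        var needed = AcceleratorFeature.DoublePrecision | AcceleratorFeature.AtomicOperations;

        Console.WriteLine(SupportsAll(supported, needed));   // True
        Console.WriteLine(SupportsAny(supported,
            AcceleratorFeature.Bfloat16 | AcceleratorFeature.TensorCores)); // False
        Console.WriteLine(supported.HasFlag(AcceleratorFeature.Float16));   // True
    }
}
```

`Enum.HasFlag` behaves like `SupportsAll`: when given a combined value it returns true only if every bit is present. An "any of these" test needs the explicit mask comparison shown in `SupportsAny`.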