Interface ITimingProvider

Namespace
DotCompute.Abstractions.Timing
Assembly
DotCompute.Abstractions.dll

Provides GPU-native timing capabilities for high-precision temporal measurements.

public interface ITimingProvider

Remarks

The timing provider enables nanosecond-precision timestamp generation directly on GPU hardware, eliminating CPU-GPU round-trip latency. This is critical for applications requiring precise temporal ordering such as physics simulations, real-time systems, and distributed GPU computing.

Platform Support:

  • CUDA (CC 6.0+): 1ns resolution via the %globaltimer register
  • CUDA (CC < 6.0): 1μs resolution via CUDA events
  • OpenCL: 1μs resolution via clock() built-in
  • CPU: ~100ns resolution via Stopwatch

Usage Example:

var timingProvider = accelerator.GetTimingProvider();
if (timingProvider != null)
{
    var timestamp = await timingProvider.GetGpuTimestampAsync();
    Console.WriteLine($"GPU time: {timestamp}ns");
}
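
A related measurement pattern, sketched below on the assumption that the provider is available, brackets a region of interest with two timestamp queries (DoWork() is a hypothetical placeholder for any workload):

// Measure an interval with two timestamp queries.
var t0 = await timingProvider.GetGpuTimestampAsync();
DoWork();                                              // hypothetical workload
var t1 = await timingProvider.GetGpuTimestampAsync();
long elapsedNs = t1 - t0;                              // timestamps are monotonic, so this is non-negative
Console.WriteLine($"Elapsed: {elapsedNs}ns (resolution: {timingProvider.GetTimerResolutionNanos()}ns)");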

Methods

CalibrateAsync(int, CancellationToken)

Calibrates the GPU clock against the CPU clock to enable accurate time conversions.

Task<ClockCalibration> CalibrateAsync(int sampleCount = 100, CancellationToken ct = default)

Parameters

sampleCount int

Number of CPU-GPU timestamp pairs to collect for calibration (default: 100). Higher values improve accuracy but increase calibration time.

ct CancellationToken

Cancellation token to cancel the async operation.

Returns

Task<ClockCalibration>

A task representing the async operation, containing calibration data including offset, drift rate, and error bounds for converting between CPU and GPU time domains.

Remarks

Clock calibration performs linear regression on sampleCount paired CPU-GPU timestamps to compute:

  • Offset: GPU_time = CPU_time + offset
  • Drift: Clock frequency difference (parts per million)
  • Error Bounds: ±uncertainty range from regression residuals

Performance:

  • 100 samples: ~10ms calibration time
  • Typical drift: 50-200 PPM (180-720ms/hour)
  • Recommended recalibration interval: 5-10 minutes

Usage:

var calibration = await timingProvider.CalibrateAsync(sampleCount: 100);
long gpuTime = await timingProvider.GetGpuTimestampAsync();
long cpuTime = calibration.GpuToCpuTime(gpuTime);
var (min, max) = calibration.GetUncertaintyRange(gpuTime);
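
The offset and drift above amount to fitting GPU_time ≈ slope · CPU_time + offset by least squares. The following host-side sketch illustrates that fit on hypothetical cpuNs/gpuNs sample arrays; it is not the library's internal implementation:

static (double offsetNs, double driftPpm) FitClockModel(long[] cpuNs, long[] gpuNs)
{
    // Means of the paired samples (arrays are assumed to have equal length).
    double meanCpu = 0, meanGpu = 0;
    for (int i = 0; i < cpuNs.Length; i++) { meanCpu += cpuNs[i]; meanGpu += gpuNs[i]; }
    meanCpu /= cpuNs.Length;
    meanGpu /= gpuNs.Length;

    // Least-squares slope and intercept: GPU_time = slope * CPU_time + offset.
    double cov = 0, varCpu = 0;
    for (int i = 0; i < cpuNs.Length; i++)
    {
        cov += (cpuNs[i] - meanCpu) * (gpuNs[i] - meanGpu);
        varCpu += (cpuNs[i] - meanCpu) * (cpuNs[i] - meanCpu);
    }
    double slope = cov / varCpu;
    double offsetNs = meanGpu - slope * meanCpu;
    double driftPpm = (slope - 1.0) * 1e6;   // frequency difference in parts per million
    return (offsetNs, driftPpm);
}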

Exceptions

ArgumentOutOfRangeException

Thrown when sampleCount is less than 10 (insufficient for calibration).

OperationCanceledException

Thrown when the cancellation token is triggered.

EnableTimestampInjection(bool)

Enables automatic timestamp injection at kernel entry points.

void EnableTimestampInjection(bool enable = true)

Parameters

enable bool

True to enable injection, false to disable.

Remarks

When enabled, kernels automatically record a timestamp in parameter slot 0 before executing user code. This eliminates manual timestamp management in kernel code.

Kernel Signature Change:

// Before injection:
__global__ void MyKernel(float* input, float* output);

// After injection (parameter 0 auto-injected):
__global__ void MyKernel(long* timestamps, float* input, float* output);

Overhead: <20ns per kernel launch (timestamp write by thread 0).

Note: Timestamp injection requires kernel recompilation. Existing compiled kernels will not be affected until next compilation.
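
In C#, enabling injection is a single call; the sketch below only restates the recompilation caveat from the note above:

timingProvider.EnableTimestampInjection(true);
// Kernels compiled from this point on receive the long* timestamps parameter in slot 0;
// kernels compiled earlier keep their original signature until they are recompiled.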

GetGpuClockFrequency()

Gets the GPU clock frequency in Hertz (cycles per second).

long GetGpuClockFrequency()

Returns

long

The GPU clock frequency in Hz. Typical values:

  • CUDA: 1,000,000,000 Hz (1 GHz) for nanosecond timers
  • CUDA Events: 1,000,000 Hz (1 MHz) for microsecond precision
  • OpenCL: Platform-dependent

Remarks

The clock frequency determines timer resolution. A 1 GHz clock provides 1ns resolution.
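
As an illustration, the tick duration implied by the reported frequency can be derived directly (a minimal sketch, assuming the provider was obtained as shown earlier):

long freqHz = timingProvider.GetGpuClockFrequency();
double tickNs = 1e9 / freqHz;   // 1 GHz → 1ns per tick, 1 MHz → 1,000ns per tick
Console.WriteLine($"Clock: {freqHz}Hz, tick: {tickNs}ns");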

GetGpuTimestampAsync(CancellationToken)

Gets the current GPU timestamp in nanoseconds since device initialization.

Task<long> GetGpuTimestampAsync(CancellationToken ct = default)

Parameters

ct CancellationToken

Cancellation token to cancel the async operation.

Returns

Task<long>

A task representing the async operation, containing the GPU timestamp in nanoseconds. The timestamp is monotonically increasing and has device-specific resolution.

Remarks

This method launches a minimal kernel to read the GPU hardware timer. The overhead is typically <10ns on CUDA (CC 6.0+), <100ns with CUDA events, and <1μs on OpenCL and CPU backends.

For batch queries, use GetGpuTimestampsBatchAsync(int, CancellationToken) which amortizes launch overhead across multiple timestamps.

Performance Targets:

  • CUDA (CC 6.0+): <10ns per query
  • CUDA Events: <100ns per query
  • OpenCL/CPU: <1μs per query
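
The ct parameter can bound a query with a timeout; a minimal sketch using a standard CancellationTokenSource:

using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(5));
try
{
    long ts = await timingProvider.GetGpuTimestampAsync(cts.Token);
    Console.WriteLine($"GPU time: {ts}ns");
}
catch (OperationCanceledException)
{
    // The query did not complete within 5ms.
}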

Exceptions

OperationCanceledException

Thrown when the cancellation token is triggered.

InvalidOperationException

Thrown when the device is not in a valid state for timestamp queries.

GetGpuTimestampsBatchAsync(int, CancellationToken)

Gets multiple GPU timestamps in a single batch operation for improved efficiency.

Task<long[]> GetGpuTimestampsBatchAsync(int count, CancellationToken ct = default)

Parameters

count int

Number of timestamps to retrieve (must be positive).

ct CancellationToken

Cancellation token to cancel the async operation.

Returns

Task<long[]>

A task representing the async operation, containing an array of GPU timestamps in nanoseconds. All timestamps are captured within a single kernel launch for minimal skew.

Remarks

Batch queries amortize kernel launch overhead across multiple timestamps; for count ≥ 1000, the amortized cost drops to roughly 1ns per timestamp (see the sketch below).

All timestamps in the batch are captured during the same kernel execution, ensuring minimal temporal skew between samples (typically <100ns).

Performance for count = 1000:

  • Total time: ~1μs (1ns per timestamp amortized)
  • Skew between timestamps: <100ns
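
A minimal sketch of a batch query, again assuming the provider was obtained as shown earlier:

long[] stamps = await timingProvider.GetGpuTimestampsBatchAsync(count: 1000);
long spanNs = stamps[^1] - stamps[0];   // spread across the whole batch
Console.WriteLine($"{stamps.Length} timestamps, spread {spanNs}ns");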

Exceptions

ArgumentOutOfRangeException

Thrown when count is less than or equal to zero.

OperationCanceledException

Thrown when the cancellation token is triggered.

GetTimerResolutionNanos()

Gets the timer resolution in nanoseconds (minimum measurable time interval).

long GetTimerResolutionNanos()

Returns

long

The timer resolution in nanoseconds. Typical values:

  • CUDA (CC 6.0+): 1 ns (%globaltimer)
  • CUDA (CC < 6.0): 1,000 ns (CUDA events)
  • OpenCL: 1,000 ns (clock() built-in)
  • CPU: ~100 ns (Stopwatch)

Remarks

Lower resolution values indicate higher precision. A 1ns resolution means the timer can distinguish events separated by as little as 1 nanosecond.