Examples
Practical examples demonstrating DotCompute features and patterns.
Getting Started
New to DotCompute? Start with these examples:
- Working Reference - Tested, working code patterns for v0.5.0
- Vector Addition - Classic "Hello World" for GPU computing
- Matrix Multiplication - 2D grid operations
Available Examples
Basic Operations
- Vector Operations
  - Vector addition (element-wise)
  - Scalar multiplication
  - Dot product (reduction)
  - Vector normalization
Matrix Operations
- Matrix Operations
  - Matrix multiplication
  - Matrix transpose
  - Matrix inversion
  - Eigenvalues and eigenvectors
Image Processing
- Image Processing
  - Gaussian blur
  - Edge detection (Sobel, Canny)
  - Color space conversion
  - Image resizing
Pipeline Patterns
- Multi-Kernel Pipelines
  - Chaining multiple kernels
  - Data dependencies
  - Pipeline optimization
  - Asynchronous execution
Quick Reference
Basic Kernel Template
using System;
using DotCompute;

[Kernel]
public static void MyKernel(
    ReadOnlySpan<float> input,
    Span<float> output)
{
    // One thread per element.
    int idx = Kernel.ThreadId.X;

    // Guard: threads past the end of the buffer do nothing.
    if (idx < output.Length)
    {
        output[idx] = input[idx] * 2.0f;
    }
}
Execution Pattern
using DotCompute;
using DotCompute.Runtime;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Build a generic host and register the DotCompute runtime services.
var host = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services.AddDotComputeRuntime();
    })
    .Build();

// Resolve the orchestrator used to launch kernels.
var orchestrator = host.Services.GetRequiredService<IComputeOrchestrator>();

var input = new float[] { 1, 2, 3, 4, 5 };
var output = new float[5];

// Run the kernel by name, passing its arguments in declaration order.
await orchestrator.ExecuteKernelAsync(
    "MyKernel",
    new object[] { input, output });
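One way to sanity-check the result of the call above is to compare `output` against a plain C# reference computed on the host. The sketch below is an illustration only and not part of DotCompute: `ReferenceCheck`, `DoubleEach`, and `AllClose` are hypothetical helper names, and the expected values assume `MyKernel` is the doubling kernel from the Basic Kernel Template. The `output` array is filled by hand here to stand in for the kernel result, so the snippet runs without a device.

```csharp
using System;

var input = new float[] { 1, 2, 3, 4, 5 };

// In a real run this array would be filled by ExecuteKernelAsync;
// it is hard-coded here so the sketch runs on its own.
var output = new float[] { 2, 4, 6, 8, 10 };

var expected = ReferenceCheck.DoubleEach(input);
Console.WriteLine(ReferenceCheck.AllClose(expected, output)
    ? "kernel output matches the CPU reference"
    : "kernel output differs from the CPU reference");

// Hypothetical helpers (not part of DotCompute).
public static class ReferenceCheck
{
    // CPU reference for the doubling kernel: result[i] = input[i] * 2.
    public static float[] DoubleEach(ReadOnlySpan<float> input)
    {
        var result = new float[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            result[i] = input[i] * 2.0f;
        }
        return result;
    }

    // Element-wise comparison with an absolute tolerance.
    public static bool AllClose(
        ReadOnlySpan<float> a, ReadOnlySpan<float> b, float tolerance = 1e-5f)
    {
        if (a.Length != b.Length) return false;
        for (int i = 0; i < a.Length; i++)
        {
            if (MathF.Abs(a[i] - b[i]) > tolerance) return false;
        }
        return true;
    }
}
```

A tolerance-based comparison is used instead of exact equality because device and host floating-point results can differ in the last bits for less trivial kernels.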
Common Patterns
Element-wise operation:
int idx = Kernel.ThreadId.X;
if (idx < length)
{
    output[idx] = operation(input[idx]);
}
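The bounds check matters because a launch typically rounds the thread count up to a whole number of blocks, so some threads have no element to process. The CPU-only sketch below emulates such a launch with an ordinary loop. `LaunchEmulator`, `BlocksFor`, and `ElementWise` are hypothetical names, and the block size of 256 is an assumed value for illustration, not a DotCompute default.

```csharp
using System;

var input = new float[1000];
for (int i = 0; i < input.Length; i++) input[i] = i;
var output = new float[input.Length];

const int blockSize = 256; // assumed block size, for illustration only
int blocks = LaunchEmulator.BlocksFor(input.Length, blockSize);
int launched = LaunchEmulator.ElementWise(
    input, output, blocks * blockSize, x => x * 2.0f);

// prints "blocks = 4, threads launched = 1024, elements = 1000"
Console.WriteLine(
    $"blocks = {blocks}, threads launched = {launched}, elements = {input.Length}");

// Hypothetical helpers (not part of DotCompute).
public static class LaunchEmulator
{
    // Number of blocks needed to cover `length` elements (ceiling division).
    public static int BlocksFor(int length, int blockSize) =>
        (length + blockSize - 1) / blockSize;

    // Runs the element-wise pattern once per emulated thread index.
    // Returns the number of emulated threads, including idle ones.
    public static int ElementWise(
        ReadOnlySpan<float> input,
        Span<float> output,
        int totalThreads,
        Func<float, float> operation)
    {
        for (int idx = 0; idx < totalThreads; idx++)
        {
            // Same guard as the kernel: surplus threads do nothing.
            if (idx < output.Length)
            {
                output[idx] = operation(input[idx]);
            }
        }
        return totalThreads;
    }
}
```

Here 24 of the 1024 emulated threads fall past the end of the buffer; without the guard they would index out of range.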
Reduction:
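int idx = Kernel.ThreadId.X;
float sum = 0.0f;

// Each thread accumulates a private partial sum over a strided slice.
// A stride of Kernel.BlockDim.X visits every element exactly once only
// when a single block is launched.
for (int i = idx; i < length; i += Kernel.BlockDim.X)
{
    sum += input[i];
}

// Combine the partial sums; output[0] must be zero before the launch.
atomicAdd(ref output[0], sum);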
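A strided reduction counts every element exactly once only when the loop stride equals the number of threads taking part. The CPU-only sketch below emulates the pattern with ordinary loops to make that visible. `ReductionEmulator` and `StridedSum` are hypothetical names, and the final loop over the partial sums stands in for the atomic add.

```csharp
using System;

var data = new float[1000];
for (int i = 0; i < data.Length; i++) data[i] = 1.0f;

// Stride equal to the thread count: every element is visited once.
Console.WriteLine(ReductionEmulator.StridedSum(data, threads: 256, stride: 256)); // prints 1000

// Stride smaller than the thread count: some elements are visited twice.
Console.WriteLine(ReductionEmulator.StridedSum(data, threads: 256, stride: 128)); // prints 1872

// Hypothetical helper (not part of DotCompute).
public static class ReductionEmulator
{
    public static float StridedSum(ReadOnlySpan<float> input, int threads, int stride)
    {
        var partial = new float[threads];

        // One iteration per emulated thread, each with a private partial sum.
        for (int idx = 0; idx < threads; idx++)
        {
            float sum = 0.0f;
            for (int i = idx; i < input.Length; i += stride)
            {
                sum += input[i];
            }
            partial[idx] = sum;
        }

        // Stand-in for the atomic add: fold the partial sums into one value.
        float total = 0.0f;
        foreach (float p in partial) total += p;
        return total;
    }
}
```

The second call over-counts because threads 128 to 255 revisit elements already covered by threads 0 to 127, which is the same failure mode as launching more threads than the stride accounts for.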
2D Grid:
int x = Kernel.ThreadId.X;
int y = Kernel.ThreadId.Y;
if (x < width && y < height)
{
    int idx = y * width + x;
    output[idx] = input[idx];
}
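The 2D pattern relies on row-major addressing: element (x, y) lives at `y * width + x`, and the mapping can be inverted with a remainder and a division. The CPU-only sketch below prints the linear index for each cell of a small grid and converts one index back to coordinates. `Grid2D`, `ToLinear`, and `ToCoords` are hypothetical names used only for illustration.

```csharp
using System;

const int width = 4;
const int height = 3;

// Walk the grid row by row and print each cell's linear index.
for (int y = 0; y < height; y++)
{
    for (int x = 0; x < width; x++)
    {
        Console.Write($"{Grid2D.ToLinear(x, y, width),3}");
    }
    Console.WriteLine();
}

var (cx, cy) = Grid2D.ToCoords(6, width);
Console.WriteLine($"index 6 -> x = {cx}, y = {cy}"); // prints "index 6 -> x = 2, y = 1"

// Hypothetical helpers (not part of DotCompute).
public static class Grid2D
{
    // Row-major: rows are stored one after another.
    public static int ToLinear(int x, int y, int width) => y * width + x;

    // Inverse mapping: the remainder is the column, the quotient is the row.
    public static (int X, int Y) ToCoords(int idx, int width) =>
        (idx % width, idx / width);
}
```

Consecutive `x` values map to consecutive indices, so threads that differ only in `x` touch adjacent memory.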
Performance Expectations
Vector Operations (1M elements)
| Operation | CPU (SIMD) | CUDA RTX 2000 Ada |
|---|---|---|
| Addition | 0.8ms | 0.04ms |
| Multiplication | 0.8ms | 0.04ms |
| Dot Product | 1.2ms | 0.05ms |
Matrix Operations
| Operation | Size | CPU | CUDA |
|---|---|---|---|
| MatMul | 1024x1024 | 45ms | 2.1ms |
| Transpose | 4096x4096 | 18ms | 0.9ms |
Image Processing
| Operation | Resolution | CPU | CUDA |
|---|---|---|---|
| Gaussian Blur (5x5) | 1920x1080 | 28ms | 1.8ms |
| Sobel Edge | 1920x1080 | 35ms | 2.2ms |
Note: Performance measured on:
- CPU: AMD Ryzen with AVX2 SIMD
- CUDA: NVIDIA RTX 2000 Ada (CC 8.9)
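The ratio of CPU time to CUDA time gives the speedup implied by each row. The snippet below computes it for a selection of the figures in the tables above. `Timing` and `Speedup` are hypothetical names, and the numbers are copied from the tables rather than re-measured.

```csharp
using System;

// (operation, CPU ms, CUDA ms), copied from the tables above.
var rows = new (string Name, double CpuMs, double CudaMs)[]
{
    ("Vector addition (1M)", 0.8, 0.04),
    ("Dot product (1M)", 1.2, 0.05),
    ("MatMul 1024x1024", 45.0, 2.1),
    ("Transpose 4096x4096", 18.0, 0.9),
    ("Gaussian blur 1920x1080", 28.0, 1.8),
    ("Sobel edge 1920x1080", 35.0, 2.2),
};

foreach (var row in rows)
{
    double speedup = Timing.Speedup(row.CpuMs, row.CudaMs);
    Console.WriteLine($"{row.Name,-26} {speedup:F1}x");
}

// Hypothetical helper (not part of DotCompute).
public static class Timing
{
    // Speedup = CPU time / GPU time, both in the same unit.
    public static double Speedup(double cpuMs, double cudaMs)
    {
        if (cudaMs <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(cudaMs));
        }
        return cpuMs / cudaMs;
    }
}
```

For the listed figures the ratios fall between roughly 16x and 24x.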
Related Documentation
- Getting Started - Installation and setup
- Kernel Development Guide - Writing efficient kernels
- Performance Tuning - Optimization techniques
- Debugging Guide - Troubleshooting
- Learning Paths - Structured learning by experience level