Architecture Overview

RustKernels is a modular, high-performance GPU kernel library for financial and enterprise workloads. This document describes the system architecture and key design decisions.

System Design

┌─────────────────────────────────────────────────────────────────┐
│                       rustkernels (facade)                       │
│                    Re-exports all domain crates                  │
└─────────────────────────────────────────────────────────────────┘
                                  │
          ┌───────────────────────┼───────────────────────┐
          │                       │                       │
          ▼                       ▼                       ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│ rustkernel-core │   │rustkernel-derive│   │  rustkernel-cli │
│                 │   │                 │   │                 │
│ - Traits        │   │ - #[gpu_kernel] │   │ - CLI tool      │
│ - Registry      │   │ - #[derive(...)]│   │ - Management    │
│ - K2K messaging │   │                 │   │                 │
│ - Enterprise    │   │                 │   │                 │
│   modules       │   │                 │   │                 │
└─────────────────┘   └─────────────────┘   └─────────────────┘
          │
          ├──────────────────────────────────────┐
          │                                      │
          ▼                                      ▼
┌───────────────────────────────────┐   ┌─────────────────┐
│          14 Domain Crates         │   │   rustkernel-   │
│                                   │   │   ecosystem     │
│  graph │ ml │ compliance │ risk  │   │                 │
│  temporal │ banking │ procint    │   │ - Axum REST     │
│  behavioral │ orderbook │ ...   │   │ - Tower         │
│                                   │   │ - Tonic gRPC    │
│  Each implements domain-specific  │   │ - Actix actors  │
│  kernels using core traits        │   │                 │
└───────────────────────────────────┘   └─────────────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    RingKernel 0.4.2 (crates.io)                  │
│          GPU-native persistent actor runtime framework           │
└─────────────────────────────────────────────────────────────────┘

Workspace Structure

The workspace contains 19 crates organized by concern:

Infrastructure Crates

| Crate | Purpose |
| --- | --- |
| rustkernels | Facade crate; re-exports all domains |
| rustkernel-core | Core traits, registry, licensing, K2K coordination, enterprise modules |
| rustkernel-derive | Procedural macros for kernel definition |
| rustkernel-ecosystem | Service integrations (Axum, Tower, Tonic, Actix) |
| rustkernel-cli | Command-line interface for kernel management |

Domain Crates

14 domain-specific crates, each containing kernels for a particular business area:

crates/
├── rustkernel-graph/        # Graph analytics (28 kernels)
├── rustkernel-ml/           # Statistical ML (17 kernels)
├── rustkernel-compliance/   # AML/KYC (11 kernels)
├── rustkernel-temporal/     # Time series (7 kernels)
├── rustkernel-risk/         # Risk analytics (5 kernels)
├── rustkernel-banking/      # Banking (1 kernel)
├── rustkernel-behavioral/   # Behavioral (6 kernels)
├── rustkernel-orderbook/    # Order matching (1 kernel)
├── rustkernel-procint/      # Process intelligence (7 kernels)
├── rustkernel-clearing/     # Clearing/settlement (5 kernels)
├── rustkernel-treasury/     # Treasury (5 kernels)
├── rustkernel-accounting/   # Accounting (9 kernels)
├── rustkernel-payments/     # Payments (2 kernels)
└── rustkernel-audit/        # Audit (2 kernels)

Core Traits

All kernels are built on a set of core traits defined in rustkernel-core:

GpuKernel

The base trait for all kernels:

pub trait GpuKernel: Send + Sync + Debug {
    /// Returns kernel metadata (ID, domain, mode, performance targets)
    fn metadata(&self) -> &KernelMetadata;

    /// Validates kernel configuration
    fn validate(&self) -> Result<()>;

    /// Health check (enterprise)
    fn health_check(&self) -> HealthStatus { HealthStatus::Healthy }

    /// Graceful shutdown
    async fn shutdown(&self) -> Result<()> { Ok(()) }

    /// Hot-reload configuration
    fn refresh_config(&mut self, config: &KernelConfig) -> Result<()> { Ok(()) }
}

BatchKernel

For CPU-orchestrated batch execution:

pub trait BatchKernel<I, O>: GpuKernel {
    /// Execute the kernel with typed input
    async fn execute(&self, input: I) -> Result<O>;

    /// Execute with auth, tenant, and tracing context
    async fn execute_with_context(&self, ctx: &ExecutionContext, input: I) -> Result<O>;

    /// Validate input before execution
    fn validate_input(&self, input: &I) -> Result<()> { Ok(()) }
}

BatchKernelDyn and TypeErasedBatchKernel

For type-erased execution via REST/gRPC:

/// Dynamic dispatch trait — JSON bytes in, JSON bytes out
pub trait BatchKernelDyn: GpuKernel {
    async fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>>;
}

/// Bridges typed BatchKernel<I,O> to BatchKernelDyn via JSON serialization
pub struct TypeErasedBatchKernel<K, I, O> { /* ... */ }

Kernels registered via register_batch_typed() are automatically wrapped in TypeErasedBatchKernel, enabling execution through the ecosystem service layer without compile-time knowledge of input and output types.
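
The bridging idea can be illustrated with a small, dependency-free sketch. It is not the rustkernel-core implementation: it is synchronous, it replaces JSON with a hand-rolled `Wire` trait, and every name in it (`Wire`, `TypedKernel`, `DynKernel`, `Erased`, `Doubler`, `erased_doubler`) is a hypothetical stand-in chosen for illustration.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a serialization layer such as JSON.
trait Wire: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Result<Self, String>;
}

impl Wire for u64 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }

    fn decode(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() != 8 {
            return Err(format!("expected 8 bytes, got {}", bytes.len()));
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(bytes);
        Ok(u64::from_le_bytes(raw))
    }
}

/// Simplified, synchronous analogue of a typed batch kernel.
trait TypedKernel<I, O> {
    fn execute(&self, input: I) -> Result<O, String>;
}

/// Simplified analogue of the dynamic trait: bytes in, bytes out.
trait DynKernel {
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String>;
}

/// Wrapper that hides `I` and `O` behind the byte-oriented interface.
struct Erased<K, I, O> {
    inner: K,
    _types: PhantomData<fn(I) -> O>,
}

impl<K, I, O> Erased<K, I, O> {
    fn new(inner: K) -> Self {
        Erased { inner: inner, _types: PhantomData }
    }
}

impl<K, I, O> DynKernel for Erased<K, I, O>
where
    K: TypedKernel<I, O>,
    I: Wire,
    O: Wire,
{
    fn execute_dyn(&self, input: &[u8]) -> Result<Vec<u8>, String> {
        let typed = I::decode(input)?; // bytes -> typed input
        let output = self.inner.execute(typed)?; // typed execution
        Ok(output.encode()) // typed output -> bytes
    }
}

/// Toy kernel used only for this demonstration.
struct Doubler;

impl TypedKernel<u64, u64> for Doubler {
    fn execute(&self, input: u64) -> Result<u64, String> {
        input.checked_mul(2).ok_or_else(|| "overflow".to_string())
    }
}

/// Callers of the returned box see only `DynKernel`, not `u64`.
fn erased_doubler() -> Box<dyn DynKernel> {
    Box::new(Erased::<Doubler, u64, u64>::new(Doubler))
}
```

A service layer holding a `Box<dyn DynKernel>` can forward a request body to `execute_dyn` without naming the kernel's input or output type, which is the property the registry relies on.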

RingKernelHandler

For GPU-persistent actor execution:

pub trait RingKernelHandler<M, R>: GpuKernel
where
    M: RingMessage,
    R: RingMessage,
{
    /// Handle a message and produce a response
    async fn handle(&self, ctx: &mut RingContext, msg: M) -> Result<R>;

    /// Handle with security context
    async fn handle_secure(&self, ctx: &mut SecureRingContext, msg: M) -> Result<R>;
}

IterativeKernel

For multi-pass algorithms (PageRank, K-Means, etc.):

pub trait IterativeKernel<S, I, O>: GpuKernel {
    /// Create initial state from input
    fn initial_state(&self, input: &I) -> S;

    /// Perform one iteration
    async fn iterate(&self, state: &mut S, input: &I) -> Result<IterationResult<O>>;

    /// Check convergence
    fn converged(&self, state: &S, threshold: f64) -> bool;
}
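
How the three methods might fit together in a driver loop is sketched below. This is an assumption about usage, not code from rustkernel-core: the trait here (`IterativeSketch`) is synchronous, uses associated types instead of generic parameters, and drops the `Result` and `IterationResult` wrappers. `NewtonSqrt`, `SqrtState`, and `run_to_convergence` are hypothetical names.

```rust
/// Simplified, synchronous analogue of the iterative contract.
trait IterativeSketch {
    type State;
    type Input;

    /// Create the initial state from the input.
    fn initial_state(&self, input: &Self::Input) -> Self::State;
    /// Perform one pass, updating the state in place.
    fn iterate(&self, state: &mut Self::State, input: &Self::Input);
    /// Report whether the last pass changed the state by less than `threshold`.
    fn converged(&self, state: &Self::State, threshold: f64) -> bool;
}

/// State for the toy kernel: current estimate and the size of the last update.
struct SqrtState {
    estimate: f64,
    last_delta: f64,
}

/// Toy multi-pass algorithm: Newton's method for square roots.
struct NewtonSqrt;

impl IterativeSketch for NewtonSqrt {
    type State = SqrtState;
    type Input = f64;

    fn initial_state(&self, input: &f64) -> SqrtState {
        SqrtState { estimate: (*input).max(1.0), last_delta: f64::INFINITY }
    }

    fn iterate(&self, state: &mut SqrtState, input: &f64) {
        let next = 0.5 * (state.estimate + *input / state.estimate);
        state.last_delta = (next - state.estimate).abs();
        state.estimate = next;
    }

    fn converged(&self, state: &SqrtState, threshold: f64) -> bool {
        state.last_delta < threshold
    }
}

/// Runs passes until convergence or until `max_iterations` is reached.
/// Returns the final state and the number of passes performed.
fn run_to_convergence<K: IterativeSketch>(
    kernel: &K,
    input: &K::Input,
    threshold: f64,
    max_iterations: usize,
) -> (K::State, usize) {
    let mut state = kernel.initial_state(input);
    let mut passes = 0;
    while passes < max_iterations && !kernel.converged(&state, threshold) {
        kernel.iterate(&mut state, input);
        passes += 1;
    }
    (state, passes)
}
```

The iteration cap matters in practice: with a threshold that can never be met, the loop still terminates after `max_iterations` passes.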

Additional Traits

| Trait | Purpose |
| --- | --- |
| CheckpointableKernel | Save/restore kernel state for recovery |
| DegradableKernel | Graceful degradation under pressure |

Kernel Registration

The KernelRegistry provides three registration methods:

| Method | Use Case |
| --- | --- |
| register_batch_typed(factory) | Kernels with BatchKernel<I, O>; full execution support via REST/gRPC |
| register_batch_metadata_from(factory) | Batch kernels with GpuKernel only; metadata and discovery |
| register_ring_metadata_from(factory) | Ring kernels; metadata only (require Ring runtime for execution) |

Example:

pub fn register_all(registry: &KernelRegistry) -> Result<()> {
    // Full execution support — callable via REST/gRPC
    registry.register_batch_typed(BetweennessCentrality::new)?;

    // Metadata-only — discoverable but not directly executable via REST
    registry.register_batch_metadata_from(GraphDensity::new)?;

    // Ring kernel — requires RingKernel runtime
    registry.register_ring_metadata_from(PageRankRing::new)?;

    Ok(())
}
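
The difference between an executable registration and a metadata-only one can be modeled with a small stand-in registry. The sketch below is an assumption about the general shape, not the KernelRegistry implementation: `RegistrySketch`, its methods, the `demo/...` IDs, and the duplicate-ID check are all hypothetical, and it takes `&mut self` where the real registry is called through a shared reference.

```rust
use std::collections::HashMap;

/// Byte-oriented entry point, standing in for a type-erased batch kernel.
type Executor = Box<dyn Fn(&[u8]) -> Result<Vec<u8>, String>>;

/// Hypothetical registry: every entry is discoverable, only some are executable.
struct RegistrySketch {
    entries: HashMap<String, Option<Executor>>,
}

impl RegistrySketch {
    fn new() -> Self {
        RegistrySketch { entries: HashMap::new() }
    }

    /// Shared insertion path; rejecting duplicate IDs is an assumption of this sketch.
    fn insert(&mut self, id: &str, executor: Option<Executor>) -> Result<(), String> {
        if self.entries.contains_key(id) {
            return Err(format!("kernel '{}' is already registered", id));
        }
        self.entries.insert(id.to_string(), executor);
        Ok(())
    }

    /// Analogue of a typed registration: discoverable and callable.
    fn register_executable(&mut self, id: &str, executor: Executor) -> Result<(), String> {
        self.insert(id, Some(executor))
    }

    /// Analogue of a metadata-only registration: discoverable, not callable.
    fn register_metadata_only(&mut self, id: &str) -> Result<(), String> {
        self.insert(id, None)
    }

    fn is_registered(&self, id: &str) -> bool {
        self.entries.contains_key(id)
    }

    fn execute(&self, id: &str, input: &[u8]) -> Result<Vec<u8>, String> {
        match self.entries.get(id) {
            Some(Some(executor)) => executor(input),
            Some(None) => Err(format!("kernel '{}' is metadata-only", id)),
            None => Err(format!("kernel '{}' is not registered", id)),
        }
    }
}

/// Toy kernel body: returns the input bytes in reverse order.
fn reverse_bytes(input: &[u8]) -> Result<Vec<u8>, String> {
    Ok(input.iter().rev().cloned().collect())
}

/// Builds a registry with one executable and one metadata-only entry.
fn sample_registry() -> Result<RegistrySketch, String> {
    let mut registry = RegistrySketch::new();
    registry.register_executable("demo/reverse", Box::new(reverse_bytes))?;
    registry.register_metadata_only("demo/metadata-only")?;
    Ok(registry)
}
```

Under this model, a discovery endpoint can list both entries, while an execution endpoint returns an error for the metadata-only one.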

Kernel Metadata

Every kernel carries associated metadata:

pub struct KernelMetadata {
    pub id: String,                 // e.g., "graph/pagerank"
    pub mode: KernelMode,           // Batch or Ring
    pub domain: Domain,             // Business domain
    pub description: String,        // Human-readable description
    pub expected_throughput: u64,   // Operations per second
    pub target_latency_us: f64,     // Target latency in microseconds
    pub requires_gpu_native: bool,  // GPU-only or CPU fallback available
    pub version: u32,               // Kernel implementation version
}

K2K (Kernel-to-Kernel) Messaging

K2K messaging provides cross-kernel coordination patterns for multi-stage computations:

| Pattern | Use Case |
| --- | --- |
| IterativeState | Track convergence across iterations |
| ScatterGatherState | Parallel worker coordination |
| FanOutTracker | Broadcast patterns |
| PipelineTracker | Multi-stage processing |

Example: Iterative Coordination

use rustkernel_core::k2k::IterativeState;

let mut state = IterativeState::new(max_iterations);

while !state.converged() {
    let results = execute_iteration(&mut state).await?;
    state.update(results.delta);
}

Domain Crate Structure

Each domain crate follows a consistent structure:

rustkernel-{domain}/
├── Cargo.toml
└── src/
    ├── lib.rs           # Module exports, register_all()
    ├── messages.rs      # Batch kernel input/output types
    ├── ring_messages.rs # Ring message types with #[derive(RingMessage)]
    ├── types.rs         # Common domain types
    └── {feature}.rs     # Kernel implementations

Example: Graph Analytics Crate

rustkernel-graph/
└── src/
    ├── lib.rs
    ├── messages.rs
    ├── ring_messages.rs
    ├── types.rs
    ├── centrality.rs    # PageRank, Betweenness, Closeness, etc.
    ├── community.rs     # Louvain, Label Propagation
    ├── similarity.rs    # Jaccard, Cosine, Adamic-Adar
    ├── metrics.rs       # Density, Clustering Coefficient
    ├── motif.rs         # Triangle counting, k-cliques
    ├── topology.rs      # Connected components, cycles, paths
    └── gnn.rs           # GNN inference, graph attention

Ring Message Type IDs

Each domain has a reserved range for Ring message type IDs, aligned with ringkernel_core::domain::Domain base offsets (0.4.2):

| Domain | Range | RingKernel Domain |
| --- | --- | --- |
| Graph Analytics | 100–199 | GraphAnalytics |
| Statistical ML | 200–299 | StatisticalML |
| Compliance | 300–399 | Compliance |
| Risk Analytics | 400–499 | RiskManagement |
| Temporal Analysis | 500–599 | TimeSeries |
| Order Matching | 600–699 | OrderMatching |
| Clearing | 700–799 | Clearing |
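
Because every listed range is 100 IDs wide and starts on a multiple of 100, a type ID can be mapped to its domain, and back, with integer arithmetic. The sketch below encodes only the ranges listed in the table; `RingDomainSketch`, `base_offset`, `type_id`, and `domain_of` are hypothetical helpers, not functions from ringkernel_core.

```rust
/// Stand-in for the RingKernel domain enum; only the listed domains are covered.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RingDomainSketch {
    GraphAnalytics,
    StatisticalML,
    Compliance,
    RiskManagement,
    TimeSeries,
    OrderMatching,
    Clearing,
}

/// Each listed domain owns 100 consecutive type IDs.
const RANGE_WIDTH: u32 = 100;

/// First type ID of a domain's reserved range, as listed in the table.
fn base_offset(domain: RingDomainSketch) -> u32 {
    match domain {
        RingDomainSketch::GraphAnalytics => 100,
        RingDomainSketch::StatisticalML => 200,
        RingDomainSketch::Compliance => 300,
        RingDomainSketch::RiskManagement => 400,
        RingDomainSketch::TimeSeries => 500,
        RingDomainSketch::OrderMatching => 600,
        RingDomainSketch::Clearing => 700,
    }
}

/// Builds a type ID from a domain and an index local to that domain.
/// Returns `None` when the index would spill into the next domain's range.
fn type_id(domain: RingDomainSketch, local_index: u32) -> Option<u32> {
    if local_index < RANGE_WIDTH {
        Some(base_offset(domain) + local_index)
    } else {
        None
    }
}

/// Recovers the owning domain from a type ID, or `None` if the ID
/// falls outside every listed range.
fn domain_of(id: u32) -> Option<RingDomainSketch> {
    match id / RANGE_WIDTH {
        1 => Some(RingDomainSketch::GraphAnalytics),
        2 => Some(RingDomainSketch::StatisticalML),
        3 => Some(RingDomainSketch::Compliance),
        4 => Some(RingDomainSketch::RiskManagement),
        5 => Some(RingDomainSketch::TimeSeries),
        6 => Some(RingDomainSketch::OrderMatching),
        7 => Some(RingDomainSketch::Clearing),
        _ => None,
    }
}
```

For example, the eighth Compliance message (local index 7) would receive ID 307 under this scheme, and ID 599 maps back to TimeSeries.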

RingKernel 0.4.2 Integration

RustKernels 0.4.0 integrates with RingKernel 0.4.2 through domain conversion and re-exported types and submodules:

Domain Conversion

Bidirectional conversion between RustKernels and RingKernel domain types:

use rustkernel_core::domain::Domain;

let domain = Domain::TemporalAnalysis;
let ring_domain = domain.to_ring_domain();  // → ringkernel_core::domain::Domain::TimeSeries
let back = Domain::from_ring_domain(ring_domain);  // → Domain::TemporalAnalysis

Re-exports from RingKernel

| Type | Description |
| --- | --- |
| ControlBlock | GPU control block for persistent kernel state |
| Backend | Runtime backend selection (CUDA, CPU, WebGPU) |
| KernelStatus | Detailed kernel status information |
| RuntimeMetrics | Runtime performance metrics |
| K2KConfig | Kernel-to-kernel messaging configuration |
| Priority | Message priority levels |

Submodule Re-exports

| Module | Description |
| --- | --- |
| rustkernel_core::checkpoint | Kernel checkpointing and recovery |
| rustkernel_core::dispatcher | Message dispatching |
| rustkernel_core::health | Health checking (circuit breaker, degradation) |
| rustkernel_core::pubsub | Pub/sub messaging patterns |

Licensing System

Enterprise licensing is implemented in rustkernel-core/src/license.rs:

  • DevelopmentLicense: All features enabled (default for local development)
  • ProductionLicense: Domain-based feature gating
  • Validation occurs at kernel registration and activation time

use rustkernel_core::domain::Domain;
use rustkernel_core::license::{LicenseValidator, DevelopmentLicense};

let validator = DevelopmentLicense::new();
assert!(validator.is_domain_licensed(Domain::GraphAnalytics));
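
Domain-based gating can be pictured as a trait with one permissive and one restrictive implementation. The sketch below is a guess at the general shape only. The method name `is_domain_licensed` comes from the example above, while `LicenseSketch`, `DevLicenseSketch`, `ProdLicenseSketch`, `DomainSketch`, `check_registration`, and the `.../example` kernel IDs are hypothetical.

```rust
use std::collections::HashSet;

/// Stand-in for the domain enum; three variants are enough for the demonstration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum DomainSketch {
    GraphAnalytics,
    Compliance,
    RiskAnalytics,
}

/// Simplified analogue of a license validator.
trait LicenseSketch {
    fn is_domain_licensed(&self, domain: DomainSketch) -> bool;
}

/// Development-style license: every domain is enabled.
struct DevLicenseSketch;

impl LicenseSketch for DevLicenseSketch {
    fn is_domain_licensed(&self, _domain: DomainSketch) -> bool {
        true
    }
}

/// Production-style license: only explicitly listed domains are enabled.
struct ProdLicenseSketch {
    licensed: HashSet<DomainSketch>,
}

impl ProdLicenseSketch {
    fn new(domains: &[DomainSketch]) -> Self {
        ProdLicenseSketch { licensed: domains.iter().cloned().collect() }
    }
}

impl LicenseSketch for ProdLicenseSketch {
    fn is_domain_licensed(&self, domain: DomainSketch) -> bool {
        self.licensed.contains(&domain)
    }
}

/// Gate applied when a kernel is registered: refuse kernels whose
/// domain is not covered by the active license.
fn check_registration(
    license: &dyn LicenseSketch,
    kernel_id: &str,
    domain: DomainSketch,
) -> Result<(), String> {
    if license.is_domain_licensed(domain) {
        Ok(())
    } else {
        Err(format!("kernel '{}' requires a license for {:?}", kernel_id, domain))
    }
}
```

Running the same check again at activation time would catch a license that changed between registration and first use.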

Fixed-Point Arithmetic

For GPU-compatible exact financial calculations, Ring messages use fixed-point arithmetic:

// 8 decimal places (standard kernels)
// Round to nearest before casting; a bare `as i64` cast truncates toward zero.
fn to_fixed_point(value: f64) -> i64 { (value * 100_000_000.0).round() as i64 }
fn from_fixed_point(fp: i64) -> f64 { fp as f64 / 100_000_000.0 }

// 18 decimal places (accounting kernels)
const SCALE: i128 = 1_000_000_000_000_000_000;

pub struct FixedPoint128 {
    pub value: i128,
}
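
The motivation is easiest to see with a repeated sum: ten additions of 0.1 in f64 do not produce exactly 1.0, while the same sum over scaled integers does. The sketch below illustrates this and shows one way 18-decimal values might be added and multiplied. The conversion helpers round to the nearest unit, and `sum_float`, `sum_fixed`, `Fixed128Sketch`, and its methods are hypothetical, not the FixedPoint128 API.

```rust
/// Scale for 8 decimal places.
const SCALE_8: f64 = 100_000_000.0;

/// Converts to fixed point, rounding to the nearest representable unit.
fn to_fixed_point(value: f64) -> i64 {
    (value * SCALE_8).round() as i64
}

fn from_fixed_point(fp: i64) -> f64 {
    fp as f64 / SCALE_8
}

/// Adds `amount` to a running total `n` times in floating point.
fn sum_float(amount: f64, n: usize) -> f64 {
    let mut total = 0.0;
    for _ in 0..n {
        total += amount;
    }
    total
}

/// Same sum over scaled integers; integer addition is exact.
fn sum_fixed(amount: f64, n: usize) -> i64 {
    let unit = to_fixed_point(amount);
    let mut total = 0i64;
    for _ in 0..n {
        total += unit;
    }
    total
}

/// Scale for 18 decimal places.
const SCALE_18: i128 = 1_000_000_000_000_000_000;

/// Hypothetical 18-decimal value stored as a scaled i128.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fixed128Sketch {
    value: i128,
}

impl Fixed128Sketch {
    /// Whole units; an i64 times SCALE_18 always fits in an i128.
    fn from_units(whole: i64) -> Self {
        Fixed128Sketch { value: whole as i128 * SCALE_18 }
    }

    /// Addition keeps the scale unchanged; `None` signals overflow.
    fn checked_add(self, other: Self) -> Option<Self> {
        self.value
            .checked_add(other.value)
            .map(|value| Fixed128Sketch { value })
    }

    /// The raw product carries the scale twice, so divide once by SCALE_18.
    /// Division truncates toward zero; `None` signals intermediate overflow.
    fn checked_mul(self, other: Self) -> Option<Self> {
        self.value
            .checked_mul(other.value)
            .map(|wide| Fixed128Sketch { value: wide / SCALE_18 })
    }
}
```

Note that the intermediate product in `checked_mul` can overflow an i128 even when the final result would fit, which is why the sketch returns an `Option` instead of panicking or wrapping.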

Next Steps