Lifecycle Management¶

Understanding kernel states and transitions.

Overview¶

Ring kernels follow a structured lifecycle that enables proper resource management, graceful shutdown, and operational control.

States¶

CREATED¶

The kernel is defined but not yet launched.

@ring_kernel(kernel_id="worker")
async def worker(ctx):
    ...

# State: CREATED (defined, but not yet launched in a runtime)

LAUNCHED¶

Resources are allocated, but processing hasn't started.

await runtime.launch("worker")
# State: LAUNCHED
# - Input/output queues created
# - asyncio task not yet running

ACTIVE¶

The kernel is running and processing messages.

await runtime.activate("worker")
# State: ACTIVE
# - asyncio task running
# - ctx.is_active == True
# - Processing messages

DEACTIVATED¶

Processing is paused, but resources remain allocated.

await runtime.deactivate("worker")
# State: DEACTIVATED
# - asyncio task idle (the actor loop waits in ctx.wait_active())
# - ctx.is_active == False
# - Incoming messages still queue up

TERMINATING¶

Shutdown has been requested, waiting for graceful exit.

await runtime.terminate("worker")
# State: TERMINATING
# - ctx.should_terminate == True
# - Waiting for actor loop to exit

TERMINATED¶

The kernel has stopped and resources are released.

# After terminate() completes
# State: TERMINATED
# - asyncio task done
# - Queues cleared
# - Cannot be restarted

State Diagram¶

                    ┌──────────────────┐
                    │     CREATED      │
                    └────────┬─────────┘
                             │ launch()
                    ┌────────▼─────────┐
                    │     LAUNCHED     │
                    └────────┬─────────┘
                             │ activate()
                    ┌────────▼─────────┐
         ┌──────────┤      ACTIVE      │◄─────────┐
         │          └────────┬─────────┘          │
         │                   │ deactivate()       │
         │          ┌────────▼─────────┐          │
         │          │   DEACTIVATED    │──────────┘
         │          └────────┬─────────┘ reactivate()
         │                   │
         │ terminate()       │ terminate()
         │          ┌────────▼─────────┐
         └─────────►│   TERMINATING    │◄─── (from any running state)
                    └────────┬─────────┘
                             │ (graceful exit)
                    ┌────────▼─────────┐
                    │    TERMINATED    │
                    └──────────────────┘

Two-Phase Launch¶

Splitting startup into launch and activate lets you allocate resources for every kernel before any of them begins processing:

Phase 1: Launch¶

# Allocate resources for all kernels
await runtime.launch("kernel_a")
await runtime.launch("kernel_b")
await runtime.launch("kernel_c")

# At this point:
# - All queues created
# - All tasks created (but waiting)
# - Memory allocated
# - Kernels compiled (if needed)

Phase 2: Activate¶

# Start processing together
await runtime.activate("kernel_a")
await runtime.activate("kernel_b")
await runtime.activate("kernel_c")

# Now all kernels are processing

Benefits¶

Coordinated Startup: Activate a group of kernels together once all of them are launched
Validation: Verify that every kernel can launch before activating any of them
Warm-up: Pre-allocate resources and warm caches before messages arrive
Dependency Ordering: Launch dependencies before the kernels that depend on them
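
The validation benefit can be sketched as a small helper. This is a minimal sketch that relies only on the launch(), activate(), and terminate() coroutines shown in this guide. The names start_all and FakeRuntime are hypothetical: FakeRuntime merely records calls so the example runs on its own, and the rollback step assumes that terminate() is accepted for a kernel that has been launched but not activated.

```python
import asyncio


async def start_all(runtime, kernel_ids):
    """Launch every kernel first; activate only if all launches succeed."""
    launched = []
    try:
        # Phase 1: allocate resources one kernel at a time.
        for kernel_id in kernel_ids:
            await runtime.launch(kernel_id)
            launched.append(kernel_id)
    except Exception:
        # A launch failed: roll back the kernels that did launch.
        # Assumption: terminate() is valid for a LAUNCHED kernel.
        for kernel_id in launched:
            await runtime.terminate(kernel_id)
        raise
    # Phase 2: issue all activations concurrently.
    await asyncio.gather(*(runtime.activate(k) for k in kernel_ids))


class FakeRuntime:
    """Hypothetical stand-in that records calls; not the real runtime."""

    def __init__(self, broken=()):
        self.broken = set(broken)
        self.calls = []

    async def launch(self, kernel_id):
        if kernel_id in self.broken:
            raise RuntimeError(f"cannot launch {kernel_id}")
        self.calls.append(("launch", kernel_id))

    async def activate(self, kernel_id):
        self.calls.append(("activate", kernel_id))

    async def terminate(self, kernel_id):
        self.calls.append(("terminate", kernel_id))
```

With a real runtime the call would be await start_all(runtime, ["kernel_a", "kernel_b", "kernel_c"]). Because no kernel is activated until every launch has succeeded, a failure in phase 1 never leaves a partially running group.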

Actor Loop Pattern¶

The standard actor loop handles all lifecycle states:

import asyncio  # needed for asyncio.TimeoutError below

@ring_kernel(kernel_id="worker")
async def worker(ctx):
    # Optional: One-time initialization
    print(f"[{ctx.kernel_id}] Initializing...")

    # Main processing loop
    while not ctx.should_terminate:
        # Handle deactivation
        if not ctx.is_active:
            print(f"[{ctx.kernel_id}] Paused, waiting...")
            await ctx.wait_active()
            print(f"[{ctx.kernel_id}] Resumed!")
            continue

        try:
            # Process messages with timeout
            msg = await ctx.receive(timeout=0.1)
            result = process(msg)  # process() is your own message handler
            await ctx.send(result)

        except asyncio.TimeoutError:
            # No message - loop back to check termination
            continue

    # Optional: Cleanup
    print(f"[{ctx.kernel_id}] Cleaning up...")

Key Points¶

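while not ctx.should_terminate: Rechecks for a shutdown request on every iteration
if not ctx.is_active: Detects that the kernel has been deactivated
await ctx.wait_active(): Blocks until the kernel is reactivated
timeout=0.1: Bounds each receive() wait so the loop returns to the should_terminate check promptly
except asyncio.TimeoutError: A timeout only means the queue was empty; it is not an error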
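
To make the loop's control flow concrete, here is a minimal sketch of how the context members could be modelled with plain asyncio primitives. ToyContext is a hypothetical stand-in, not the library's context class; it only mirrors the should_terminate, is_active, wait_active(), and receive(timeout=...) names used above.

```python
import asyncio


class ToyContext:
    """Hypothetical model of the ctx object; not the library's class."""

    def __init__(self):
        self.inbox = asyncio.Queue()
        self.should_terminate = False
        self._active = asyncio.Event()

    @property
    def is_active(self):
        return self._active.is_set()

    async def wait_active(self):
        # Blocks until the event is set (the toy equivalent of activation).
        await self._active.wait()

    async def receive(self, timeout):
        # Raises asyncio.TimeoutError if nothing arrives in time.
        return await asyncio.wait_for(self.inbox.get(), timeout)


async def worker(ctx, processed):
    while not ctx.should_terminate:
        if not ctx.is_active:
            await ctx.wait_active()
            continue
        try:
            processed.append(await ctx.receive(timeout=0.05))
        except asyncio.TimeoutError:
            continue  # idle: go back and recheck should_terminate


async def demo():
    ctx, processed = ToyContext(), []
    task = asyncio.create_task(worker(ctx, processed))
    await ctx.inbox.put("queued while paused")
    await asyncio.sleep(0.1)
    seen_while_paused = list(processed)  # snapshot taken before activation
    ctx._active.set()                    # activate
    await asyncio.sleep(0.1)
    ctx.should_terminate = True          # request shutdown
    await asyncio.wait_for(task, timeout=1.0)
    return seen_while_paused, processed
```

Running asyncio.run(demo()) shows the message waiting in the queue until activation and the loop exiting shortly after should_terminate is set, because each receive() gives up after 0.05 seconds. In this toy model a paused worker would not notice should_terminate until it is reactivated; whether the real runtime wakes wait_active() on termination is not covered here.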

Pause and Resume¶

Deactivation pauses processing without losing state:

async with RingKernelRuntime() as runtime:
    await runtime.launch("worker")
    await runtime.activate("worker")

    # Process some messages
    for msg in batch_1:
        await runtime.send("worker", msg)
    for _ in range(len(batch_1)):
        await runtime.receive("worker")

    # Pause for maintenance
    await runtime.deactivate("worker")

    # Messages queue up but aren't processed
    for msg in batch_2:
        await runtime.send("worker", msg)

    # Resume processing
    await runtime.reactivate("worker")

    # Queued messages now processed
    for _ in range(len(batch_2)):
        await runtime.receive("worker")

Use Cases for Deactivation¶

Maintenance: Apply configuration changes
Load Balancing: Temporarily reduce processing
Debugging: Pause to inspect state
Batching: Accumulate messages, then process

Graceful Shutdown¶

Termination allows in-flight work to complete:

# Request termination with timeout
await runtime.terminate("worker", timeout=5.0)

# What happens:
# 1. ctx.should_terminate becomes True
# 2. The actor loop exits at its next should_terminate check
# 3. The runtime waits up to 5 seconds for the loop to exit
# 4. If the timeout expires first, the runtime force-stops the kernel
# 5. State becomes TERMINATED
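
The sequence above can be sketched with standard asyncio calls. This is an illustration of the request, wait, then force-stop pattern, not the runtime's actual implementation; terminate_with_timeout, cooperative, and stubborn are hypothetical names, and task cancellation stands in for whatever the runtime does to force a stop.

```python
import asyncio


async def terminate_with_timeout(task, stop_flag, timeout):
    """Request a cooperative stop, then cancel if the task overruns."""
    stop_flag.set()                      # step 1: signal shutdown
    done, _pending = await asyncio.wait({task}, timeout=timeout)
    if done:
        return "graceful"                # the loop noticed the flag and exited
    task.cancel()                        # step 4: force stop
    try:
        await task
    except asyncio.CancelledError:
        pass
    return "forced"


async def cooperative(stop_flag):
    while not stop_flag.is_set():        # checks the flag every 10 ms
        await asyncio.sleep(0.01)


async def stubborn(stop_flag):
    await asyncio.sleep(3600)            # never checks the flag


async def demo():
    results = []
    for loop_fn in (cooperative, stubborn):
        flag = asyncio.Event()
        task = asyncio.create_task(loop_fn(flag))
        await asyncio.sleep(0)           # let the task start
        results.append(await terminate_with_timeout(task, flag, timeout=0.2))
    return results
```

The cooperative loop exits within the timeout, while the loop that never checks the flag has to be cancelled. This is why a short receive() timeout in the actor loop matters: it determines how quickly the loop reaches its next termination check.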

Terminating All Kernels¶

# Terminate all at once
await runtime.terminate_all(timeout=10.0)

Using Context Manager¶

The recommended approach for automatic cleanup:

async with RingKernelRuntime() as runtime:
    await runtime.launch("worker")
    await runtime.activate("worker")

    # Use kernel...

# All kernels are terminated automatically when the block exits
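
How an async context manager can provide this cleanup is sketched below. ToyRuntime is a hypothetical class written only for illustration; the sketch assumes the real runtime does something comparable in its exit handler, which this guide does not spell out.

```python
import asyncio


class ToyRuntime:
    """Hypothetical runtime showing cleanup on exit; not the real class."""

    def __init__(self):
        self.running = set()
        self.terminated = []

    async def launch(self, kernel_id):
        self.running.add(kernel_id)

    async def terminate_all(self, timeout=10.0):
        # The timeout is accepted for symmetry but ignored in this toy.
        for kernel_id in sorted(self.running):
            self.terminated.append(kernel_id)
        self.running.clear()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        await self.terminate_all()
        return False  # do not swallow the exception


async def demo():
    runtime = ToyRuntime()
    try:
        async with runtime:
            await runtime.launch("worker")
            raise ValueError("failure inside the block")
    except ValueError:
        pass
    return runtime.terminated
```

Because the exit handler runs even when the body raises, the kernel launched inside the block is still terminated, which is the main reason to prefer the context manager over manual terminate() calls.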

Error Handling¶

Invalid State Transitions¶

from pydotcompute.exceptions import KernelStateError

try:
    await runtime.activate("worker")  # Not launched!
except KernelStateError as e:
    print(f"State: {e.current_state}, need: {e.expected_states}")

Valid Transitions¶

From	Action	To	Valid
CREATED	launch()	LAUNCHED	✓
CREATED	activate()	-	✗
LAUNCHED	activate()	ACTIVE	✓
LAUNCHED	deactivate()	-	✗
ACTIVE	deactivate()	DEACTIVATED	✓
ACTIVE	launch()	-	✗
DEACTIVATED	reactivate()	ACTIVE	✓
DEACTIVATED	activate()	-	✗
ACTIVE	terminate()	TERMINATING	✓
DEACTIVATED	terminate()	TERMINATING	✓
TERMINATED	any	-	✗
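
A transition table like this maps naturally onto a dictionary lookup. The sketch below encodes the valid rows above plus the terminate() edges from the state diagram. KernelState, TRANSITIONS, ToyStateError, and next_state are hypothetical names for illustration and are not the library's internals; ToyStateError only imitates the current_state and expected_states attributes shown for KernelStateError.

```python
from enum import Enum, auto


class KernelState(Enum):
    CREATED = auto()
    LAUNCHED = auto()
    ACTIVE = auto()
    DEACTIVATED = auto()
    TERMINATING = auto()
    TERMINATED = auto()


# (current state, action) -> next state; anything absent is invalid.
TRANSITIONS = {
    (KernelState.CREATED, "launch"): KernelState.LAUNCHED,
    (KernelState.LAUNCHED, "activate"): KernelState.ACTIVE,
    (KernelState.ACTIVE, "deactivate"): KernelState.DEACTIVATED,
    (KernelState.DEACTIVATED, "reactivate"): KernelState.ACTIVE,
    (KernelState.ACTIVE, "terminate"): KernelState.TERMINATING,
    (KernelState.DEACTIVATED, "terminate"): KernelState.TERMINATING,
}


class ToyStateError(Exception):
    """Hypothetical analogue of KernelStateError."""

    def __init__(self, current_state, expected_states):
        names = [s.name for s in expected_states]
        super().__init__(f"in {current_state.name}, need one of {names}")
        self.current_state = current_state
        self.expected_states = expected_states


def next_state(current, action):
    """Return the state reached by `action`, or raise ToyStateError."""
    try:
        return TRANSITIONS[(current, action)]
    except KeyError:
        # States from which this action would have been valid.
        expected = [s for (s, a) in TRANSITIONS if a == action]
        raise ToyStateError(current, expected) from None
```

Calling next_state(KernelState.CREATED, "activate") raises ToyStateError with expected_states set to the single state LAUNCHED, mirroring the "activate before launch" mistake shown earlier.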

Monitoring State¶

# Check current state
state = runtime.get_state("worker")
print(f"State: {state.name}")

# List all kernels
for kernel_id in runtime.kernel_ids:
    state = runtime.get_state(kernel_id)
    print(f"{kernel_id}: {state.name}")

# List only active
for kernel_id in runtime.active_kernels:
    print(f"Active: {kernel_id}")
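
In tests and orchestration code it is often handy to wait until a kernel reaches a given state. The helper below is a minimal sketch built only on runtime.get_state() and the .name attribute shown above. The names wait_for_state, StubState, and StubRuntime are hypothetical, and polling is used here simply because this guide does not describe any notification mechanism.

```python
import asyncio
import time
from enum import Enum


async def wait_for_state(runtime, kernel_id, target_name, timeout=1.0, poll=0.01):
    """Poll runtime.get_state() until its .name matches or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if runtime.get_state(kernel_id).name == target_name:
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(poll)


class StubState(Enum):
    LAUNCHED = 1
    ACTIVE = 2


class StubRuntime:
    """Hypothetical stand-in whose kernel turns ACTIVE after a few polls."""

    def __init__(self, polls_until_active):
        self._remaining = polls_until_active

    def get_state(self, kernel_id):
        self._remaining -= 1
        return StubState.ACTIVE if self._remaining < 0 else StubState.LAUNCHED
```

With a real runtime the call would look like await wait_for_state(runtime, "worker", "ACTIVE"). The helper returns False instead of raising when the deadline passes, so the caller decides how to handle a kernel that never reaches the target state.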

Best Practices¶

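Always Use the Context Manager: Guarantees that every kernel is terminated when the block exits
Use Two-Phase Launch: Launch everything first so failures surface before any kernel starts processing
Implement Graceful Shutdown: Check ctx.should_terminate on every loop iteration
Handle Deactivation: Check ctx.is_active and await ctx.wait_active() to support pause/resume
Set Appropriate Timeouts: Keep receive() timeouts short so shutdown stays responsive
Log State Changes: Record each transition to make lifecycle bugs easier to trace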

Next Steps¶

Building Actors Guide: Best practices
Testing Guide: Testing lifecycle