RING_KERNEL_PTX_TEMPLATE

Constant RING_KERNEL_PTX_TEMPLATE 

pub const RING_KERNEL_PTX_TEMPLATE: &str = r#"
.version 8.0
.target sm_75
.address_size 64

.visible .entry ring_kernel_main(
    .param .u64 control_block_ptr,
    .param .u64 input_queue_ptr,
    .param .u64 output_queue_ptr,
    .param .u64 shared_state_ptr
) {
    .reg .u64 %cb_ptr;
    .reg .u32 %one;

    // Load control block pointer
    ld.param.u64 %cb_ptr, [control_block_ptr];

    // Mark as terminated immediately (offset 8)
    mov.u32 %one, 1;
    st.global.u32 [%cb_ptr + 8], %one;

    ret;
}
"#;

PTX source template for the persistent ring kernel.

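This is a minimal kernel that immediately marks itself as terminated: it loads the control block pointer, stores the 32-bit value 1 at byte offset 8 of the control block, and returns. The input queue, output queue, and shared state pointers are declared as parameters but not used. The template declares PTX ISA 8.0 and targets sm_75, chosen as the lowest common denominator for cooperative groups support. PTX is forward-compatible: the driver JIT-compiles it for the installed GPU, so sm_75 PTX also runs on sm_89, sm_90, sm_100, and newer GPUs.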
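The description above can be checked on the host without a GPU. The following is a minimal sketch, not part of the crate: `PTX_HEADER` is a copy of the template's first three directives, and `directive`, `emulate_kernel_store`, and `has_terminated` are hypothetical helper names introduced only for illustration. It assumes the control block is at least 12 bytes long and that the terminated flag is a little-endian `u32` at byte offset 8, as the `st.global.u32` instruction in the template implies.

```rust
/// Header directives copied from the template above (illustrative copy,
/// not the crate's constant).
const PTX_HEADER: &str = "\n.version 8.0\n.target sm_75\n.address_size 64\n";

/// Byte offset written by `st.global.u32 [%cb_ptr + 8], %one`.
const TERMINATED_OFFSET: usize = 8;

/// Returns the first argument of the first line whose first word is `name`
/// (for example ".version"), or `None` if no such line exists.
fn directive<'a>(ptx: &'a str, name: &str) -> Option<&'a str> {
    ptx.lines().find_map(|line| {
        let mut words = line.split_whitespace();
        if words.next() == Some(name) {
            words.next()
        } else {
            None
        }
    })
}

/// Host-side stand-in for the kernel body: stores the 32-bit value 1
/// at byte offset 8 of the control block. Panics if the slice is too short.
fn emulate_kernel_store(control_block: &mut [u8]) {
    let field = &mut control_block[TERMINATED_OFFSET..TERMINATED_OFFSET + 4];
    field.copy_from_slice(&1u32.to_le_bytes());
}

/// Reads the flag back the way a host poller might after copying the
/// control block from device memory.
fn has_terminated(control_block: &[u8]) -> bool {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&control_block[TERMINATED_OFFSET..TERMINATED_OFFSET + 4]);
    u32::from_le_bytes(raw) != 0
}
```

Calling `directive(PTX_HEADER, ".target")` returns the target string declared in the header, and running `emulate_kernel_store` on a zeroed 16-byte buffer makes `has_terminated` report `true`. The real control block layout is defined elsewhere in the crate; only the offset-8 store is taken from the template.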