Bases: AttentionBackend
Source code in vllm/v1/attention/backends/short_conv_attn.py
ShortConvAttentionMetadata dataclass
 Source code in vllm/v1/attention/backends/short_conv_attn.py
token_chunk_offset_ptr class-attribute instance-attribute
 token_chunk_offset_ptr: Tensor | None = None
 
 __init__(
    num_prefills: int,
    num_prefill_tokens: int,
    num_decodes: int,
    num_decode_tokens: int,
    query_start_loc: Tensor,
    state_indices_tensor: Tensor,
    has_initial_states_p: Tensor | None,
    nums_dict: dict | None = None,
    batch_ptr: Tensor | None = None,
    token_chunk_offset_ptr: Tensor | None = None,
) -> None
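
A minimal construction sketch. The values below are placeholders invented for illustration (two single-token decode requests listed before one four-token prefill; that ordering, the dtypes, and the tensor contents are assumptions of this sketch, not taken from this module). In practice this metadata is produced by the builder documented below rather than assembled by hand.

```python
import torch

from vllm.v1.attention.backends.short_conv_attn import ShortConvAttentionMetadata

# Illustrative batch: 2 decode requests (1 token each) and 1 prefill
# request (4 tokens), 6 scheduled tokens in total. All values are
# placeholders for the example, not from a real run.
metadata = ShortConvAttentionMetadata(
    num_prefills=1,
    num_prefill_tokens=4,
    num_decodes=2,
    num_decode_tokens=2,
    query_start_loc=torch.tensor([0, 1, 2, 6], dtype=torch.int32),
    state_indices_tensor=torch.tensor([5, 9, 2], dtype=torch.int32),
    has_initial_states_p=torch.tensor([False]),
    # nums_dict, batch_ptr and token_chunk_offset_ptr keep their None
    # defaults in this sketch.
)
```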
 
ShortConvAttentionMetadataBuilder

 Bases: BaseMambaAttentionMetadataBuilder[ShortConvAttentionMetadata]
Source code in vllm/v1/attention/backends/short_conv_attn.py
  
 build(
    common_prefix_len: int,
    common_attn_metadata: CommonAttentionMetadata,
    fast_build: bool = False,
) -> ShortConvAttentionMetadata
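
A short usage sketch for build(). It assumes an already-constructed instance of the builder above and that CommonAttentionMetadata is importable from vllm.v1.attention.backends.utils; the helper name and that import path are assumptions made for illustration, not part of this module.

```python
from vllm.v1.attention.backends.short_conv_attn import ShortConvAttentionMetadata
from vllm.v1.attention.backends.utils import CommonAttentionMetadata


def build_step_metadata(
    builder,  # an already-constructed metadata builder for this backend
    common_attn_metadata: CommonAttentionMetadata,
) -> ShortConvAttentionMetadata:
    """Hypothetical helper: build the per-step short-conv attention metadata."""
    # common_prefix_len=0 assumes no shared prefix for this step;
    # fast_build keeps its default behaviour.
    return builder.build(
        common_prefix_len=0,
        common_attn_metadata=common_attn_metadata,
        fast_build=False,
    )
```

In normal operation the engine's model runner drives this call itself once per scheduling step; the helper only illustrates the signature.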