module-attribute
 FUSED_OPS: dict[FusedRMSQuantKey, OpOverload] = {
    FusedRMSQuantKey(kFp8StaticTensorSym, False): default,
    FusedRMSQuantKey(kFp8StaticTensorSym, True): default,
    FusedRMSQuantKey(kFp8DynamicTokenSym, False): default,
    FusedRMSQuantKey(kFp8DynamicTokenSym, True): default,
}
 module-attribute
 QUANT_OPS: dict[QuantKey, OpOverload] = {
    kFp8StaticTensorSym: default,
    kFp8DynamicTensorSym: default,
    kFp8DynamicTokenSym: default,
}
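The three keys above differ in how the quantization scale is obtained. A minimal sketch of that difference follows, assuming the usual meaning of the key names: a static scale is supplied ahead of time, a dynamic per-tensor scale is derived from the whole input, and a dynamic per-token scale is derived per row. The function names are hypothetical and are not part of vLLM; real kernels also guard against a zero scale, which this sketch does not.

```python
# Largest finite magnitude representable in float8 e4m3fn.
FP8_E4M3_MAX = 448.0


def per_tensor_scale(rows: list[list[float]]) -> float:
    """Dynamic per-tensor: one scale from the max magnitude of the whole input."""
    amax = max(abs(v) for row in rows for v in row)
    return amax / FP8_E4M3_MAX


def per_token_scales(rows: list[list[float]]) -> list[float]:
    """Dynamic per-token: one scale per row (token)."""
    return [max(abs(v) for v in row) / FP8_E4M3_MAX for row in rows]


# Two tokens with two features each (illustrative values only).
activations = [[1.0, -2.0], [4.0, 0.5]]
tensor_scale = per_tensor_scale(activations)
token_scales = per_token_scales(activations)
```

With a static scheme neither function runs at inference time; the scale is a precomputed input to the quant op.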
 
  Bases: RMSNormQuantPattern
Source code in vllm/compilation/fusion.py
  
 __init__(
    epsilon: float,
    quant_dtype: dtype,
    group_shape: GroupShape = PER_TOKEN,
    symmetric=True,
)
Source code in vllm/compilation/fusion.py
  
  
  Bases: RMSNormQuantPattern
Source code in vllm/compilation/fusion.py
  
  
  Bases: NamedTuple
Named tuple for identifying the type of RMSNorm + quant fusion.
quant: type of quantization
fused_add: does the op also perform the residual add
Source code in vllm/compilation/fusion.py
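Because a named tuple is hashable and compares by value, it can serve directly as a dictionary key, which is how a registry such as FUSED_OPS can be indexed. The sketch below illustrates the idea with hypothetical stand-ins: `QuantKeySketch`, `FusedKeySketch`, and the string values are assumptions for illustration, not the vLLM types or the actual OpOverload entries.

```python
from typing import NamedTuple


class QuantKeySketch(NamedTuple):
    """Hypothetical stand-in for a quantization descriptor."""
    dtype: str
    static: bool
    per_token: bool


class FusedKeySketch(NamedTuple):
    """Mirrors the two documented fields: quant and fused_add."""
    quant: QuantKeySketch
    fused_add: bool


FP8_STATIC_TENSOR = QuantKeySketch("fp8", static=True, per_token=False)
FP8_DYNAMIC_TOKEN = QuantKeySketch("fp8", static=False, per_token=True)

# Placeholder strings stand in for the fused op overloads.
FUSED_OPS_SKETCH: dict[FusedKeySketch, str] = {
    FusedKeySketch(FP8_STATIC_TENSOR, False): "rms_norm + static quant",
    FusedKeySketch(FP8_STATIC_TENSOR, True): "add + rms_norm + static quant",
    FusedKeySketch(FP8_DYNAMIC_TOKEN, False): "rms_norm + per-token quant",
    FusedKeySketch(FP8_DYNAMIC_TOKEN, True): "add + rms_norm + per-token quant",
}


def lookup(quant: QuantKeySketch, fused_add: bool) -> str:
    """Build a fresh key and look it up; equality is by field values."""
    return FUSED_OPS_SKETCH[FusedKeySketch(quant, fused_add)]
```

A key built independently from equal field values finds the same entry, so callers do not need to hold a reference to the original key object.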
  
  Bases: RMSNormQuantPattern
Source code in vllm/compilation/fusion.py
  
 __init__(
    epsilon: float,
    quant_dtype: dtype,
    group_shape: GroupShape = PER_TOKEN,
    symmetric=True,
)
Source code in vllm/compilation/fusion.py
  
  
  Bases: VllmPatternMatcherPass
This pass fuses rms_norm & quant custom ops into a fused rms_norm_quant op. It also supports fused_add_rms_norm.
Source code in vllm/compilation/fusion.py
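To make the effect of the fusion concrete, here is a pure-Python reference sketch of what the unfused and fused computations produce for a single token with a static scale. It is an illustration under stated assumptions, not the vLLM kernel: the function names are hypothetical, rounding to the FP8 grid is not modelled (only scaling and clamping), and the fused-add variant is assumed to add the residual before normalizing and to return the updated residual.

```python
import math

# Largest finite magnitude representable in float8 e4m3fn.
FP8_E4M3_MAX = 448.0


def clamp_fp8(v: float) -> float:
    return max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v))


def rms_norm(x, weight, eps):
    """y_i = x_i / sqrt(mean(x^2) + eps) * w_i"""
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def static_quant(y, scale):
    """Divide by a precomputed scale and clamp to the FP8 range."""
    return [clamp_fp8(v / scale) for v in y]


def fused_rms_norm_quant(x, weight, eps, scale):
    """Same result as static_quant(rms_norm(...)) in a single pass."""
    inv_rms = 1.0 / math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [clamp_fp8(v * inv_rms * w / scale) for v, w in zip(x, weight)]


def fused_add_rms_norm_quant(x, residual, weight, eps, scale):
    """Residual add first, then normalize and quantize the sum."""
    summed = [a + b for a, b in zip(x, residual)]
    return fused_rms_norm_quant(summed, weight, eps, scale), summed


x = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0, 1.0]
unfused = static_quant(rms_norm(x, w, 1e-6), 0.5)
fused = fused_rms_norm_quant(x, w, 1e-6, 0.5)
```

The benefit of the fused form is that the normalized intermediate never has to be written out and read back between two separate ops.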
  instance-attribute
   
 __init__(config: VllmConfig)
Source code in vllm/compilation/fusion.py
  
  instance-attribute
 rmsnorm_matcher = (
    MatcherRMSNorm(epsilon)
    if not fused_add
    else MatcherFusedAddRMSNorm(epsilon)
)
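The conditional expression above picks one of two matchers depending on whether the pattern includes the residual add. A minimal sketch of the same selection logic follows, using hypothetical stub classes in place of the real matchers (the stubs only record epsilon and are not the vLLM implementations).

```python
from dataclasses import dataclass


@dataclass
class MatcherRMSNormStub:
    """Hypothetical stand-in for the plain RMSNorm matcher."""
    epsilon: float


@dataclass
class MatcherFusedAddRMSNormStub:
    """Hypothetical stand-in for the residual-add RMSNorm matcher."""
    epsilon: float


def select_matcher(epsilon: float, fused_add: bool):
    """Equivalent to: A(eps) if not fused_add else B(eps)."""
    if fused_add:
        return MatcherFusedAddRMSNormStub(epsilon)
    return MatcherRMSNormStub(epsilon)


plain = select_matcher(1e-6, fused_add=False)
with_add = select_matcher(1e-6, fused_add=True)
```

Both branches receive the same epsilon, so the only thing the flag changes is which graph pattern the matcher is meant to recognize.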
 
 __init__(epsilon: float, key: FusedRMSQuantKey)
Source code in vllm/compilation/fusion.py
  
  Bases: RMSNormQuantPattern