Bases: PretrainedConfig
Source code in vllm/transformers_utils/configs/mlp_speculator.py
  
 __init__(
    vocab_size: int = 32000,
    emb_dim: int = 4096,
    inner_dim: int = 0,
    n_predict: int = 3,
    top_k_tokens_per_head: list[int] | None = None,
    n_candidates: int = 5,
    tie_weights: bool = False,
    scale_input: bool = False,
    **kwargs,
)
Initialize an MLPSpeculatorConfig
Parameters:
| Name | Type | Description | Default | 
|---|---|---|---|
| vocab_size | int | The model vocab size. | 32000 |
| emb_dim | int | The model embedding dimension. | 4096 |
| inner_dim | int | The inner dimension of the model. If 0, will be the emb_dim. | 0 |
| n_predict | int | The number of lookaheads for the speculator. | 3 |
| top_k_tokens_per_head | list[int] \| None | Number of tokens to consider from each head when forming the candidate tree. For each candidate branch in the tree, head n produces topk[n] additional sub-branches. NOTE: This parameter is currently unused. | None |
| n_candidates | int | Number of child candidates to create per sequence. | 5 |
| tie_weights | bool | If True, use a single set of weights for every model head/stage after the first. The initial projection from the base model may have a different size, so that stays separate. | False |
| scale_input | bool | If True, scale the initial hidden states from the base model. | False |
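
As a minimal sketch (not taken from the vLLM documentation), the config might be constructed like this; the import path is assumed from the source location shown above, and the values simply mirror the defaults in the signature:

```python
# Hypothetical usage sketch: build an MLPSpeculatorConfig with explicit values.
# Import path assumed from vllm/transformers_utils/configs/mlp_speculator.py.
from vllm.transformers_utils.configs.mlp_speculator import MLPSpeculatorConfig

config = MLPSpeculatorConfig(
    vocab_size=32000,    # base model vocab size
    emb_dim=4096,        # base model embedding dimension
    inner_dim=0,         # 0 means the inner dimension falls back to emb_dim
    n_predict=3,         # number of lookahead tokens the speculator produces
    n_candidates=5,      # child candidates created per sequence
    tie_weights=False,   # if True, share weights across heads/stages after the first
    scale_input=False,   # if True, scale the initial hidden states from the base model
)
```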