Commit 2c5f154

[Llama3.2-11b-vision] Add max_cross_attn_tokens property to vLLM generator class (tenstorrent#17401)

skhorasganiTT authored and nikileshx committed Feb 3, 2025
1 parent 7750a3b commit 2c5f154
Showing 1 changed file with 4 additions and 0 deletions.
models/demos/llama3/tt/generator_vllm.py (4 additions, 0 deletions)

@@ -130,6 +130,10 @@ def initialize_vllm_model(cls, hf_config, mesh_device, max_batch_size):
     def cache_path(self):
         return self.model_args.model_cache_path
 
+    @property
+    def max_cross_attn_tokens(self):
+        return self.model_args.vision_max_num_chunks * nearest_32(self.model_args.vision_chunk_ntok)
+
     def prefill_forward(
         self,
         tokens: torch.Tensor,
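
For context, the new property caps the cross-attention sequence length at the maximum number of vision chunks times the per-chunk vision token count, rounded up to a multiple of 32 (the tile width on Tenstorrent hardware). Below is a minimal sketch of the same computation; the nearest_32 helper and the example values for vision_max_num_chunks and vision_chunk_ntok are illustrative assumptions, not values taken from this diff.

    def nearest_32(x: int) -> int:
        # Round x up to the nearest multiple of 32 (Tenstorrent tile width).
        return ((x + 31) // 32) * 32

    # Assumed example values for Llama 3.2 Vision, not read from the diff:
    vision_max_num_chunks = 4   # max image tiles per request
    vision_chunk_ntok = 1601    # vision tokens per tile (e.g. 1600 patches + 1 class token)

    max_cross_attn_tokens = vision_max_num_chunks * nearest_32(vision_chunk_ntok)
    print(max_cross_attn_tokens)  # 4 * nearest_32(1601) = 4 * 1632 = 6528

Rounding each chunk up to a multiple of 32 presumably aligns the bound with the tile-padded layout used on device, rather than the raw token count.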
