[Excutorch][Llama] Decouple input sequence length from kv cache context length #30534

Job Run time
5s
19m 4s
7m 56s
8m 14s
8m 32s
9m 8s
8m 41s
10m 13s
10m 14s
4m 43s
4m 56s
5m 56s
10m 15s
6m 33s
8m 34s
1h 8m 30s
7m 40s
7m 25s
50m 3s
5m 47s
6m 10s
6m 12s
27m 52s
4m 52s
26m 52s
4m 30s
20m 1s
15m 10s
6m 14s
7m 11s
7m 9s
6m 40s
7m 58s
8m 15s
31m 2s
2m 37s
7h 31m 14s