[Executorch][Llama] Decouple input sequence length from kv cache context length #7556

Job Run time
6s
20m 40s
26m 43s
51m 8s
10m 5s
7m 3s
15m 23s
6m 19s
10m 17s
48m 20s
13m 46s
11m 57s
12m 56s
13m 3s
12m 3s
12m 36s
12m 54s
47m 20s
32m 17s
9m 23s
9m 57s
9m 39s
9m 33s
8m 40s
8m 28s
9m 13s
13m 10s
10m 31s
11m 14s
11m 3s
7h 55m 47s