[ExecuTorch][Llama] Decouple input sequence length from kv cache context length #30534
Triggered via pull request on January 29, 2025 16:56
Status: Success
Total duration: 1h 11m 41s
Artifacts: –
pull.yml (on: pull_request)
Matrix: test-llama-runner-linux
gather-models (5s)
unittest / ... / linux-job (20m 1s)
unittest / ... / macos-job (15m 10s)
unittest-arm / linux-job (19m 4s)
test-binary-size-linux / job
test-binary-size-linux-gcc / job
test-custom-ops-linux / job
test-eval_llama-mmlu-linux / job
test-eval_llama-wikitext-linux / job
test-llama-runner-linux-android / job
test-llama_runner_eager-linux / job
test-llava-runner-linux / job
test-mediatek-models-linux / job
test-phi-3-mini-runner-linux / job
test-pybind-build-linux / job
test-qnn-models-linux / job
test-quantized-aot-lib-linux / job
test-selective-build-linux / job
test-setup-linux-gcc / job
Matrix: test-llama-runner-qnn-linux
Matrix: test-models-linux
android / run-emulator (2m 37s)
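For context on the change being exercised by these jobs, the PR title refers to letting the number of tokens fed per forward pass differ from the total number of positions the KV cache can hold. The sketch below is only an illustration of that general idea, not the ExecuTorch implementation; the class name `SimpleKVCache` and the parameter `max_context_len` are hypothetical.

```python
# Illustrative sketch, assuming a cache sized by context length that accepts
# updates of any input sequence length. Not the actual ExecuTorch code.
import torch


class SimpleKVCache:
    def __init__(self, n_heads: int, head_dim: int, max_context_len: int):
        # The cache is allocated from the context length alone; it does not
        # depend on how many tokens arrive in a single update call.
        self.k = torch.zeros(1, n_heads, max_context_len, head_dim)
        self.v = torch.zeros(1, n_heads, max_context_len, head_dim)
        self.max_context_len = max_context_len

    def update(self, start_pos: int, k_new: torch.Tensor, v_new: torch.Tensor):
        # k_new / v_new have shape (1, n_heads, seq_len, head_dim); seq_len may
        # be anything from 1 (decode) up to the remaining cache space.
        seq_len = k_new.shape[2]
        assert start_pos + seq_len <= self.max_context_len
        self.k[:, :, start_pos:start_pos + seq_len] = k_new
        self.v[:, :, start_pos:start_pos + seq_len] = v_new
        # Return the valid prefix so attention can cover all cached positions.
        end = start_pos + seq_len
        return self.k[:, :, :end], self.v[:, :, :end]


# Usage: a 16-token prefill followed by a 1-token decode step, against a cache
# whose context length (128) is independent of either input length.
cache = SimpleKVCache(n_heads=8, head_dim=64, max_context_len=128)
cache.update(0, torch.randn(1, 8, 16, 64), torch.randn(1, 8, 16, 64))
cache.update(16, torch.randn(1, 8, 1, 64), torch.randn(1, 8, 1, 64))
```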