[ExecuTorch][Llama] Decouple input sequence length from kv cache context length #7927
base: gh/kimishpatel/152/base
…xt length Decouple max sequence length, for shape dynamism in torch.export, from sequence length used for kv cache sizing. Differential Revision: [D68448334](https://our.internmc.facebook.com/intern/diff/D68448334/) [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7927
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 1070600 with merge base bdd3d9c. This comment was automatically generated by Dr. CI and updates every 15 minutes.
…xt length Decouple max sequence length, for shape dynamism in torch.export, from sequence length used for kv cache sizing. Differential Revision: [D68448334](https://our.internmc.facebook.com/intern/diff/D68448334/) ghstack-source-id: 262854267 Pull Request resolved: #7927
This pull request was exported from Phabricator. Differential Revision: D68448334
Stack from ghstack (oldest at bottom):
Decouple max sequence length, for shape dynamism in torch.export, from sequence
length used for kv cache sizing.
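
To illustrate the distinction, here is a minimal sketch in plain Python rather than the actual ExecuTorch code. It assumes two separate settings: `max_seq_len`, the upper bound on how many tokens a single forward call may take (the bound a dynamic shape would use at export time), and `max_context_len`, the number of positions the kv cache is sized for. The names `LengthConfig`, `kv_cache_shape`, and `plan_prefill`, the cache layout, and the rule that `max_context_len` must be at least `max_seq_len` are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LengthConfig:
    """Hypothetical container for the two decoupled lengths."""

    max_seq_len: int      # max tokens per forward call (dynamic-shape bound)
    max_context_len: int  # positions the kv cache can hold in total

    def __post_init__(self) -> None:
        if self.max_seq_len < 1:
            raise ValueError("max_seq_len must be positive")
        # Assumed constraint: one call must always fit in the cache.
        if self.max_context_len < self.max_seq_len:
            raise ValueError("max_context_len must be >= max_seq_len")


def kv_cache_shape(
    cfg: LengthConfig, batch: int, n_kv_heads: int, head_dim: int
) -> Tuple[int, int, int, int]:
    """Cache is sized by context length, not by per-call input length.

    The (batch, positions, heads, head_dim) layout is illustrative only.
    """
    return (batch, cfg.max_context_len, n_kv_heads, head_dim)


def plan_prefill(cfg: LengthConfig, n_prompt_tokens: int) -> List[Tuple[int, int]]:
    """Split a prompt into (start_pos, length) chunks of at most max_seq_len."""
    if n_prompt_tokens > cfg.max_context_len:
        raise ValueError("prompt does not fit in the kv cache")
    chunks = []
    start = 0
    while start < n_prompt_tokens:
        length = min(cfg.max_seq_len, n_prompt_tokens - start)
        chunks.append((start, length))
        start += length
    return chunks


cfg = LengthConfig(max_seq_len=128, max_context_len=2048)
print(kv_cache_shape(cfg, batch=1, n_kv_heads=8, head_dim=64))
print(plan_prefill(cfg, 300))
```

With the two values separated, a small per-call input bound no longer forces a small cache: in this sketch a 300-token prompt is fed in chunks of at most 128 tokens while the cache still holds 2048 positions.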
Differential Revision: D68448334
cc @mergennachin @cccclai @helunwencser @dvorjackz