Feature
SimLayerKV dynamically identifies lazy layers in LLMs, meaning layers that attend primarily to the initial and most recent tokens, so that the KV cache can be trimmed selectively for those layers. This reduces memory usage during inference without requiring additional training, making it more efficient and adaptive than static methods.
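To make the idea concrete, here is a rough sketch of how a lazy-layer check and the corresponding cache trim could look. It is only an illustration under assumptions, not the paper's exact criterion or any library's API: the function names `is_lazy_layer` and `trim_kv`, the window sizes `n_initial` and `n_recent`, and the `threshold` value are all hypothetical.

```python
from typing import List, Sequence, Tuple


def is_lazy_layer(
    attn_rows: Sequence[Sequence[float]],
    n_initial: int = 4,      # assumed number of initial tokens to keep
    n_recent: int = 8,       # assumed number of recent tokens to keep
    threshold: float = 0.9,  # assumed cutoff, not taken from the paper
) -> bool:
    """Return True if the given attention rows put most of their mass on
    the initial and recent key positions.

    attn_rows holds attention weights for a few of the latest query tokens
    in one layer; each row covers all key positions and sums to 1.
    """
    seq_len = len(attn_rows[0])
    kept = set(range(min(n_initial, seq_len)))
    kept |= set(range(max(0, seq_len - n_recent), seq_len))
    # Average, over the query rows, of the attention mass on kept positions.
    mass = [sum(row[i] for i in kept) for row in attn_rows]
    return sum(mass) / len(mass) >= threshold


def trim_kv(
    keys: Sequence, values: Sequence, n_initial: int = 4, n_recent: int = 8
) -> Tuple[List, List]:
    """Keep only the initial and recent entries of one layer's KV cache."""
    seq_len = len(keys)
    if seq_len <= n_initial + n_recent:
        return list(keys), list(values)  # nothing to trim
    idx = list(range(n_initial)) + list(range(seq_len - n_recent, seq_len))
    return [keys[i] for i in idx], [values[i] for i in idx]


if __name__ == "__main__":
    # A 20-token row concentrated on the first 4 and last 8 positions.
    peaked = [0.1] * 4 + [0.005] * 8 + [0.07] * 8
    uniform = [1 / 20] * 20
    print(is_lazy_layer([peaked]), is_lazy_layer([uniform]))
    k, v = trim_kv(list(range(20)), list(range(20)))
    print(len(k), k[:5])
```

In a real integration the check would run once per layer on the model's actual attention weights, and only layers flagged as lazy would have their cache trimmed; the remaining layers keep the full cache.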
Paper
SimLayerKV
github
I'm not sure whether this has already been added, but if not, I would like to work on it.