
Jul 22, 2024

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi (Apple & Meta, 2024)

