Jul 22, 2024

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Posted in category: futurism

Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi (Apple & Meta), 2024
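
The idea behind the title, in one sketch: rank the prompt tokens by how much attention the final position pays them, and defer or drop the low-scoring ones from later layers so long-context prefilling does less work. The snippet below is a minimal, hypothetical illustration of that score-and-prune step, not the paper's implementation; the function prune_tokens, its keep_ratio parameter, and the mean-over-heads scoring are all assumptions made for the example.

```python
# Minimal sketch of attention-score-based dynamic token pruning,
# in the spirit of LazyLLM. Illustrative only: prune_tokens and
# keep_ratio are invented names, not the paper's API.
import torch

def prune_tokens(hidden, attn, keep_ratio=0.5):
    """Keep only the prompt tokens most attended to by the last position.

    hidden: (seq_len, d_model) activations for the prompt at some layer.
    attn:   (n_heads, seq_len) attention each head's last query pays to
            every prompt token.
    Returns the pruned activations and the surviving token indices, so
    subsequent layers (and the KV cache) only touch the kept tokens.
    """
    seq_len = hidden.size(0)
    # Token importance = attention from the final position, averaged
    # over heads (one common scoring choice; an assumption here).
    importance = attn.mean(dim=0)                     # (seq_len,)
    k = max(1, int(seq_len * keep_ratio))
    keep = torch.topk(importance, k).indices.sort().values  # keep original order
    return hidden[keep], keep

# Toy usage: 8 prompt tokens, 4 heads, model width 16.
hidden = torch.randn(8, 16)
attn = torch.softmax(torch.randn(4, 8), dim=-1)
pruned, kept_idx = prune_tokens(hidden, attn, keep_ratio=0.5)
print(kept_idx.tolist(), pruned.shape)  # e.g. [0, 2, 5, 7] torch.Size([4, 16])
```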

