LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Qichen Fu, Minsik Cho, Thomas Merth, Sachin Mehta, Mohammad Rastegari, Mahyar Najibi (Apple & Meta, 2024)