
Apr 26, 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Posted in category: futurism

Meta presents LayerSkip, which enables early exit inference and self-speculative decoding.

We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs).
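To make the two ideas in the title concrete, here is a minimal sketch in Python. It is not the LayerSkip implementation: the "model" is a hypothetical toy stack of integer functions, and every name (`make_layers`, `next_token`, `exit_layer`, `draft_len`) is an assumption made up for illustration. Early exit is modeled as predicting the next token after only the first `exit_layer` layers. Self-speculative decoding is modeled as drafting a few tokens with that early exit, then checking them against the full stack of the same model and correcting the first mismatch. For clarity the check runs token by token; a real system would be expected to verify a draft in one batched pass and reuse the early-layer computation.

```python
from typing import Callable, List, Sequence

VOCAB = 50      # toy vocabulary size (assumption)
MODULUS = 1009  # toy hidden-state range (assumption)


def make_layers(n: int) -> List[Callable[[int], int]]:
    """Build n deterministic toy 'layers' that each transform an integer state."""
    def make(i: int) -> Callable[[int], int]:
        return lambda h: (h * 31 + i * 7 + 3) % MODULUS
    return [make(i) for i in range(n)]


def next_token(layers: Sequence[Callable[[int], int]],
               tokens: Sequence[int], num_layers: int) -> int:
    """Predict the next token using only the first num_layers layers.

    num_layers < len(layers) plays the role of an early exit.
    """
    h = sum((pos + 1) * t for pos, t in enumerate(tokens)) % MODULUS
    for layer in layers[:num_layers]:
        h = layer(h)
    return h % VOCAB


def greedy_decode(layers, prompt, max_new):
    """Reference decoding: every token comes from the full layer stack."""
    tokens = list(prompt)
    for _ in range(max_new):
        tokens.append(next_token(layers, tokens, len(layers)))
    return tokens[len(prompt):]


def self_speculative_decode(layers, prompt, max_new, exit_layer, draft_len):
    """Draft with an early exit, then verify and correct with the full stack."""
    tokens = list(prompt)
    target = len(prompt) + max_new
    while len(tokens) < target:
        # Draft stage: cheap predictions from the first exit_layer layers.
        k = min(draft_len, target - len(tokens))
        draft: List[int] = []
        for _ in range(k):
            draft.append(next_token(layers, tokens + draft, exit_layer))
        # Verify stage: keep the full-model token at each position and
        # stop at the first position where the draft disagreed.
        accepted: List[int] = []
        for d in draft:
            correct = next_token(layers, tokens + accepted, len(layers))
            accepted.append(correct)
            if correct != d:
                break
        tokens.extend(accepted)
    return tokens[len(prompt):]


layers = make_layers(8)
prompt = [1, 2, 3]
reference = greedy_decode(layers, prompt, 12)
speculative = self_speculative_decode(layers, prompt, 12, exit_layer=4, draft_len=3)
print(speculative == reference)
```

Because every accepted token is the one the full stack would have produced, the output matches plain greedy decoding in this sketch; any speedup in practice would come from how often the early-exit drafts are accepted.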
