Apr 26, 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

Meta presents LayerSkip.

Enabling early exit inference and self-speculative decoding.

We present LayerSkip, an end-to-end solution to speed up inference of large language models (LLMs).
