Jan 27, 2024. Paper page: FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design. Microsoft presents FP6-LLM.