Microsoft presents FP6-LLM
Efficiently serving large language models through FP6-centric algorithm-system co-design.
Join the discussion on this paper page.