Microsoft presents Rho-1: Not All Tokens Are What You Need
https://huggingface.co/papers/2404.
Previous language model pre-training methods have uniformly applied a next-token prediction loss to all training tokens.
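To make the contrast concrete, here is a minimal PyTorch sketch of that uniform objective, alongside a hypothetical selective variant where only masked-in tokens contribute to the loss. The `token_mask` and how it would be produced are assumptions for illustration, not the paper's actual selection procedure.

```python
import torch
import torch.nn.functional as F

def uniform_next_token_loss(logits, input_ids):
    # Standard pre-training objective: every position in the sequence
    # contributes equally to the cross-entropy loss.
    # logits: (batch, seq_len, vocab); input_ids: (batch, seq_len)
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..T-2
    shift_labels = input_ids[:, 1:]    # targets are the next tokens
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )

def selective_next_token_loss(logits, input_ids, token_mask):
    # Hypothetical selective variant: only tokens flagged by `token_mask`
    # (assumed to come from some external scoring signal) are averaged
    # into the loss; all other tokens are ignored.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:]
    shift_mask = token_mask[:, 1:].float()   # align mask with the targets
    per_token = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
    ).view(shift_labels.shape)
    return (per_token * shift_mask).sum() / shift_mask.sum().clamp(min=1.0)
```

The only structural difference between the two objectives is the per-token weighting: the uniform loss averages over every position, while the selective loss averages over the chosen subset.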