
May 24, 2024

Not All Language Model Features Are Linear

Posted in category: space

From MIT


Recent work has proposed the linear representation hypothesis: that language models perform computation by manipulating one-dimensional representations of concepts (“features”) in activation space.
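To make the idea of a "one-dimensional representation" concrete, here is a toy sketch in plain Python. It is only an illustration under assumptions, not the paper's method: the 4-dimensional activation space, the direction vectors, and the cyclic variable with seven values are all hypothetical. A one-dimensional feature is read off by projecting onto a single direction, while the cyclic variable is laid out on a circle and needs two directions to decode.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return [c * a for a in v]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

# Hypothetical orthonormal directions in a 4-dimensional activation space.
d_linear = [1.0, 0.0, 0.0, 0.0]  # one-dimensional ("linear") feature
d_circ_x = [0.0, 1.0, 0.0, 0.0]  # first axis of a two-dimensional feature
d_circ_y = [0.0, 0.0, 1.0, 0.0]  # second axis of a two-dimensional feature

def encode_linear(strength):
    # A linear feature is a scalar strength times one fixed direction.
    return scale(strength, d_linear)

def encode_circular(k, n=7):
    # A cyclic variable with n values placed evenly on a circle (assumed layout).
    angle = 2 * math.pi * k / n
    return add(scale(math.cos(angle), d_circ_x),
               scale(math.sin(angle), d_circ_y))

def read_linear(activation):
    # One projection recovers the feature strength.
    return dot(activation, d_linear)

def read_circular(activation, n=7):
    # Two projections are needed to recover the position on the circle.
    x = dot(activation, d_circ_x)
    y = dot(activation, d_circ_y)
    angle = math.atan2(y, x) % (2 * math.pi)
    return round(angle * n / (2 * math.pi)) % n

# An activation carrying both features at once.
activation = add(encode_linear(2.5), encode_circular(3))
linear_value = read_linear(activation)
circular_value = read_circular(activation)
```

In this toy setup, projecting the circular feature onto `d_circ_x` alone cannot tell value 1 from value 6, because both have the same cosine. That is the sense in which such a feature is not one-dimensional.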


