
Mar 9, 2024

Paper page — AtP*: An efficient and scalable method for localizing LLM behaviour to components

Posted in category: computing

Google presents AtP*

An efficient and scalable method for localizing LLM behaviour to components.

Activation Patching is a method of directly computing causal attributions of behavior to model components.
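To make the idea concrete, here is a minimal sketch of activation patching on a toy network. Everything in it is an assumption made for illustration and is not taken from the paper: the weights `W1` and `W2`, the choice of hidden units as the "components", and the helper names `hidden`, `output`, and `activation_patching`. The model is run on a clean input, one hidden activation at a time is swapped for its value from a corrupted input, and the change in the output is recorded as that component's effect.

```python
import math

# Hypothetical toy model (not from the paper):
# 2 inputs -> 3 tanh hidden units -> 1 scalar output.
W1 = [[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]]  # one row of input weights per hidden unit
W2 = [1.0, -2.0, 0.5]                       # output weight per hidden unit


def hidden(x):
    """Compute the hidden activations for input vector x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]


def output(h):
    """Compute the scalar output from hidden activations h."""
    return sum(w * hi for w, hi in zip(W2, h))


def activation_patching(x_clean, x_corrupt):
    """Return, per hidden unit, the output change caused by patching that
    unit's clean activation with its activation from the corrupted run."""
    h_clean = hidden(x_clean)
    h_corrupt = hidden(x_corrupt)
    baseline = output(h_clean)
    effects = []
    for i in range(len(h_clean)):
        patched = list(h_clean)      # start from the clean run
        patched[i] = h_corrupt[i]    # intervene on a single component
        effects.append(output(patched) - baseline)
    return effects


effects = activation_patching([1.0, 0.0], [0.0, 1.0])
for i, e in enumerate(effects):
    print(f"hidden unit {i}: effect on output = {e:+.4f}")
```

Note that the loop needs one extra forward pass per component. In this toy setting that is trivial, but the same sweep over every component of a large model is the cost that grows with model size.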


