
Fine-tuning is usually a blunt instrument. You adjust thousands or millions of parameters, cross your fingers that the model improves on your target task, and hope you haven't quietly broken something else. Goodfire just demonstrated a sharper approach: they removed a 67M-parameter language model's ability to speak German by tuning exactly one scalar value , and the rest of the model barely noticed.
The idea behind parameter decomposition
To understand what's happening here, you need to know about parameter decomposition , Goodfire's method for breaking a model's weight matrices into interpretable, rank-1 subcomponents (think of them as atomic building blocks of the model's learned behavior). The method, adVersarial Parameter Decomposition (VPD), optimizes for decompositions of neural network parameters into simple subcomponents that preserve the network's input-output behavior even when many subcomponents are ablated, including under ablations that are adversarially selected to destroy behavior.
Each subcomponent is a rank-1 matrix , the simplest possible matrix , with one "read" direction (what it looks for in the input) and one "write" direction (what it adds to the output). This encourages learning subcomponents that provide short, mechanistically faithful descriptions of the network's behavior that should aggregate appropriately into more global descriptions of the network's learned algorithm. Crucially, each subcomponent gets an auto-generated label describing what it activates on , e.g., "fires on German text and names."
Tuning one number to erase a language
The experiment was done as a one-day hackathon using Goodfire's Silico platform. The goal: destroy the model's ability to predict German text while keeping English intact. German was chosen because it was the model's strongest non-English language.
Don't miss what's next in AI
Join 300,000+ engineers and researchers who get the signal, not the noise.
- Full access to in-depth AI research breakdowns
- Be the first to know what's trending before it hits mainstream
- Daily curated papers, repos, and industry moves

