
Summary
* We introduce an interpretability-based technique for controlling how fine-tuned LLMs generalize out-of-distribution, without modifying tra…