New paper: Answer Matching Outperforms Multiple Choice for Language Model Evaluations. …