New top story on Hacker News: Steering interpretable language models with concept algebra