Functional faithfulness in the wild: Circuit discovery with differentiable computation graph pruning
Jan 1, 2024 · Lei Yu, Jingcheng Niu, Zining Zhu, Gerald Penn
Type: Journal article
Publication: arXiv preprint arXiv:2407.03779
Last updated on Jan 1, 2024