Enhancing reinforcement learning with dense rewards from language model critic

Jan 1, 2024 · Meng Cao, Lei Shu, Lei Yu, Yun Zhu, Nevan Wichers, Yinxiao Liu, Lei Meng
Publication: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing