元语音 | Posted 2 days ago
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
#Remax #LLM