- 🌱 I’m a PhD candidate at Nanyang Technological University and the Institute for Infocomm Research (I2R), A*STAR.
- 🔭 I’m currently working on machine reading comprehension and reasoning.
sparkjiao / dpo-trajectory-reasoning
Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".