DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper • 2408.08152
Note: similar to https://huggingface.co/papers/2402.18668
Note: somewhat similar to https://arxiv.org/pdf/2402.02750.pdf
Note: QMoE, https://arxiv.org/pdf/2310.16795.pdf