TinyZero is a reproduction of DeepSeek R1 Zero on countdown and multiplication tasks. We built upon veRL.

conda create -n zero python=3.9
# install torch [or you can skip this step and let vllm ...