[LG] TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
[Tsinghua University & California Institute of Technology]
https://arxiv.org/abs/2506.11902