[LG] Less is More: Undertraining Experts Improves Model Upcycling
[Université de Montréal & Concordia University]
https://arxiv.org/abs/2506.14126