Meta recently announced Language Self-Play (LSP), a new reinforcement learning method for large model training. This method aims to address the ...