Reinforcement Learning for Reasoning in LLMs with One Training Example

		Reinforcement Learning for Reasoning in LLMs with One Training Example (arxiv.org)
		2 points by chrsw 10 months ago