Jan 29, 2025
I read that they trained another model without Step 2B SFT, just more Step 2C RL, and it was almost but not quite as good. So maybe SFT isn't that important.
Nice to have a place where my writing can be ignored by millions