Written By: Qingyang Xu (et AI)
Last Modified: April 16, 2026
A useful definition is:
Pretraining learns a general next-token model over a huge corpus.
Post-training is everything after that, where we reshape the pretrained model into a system that is useful for a target domain, task family, or interaction style.
In modern practice, post-training typically includes some subset of: supervised fine-tuning (SFT), preference optimization (e.g., RLHF or DPO), and reinforcement learning against task-specific rewards.
The most important conceptual point is that post-training is not one algorithm. It is a stack.
Let $x$ be the prompt/context and $y=(y_1,\dots,y_T)$ the response. A causal LM defines
$$
\pi_\theta(y\mid x)=\prod_{t=1}^T \pi_\theta(y_t\mid x,y_{<t}), \qquad \log \pi_\theta(y\mid x)=\sum_{t=1}^T \log \pi_\theta(y_t\mid x,y_{<t}).
$$
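The factorization above can be sketched numerically. Below is a minimal, hypothetical toy "causal LM" (random logits over a tiny vocabulary, with next-token probabilities conditioned only on the previous token, not a real trained model) showing that $\log \pi_\theta(y\mid x)$ is just the sum of per-token conditional log-probs:

```python
import numpy as np

# Hypothetical toy model: next-token logits depend only on the previous token.
# This is NOT a real LM; it only illustrates the factorization
#   log pi(y | x) = sum_t log pi(y_t | x, y_{<t}).
rng = np.random.default_rng(0)
VOCAB = 5
W = rng.normal(size=(VOCAB, VOCAB))  # W[prev_token] -> next-token logits

def next_token_logprobs(prev_token: int) -> np.ndarray:
    """Log-softmax over the vocabulary, conditioned on the previous token."""
    logits = W[prev_token]
    return logits - np.log(np.sum(np.exp(logits)))

def sequence_logprob(prompt_last_token: int, response: list[int]) -> float:
    """Sum of per-token conditional log-probs, i.e. log pi(y | x)."""
    total, prev = 0.0, prompt_last_token
    for tok in response:
        total += next_token_logprobs(prev)[tok]
        prev = tok
    return total

lp = sequence_logprob(prompt_last_token=0, response=[1, 3, 2])
print(lp)  # exp(lp) equals the product of the per-token probabilities
```

Exponentiating the summed log-probs recovers the product form of $\pi_\theta(y\mid x)$, which is why implementations always work in log space for numerical stability.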
Nearly all post-training methods change $\theta$ by pushing probability mass toward “good” outputs and away from “bad” outputs, but they differ in what counts as good and what supervision signal is available.
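This "push mass toward good outputs" dynamic can be seen in the simplest possible case. The sketch below is a hypothetical single-token example (one softmax over four tokens, not any particular post-training algorithm): a single gradient step on the negative log-likelihood of a "good" token raises its probability, and, because the softmax normalizes, necessarily lowers everything else:

```python
import numpy as np

# Hypothetical minimal example: theta is a vector of logits over 4 "tokens".
# One SGD step on the loss -log softmax(theta)[good] increases the probability
# of the good token; normalization pushes mass away from the rest.
def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

theta = np.zeros(4)          # uniform initial distribution over 4 tokens
good = 2                     # index of the preferred output
p_before = softmax(theta)[good]

# Gradient of -log softmax(theta)[good] w.r.t. theta is softmax(theta) - onehot(good).
grad = softmax(theta).copy()
grad[good] -= 1.0
theta -= 0.5 * grad          # one SGD step with learning rate 0.5

p_after = softmax(theta)[good]
print(p_before, p_after)     # the good token's probability goes up
```

Different post-training methods vary in where the "good" label comes from (human demonstrations, preference comparisons, or a reward signal), but the underlying mechanics resemble this step.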