PSNR was the standard for decades and is bad at predicting human perception. SSIM was better. VMAF (Netflix) is now the de facto perceptual quality metric. Understanding what each captures helps you debug 'looks fine but scored low' regressions.
PSNR — pixel-level error
Mean squared error in dB. Doesn't capture spatial structure. A small uniform brightness shift kills PSNR but is imperceptible. Useful only as a sanity check.
SSIM — structural similarity
Compares luminance, contrast, structure in local windows. Better correlated with perception than PSNR. Saturates at high quality (can't distinguish good from excellent).
VMAF — perceptual + ML
Netflix-developed: combines several features (visual information fidelity, motion, detail loss) via a trained model. Best correlation with human MOS scores. Per-frame 0-100 score; ~90+ is 'good'. Open source; standard in encoder eval pipelines.