Microsoft logoMicrosoft
Coding·60 minMembers

SFT Sample Packing

Members only

Implement supervised fine-tuning sample packing: greedily pack variable-length (prompt, answer) pairs into fixed-length training sequences, emit per-token loss masks aligned to answer positions only, ...

MLE
pytorch
data-engineering
batching
hard
Frequency
Single report · New
Last asked
2026-04-19
Stage
onsite-coding

Log in to continue reading the full content