The compact codebase documents a four‑hour workflow for training a 561M‑parameter chat model on an 8×H100 node.