// notes
Engineering notes from the data pipeline.
Three threads we keep coming back to: where generic OOM handling fails, what makes a drop-in Rust shim actually drop in, and how to measure Axolotl throughput when memory — not compute — is the constraint.
-
Measuring Axolotl throughput on memory-bound workloads
Tokens-per-second is a model-side metric. For data-bound Axolotl pipelines you need memory-side metrics: working-set size, reader throughput, and the gap between those and the GPU's appetite. This post covers how fast-axolotl thinks about measurement, with the README benchmark numbers as the worked example.
benchmarks, throughput, axolotl
-
Drop-in Rust extensions: the integration shape that works for OSS Python tools
What does 'drop-in' actually mean when you're shipping a Rust accelerator into a living OSS Python project? Notes on the import-time shim shape, sys.modules patching, and why fast-axolotl can ride upstream Axolotl releases without a fork.
rust, python, integration
-
Why generic OOM-handling fails for >100GB training datasets
Generic Python OOM strategies — chunk-on-error, swap-spillover, retry-with-smaller-batch — were designed for inference workloads. None of them keep up with a fine-tune that has to walk 100 GB of Parquet in a single epoch. Here's why streaming reads, not retry loops, are the only honest fix.
memory, axolotl, data-pipeline