// notes

Engineering notes from the data pipeline.

Three threads we keep coming back to: where generic OOM handling fails, what makes a drop-in Rust shim actually drop in, and how to measure Axolotl throughput when memory — not compute — is the constraint.

Measuring Axolotl throughput on memory-bound workloads
May 6, 2026

Tokens-per-second is a model-side metric. For data-bound Axolotl pipelines you need memory-side metrics: working-set size, reader throughput, and the gap between those and the GPU's appetite. This post covers how fast-axolotl thinks about measurement, with the README benchmark numbers as the worked example.

benchmarks, throughput, axolotl

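To make the memory-side idea concrete, here is a minimal sketch that measures reader throughput and peak traced memory for any iterable of byte batches. The names `measure_reader` and `fake_reader` are hypothetical and not part of fast-axolotl or Axolotl. `tracemalloc` only sees allocations made through Python's allocators, so native buffers (for example from a Rust or Arrow reader) would need a process-level measure such as RSS instead.

```python
import time
import tracemalloc


def measure_reader(batches):
    """Consume an iterable of bytes-like batches.

    Returns total bytes read, elapsed seconds, bytes per second,
    and the peak memory traced by tracemalloc while reading.
    """
    tracemalloc.start()
    start = time.perf_counter()
    total_bytes = 0
    for batch in batches:
        total_bytes += len(batch)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    rate = total_bytes / elapsed if elapsed > 0 else float("inf")
    return {
        "bytes": total_bytes,
        "seconds": elapsed,
        "bytes_per_second": rate,
        "peak_traced_bytes": peak,
    }


def fake_reader(n_batches, batch_bytes):
    """Hypothetical stand-in for a dataset reader: yields zero-filled batches."""
    for _ in range(n_batches):
        yield bytes(batch_bytes)


stats = measure_reader(fake_reader(8, 1 << 20))
print(stats["bytes"], stats["peak_traced_bytes"])
```

Comparing `bytes_per_second` against the rate at which the training loop consumes data is one way to see whether the reader or the GPU is the bottleneck.
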
Drop-in Rust extensions: the integration shape that works for OSS Python tools
Apr 22, 2026

What does 'drop-in' actually mean when you're shipping a Rust accelerator into a living OSS Python project? Notes on the import-time shim shape, sys.modules patching, and why fast-axolotl can ride upstream Axolotl releases without a fork.

rust, python, integration

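As a rough illustration of sys.modules patching, the sketch below swaps one function in an already-registered module for a faster replacement while keeping every other attribute intact. The module name `upstream_loader`, the functions `load_rows` and `fast_load_rows`, and the helper `install_shim` are all hypothetical and do not reflect the actual names used by fast-axolotl or Axolotl.

```python
import sys
import types


def install_shim(target_name, replacements):
    """Register a shim module under target_name in sys.modules.

    The shim copies every attribute of the original module (if it is
    already imported) and then overrides the names in `replacements`.
    Returns the original module, or None if it was not imported yet.
    """
    original = sys.modules.get(target_name)
    shim = types.ModuleType(target_name)
    if original is not None:
        shim.__dict__.update(original.__dict__)
    shim.__dict__.update(replacements)
    sys.modules[target_name] = shim
    return original


# Hypothetical stand-in for an upstream pure-Python module.
upstream = types.ModuleType("upstream_loader")
upstream.load_rows = lambda path: ["slow:" + path]
upstream.VERSION = "1.0"
sys.modules["upstream_loader"] = upstream


def fast_load_rows(path):
    """Hypothetical accelerated replacement with the same signature."""
    return ["fast:" + path]


install_shim("upstream_loader", {"load_rows": fast_load_rows})

# Any import that runs after the shim is installed resolves to the shim.
import upstream_loader

print(upstream_loader.load_rows("a.parquet"))
```

Code that ran `from upstream_loader import load_rows` before the shim was installed keeps its reference to the old function, which is why the patch has to happen at import time, before the host project imports the target module.
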
Why generic OOM-handling fails for >100GB training datasets
Apr 8, 2026

Generic Python OOM strategies — chunk-on-error, swap-spillover, retry-with-smaller-batch — were designed for inference workloads. None of them keep up with a fine-tune that has to walk 100 GB of Parquet in a single epoch. Here's why streaming reads, not retry loops, are the only honest fix.

memory, axolotl, data-pipeline
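The contrast between streaming reads and retry loops can be sketched with the standard library alone. This example uses a plain text file as a stand-in for Parquet row groups, which is an assumption made to keep it self-contained; `stream_batches` is a hypothetical helper, not a fast-axolotl API. Only one batch is held in memory at a time, so peak usage depends on `batch_size` rather than on file size.

```python
import itertools
import os
import tempfile


def stream_batches(path, batch_size):
    """Yield lists of at most batch_size lines from a text file.

    The file is read lazily, so only the current batch is in memory.
    """
    with open(path, "r", encoding="utf-8") as handle:
        while True:
            batch = list(itertools.islice(handle, batch_size))
            if not batch:
                return
            yield [line.rstrip("\n") for line in batch]


# Build a small demo file; a real dataset would be far larger.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    for i in range(10):
        tmp.write(f"row-{i}\n")
    demo_path = tmp.name

batch_sizes = [len(batch) for batch in stream_batches(demo_path, batch_size=4)]
os.remove(demo_path)
print(batch_sizes)  # → [4, 4, 2]
```

A retry-with-smaller-batch loop, by comparison, only discovers the right batch size after an allocation has already failed, and it pays for every failed attempt.
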
