fast-axolotl notes

Engineering notes from the fast-axolotl maintainers: memory-bound LLM data pipelines, drop-in Rust shims for Python ML tools, and throughput measurement on Axolotl.
https://fast-axolotl.neullabs.com/
Language: en-us

Measuring Axolotl throughput on memory-bound workloads
https://fast-axolotl.neullabs.com/blog/measuring-axolotl-throughput-memory-bound/
Tokens-per-second is a model-side metric. For data-bound Axolotl pipelines you need a memory-side metric: working-set, reader throughput, and the gap between them and the GPU's appetite. How fast-axolotl thinks about measurement, with the README benchmark numbers as the worked example.
Wed, 06 May 2026 00:00:00 GMT

Drop-in Rust extensions: the integration shape that works for OSS Python tools
https://fast-axolotl.neullabs.com/blog/drop-in-rust-extensions-integration-shape/
What does 'drop-in' actually mean when you're shipping a Rust accelerator into a living OSS Python project? Notes on the import-time shim shape, sys.modules patching, and why fast-axolotl can ride upstream Axolotl releases without a fork.
Wed, 22 Apr 2026 00:00:00 GMT

Why generic OOM-handling fails for >100GB training datasets
https://fast-axolotl.neullabs.com/blog/oom-fails-for-large-training-datasets/
Generic Python OOM strategies (chunk-on-error, swap-spillover, retry-with-smaller-batch) were designed for inference workloads. None of them keep up with a fine-tune that has to walk 100 GB of Parquet in a single epoch. Here's why streaming reads, not retry loops, are the only honest fix.
Wed, 08 Apr 2026 00:00:00 GMT