<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>fast-axolotl notes</title><description>Engineering notes from the fast-axolotl maintainers: memory-bound LLM data pipelines, drop-in Rust shims for Python ML tools, and throughput measurement on Axolotl.</description><link>https://fast-axolotl.neullabs.com/</link><language>en-us</language><item><title>Measuring Axolotl throughput on memory-bound workloads</title><link>https://fast-axolotl.neullabs.com/blog/measuring-axolotl-throughput-memory-bound/</link><guid isPermaLink="true">https://fast-axolotl.neullabs.com/blog/measuring-axolotl-throughput-memory-bound/</guid><description>Tokens-per-second is a model-side metric. For data-bound Axolotl pipelines you need a memory-side metric: working-set, reader throughput, and the gap between them and the GPU&apos;s appetite. How fast-axolotl thinks about measurement, with the README benchmark numbers as the worked example.</description><pubDate>Wed, 06 May 2026 00:00:00 GMT</pubDate></item><item><title>Drop-in Rust extensions: the integration shape that works for OSS Python tools</title><link>https://fast-axolotl.neullabs.com/blog/drop-in-rust-extensions-integration-shape/</link><guid isPermaLink="true">https://fast-axolotl.neullabs.com/blog/drop-in-rust-extensions-integration-shape/</guid><description>What does &apos;drop-in&apos; actually mean when you&apos;re shipping a Rust accelerator into a living OSS Python project? Notes on the import-time shim shape, sys.modules patching, and why fast-axolotl can ride upstream Axolotl releases without a fork.</description><pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate></item><item><title>Why generic OOM-handling fails for &gt;100GB training datasets</title><link>https://fast-axolotl.neullabs.com/blog/oom-fails-for-large-training-datasets/</link><guid isPermaLink="true">https://fast-axolotl.neullabs.com/blog/oom-fails-for-large-training-datasets/</guid><description>Generic Python OOM strategies — chunk-on-error, swap-spillover, retry-with-smaller-batch — were designed for inference workloads. None of them keep up with a fine-tune that has to walk 100 GB of Parquet in a single epoch. Here&apos;s why streaming reads, not retry loops, are the only honest fix.</description><pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate></item></channel></rss>