Knowledge
Long-form technical writing on LiteLLM, Rust, Python performance, and production LLM operations. Three flavors: guides (hands-on walkthroughs), articles (opinions and deep dives), and issue analyses (independent technical writeups of long-standing open issues in BerriAI/LiteLLM).
Issue analyses are commentary on a project we don't maintain. Neul Labs is not affiliated with BerriAI.
Guides
Hands-on, opinionated walkthroughs you can act on.
- Guide tutorial · Apr 12, 2026
Accelerating the LiteLLM proxy with Fast LiteLLM
A production-ready guide to running the LiteLLM proxy server with Fast LiteLLM under gunicorn, Docker, and systemd — including the import-order trap that catches most teams.
- Guide getting-started · Apr 12, 2026
Installing Fast LiteLLM
How to install Fast LiteLLM, verify the Rust acceleration is active, and what to do when it isn't.
- Guide performance · Apr 12, 2026
Rate limiting LiteLLM at high cardinality
Why per-user rate limiting in pure Python eats memory at scale, and how Fast LiteLLM cuts RSS by 42× without changing your config.
Articles
Long-form thinking and arguments.
- Article opinion · Apr 10, 2026
Why LiteLLM needs Rust (in three specific places, not everywhere)
A measured argument for hybrid Python+Rust in LiteLLM's hot path — and the places where Python is still the right answer.
- Article benchmark · Apr 9, 2026
A deep dive into the Fast LiteLLM token counting benchmark
Why tokenization with tiktoken-rs is 1.5–1.7× faster on long inputs but only half as fast on short ones — the FFI overhead curve, fully explained.
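The shape of that overhead curve can be sketched with a toy cost model: a fixed per-call FFI fee plus a cheaper per-token rate for Rust, versus a pricier per-token rate for pure Python. Every constant below is invented for illustration — none are measured from Fast LiteLLM or tiktoken-rs.

```python
def python_time(n_tokens, per_token=100e-9):
    """Pure-Python tokenization cost: linear in input length."""
    return n_tokens * per_token

def rust_time(n_tokens, ffi_cost=5e-6, per_token=60e-9):
    """Rust tokenization cost: fixed FFI boundary fee plus a cheaper rate."""
    return ffi_cost + n_tokens * per_token

# Short inputs lose (the FFI fee dominates); long inputs approach the
# per-token ratio, here 100/60 ≈ 1.67x.
for n in (32, 1000, 8000):
    speedup = python_time(n) / rust_time(n)
    print(f"{n:>5} tokens: {speedup:.2f}x")
```

With these made-up constants the model reproduces the article's qualitative claim: below ~1× for tiny inputs, climbing toward a flat asymptote as input length amortizes the boundary cost.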
LiteLLM Issue Analyses
Deep technical analysis of long-standing upstream LiteLLM issues. Sorted by upstream reactions.
- LiteLLM Issue security · #24518 · 164 👍 · 116 comments
When PyPI maintainer accounts get hijacked: the LiteLLM 1.82.7/1.82.8 supply-chain compromise
A timeline and technical analysis of the March 2026 LiteLLM PyPI compromise, what the malicious payload did, and the defenses every Python team should adopt today.
- LiteLLM Issue performance · #7605 · 41 👍 · 30 comments
Why `import litellm` takes a second, and what it would take to fix it
A breakdown of LiteLLM's slow import path, the eager-registration anti-pattern that causes it, and the lazy-import refactor that would actually solve it.
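The lazy-import refactor described in that analysis is essentially PEP 562's module-level `__getattr__`: submodules resolve on first attribute access instead of eagerly at import time. A self-contained sketch using stand-in modules (the `lazypkg`/`heavy` names are invented, not LiteLLM's actual layout):

```python
import importlib
import sys
import types

# Build a stand-in package whose submodule loads only on demand.
pkg = types.ModuleType("lazypkg")
_SUBMODULES = {"heavy"}

def _pkg_getattr(name):
    """PEP 562 hook: called when normal module attribute lookup fails."""
    if name in _SUBMODULES:
        module = importlib.import_module(f"lazypkg.{name}")
        setattr(pkg, name, module)  # cache so the hook runs only once
        return module
    raise AttributeError(name)

pkg.__getattr__ = _pkg_getattr
sys.modules["lazypkg"] = pkg

# Register the heavy submodule (here just a stub standing in for an
# expensive provider module).
heavy = types.ModuleType("lazypkg.heavy")
heavy.TOKEN = "loaded"
sys.modules["lazypkg.heavy"] = heavy

# First access triggers the import; `import lazypkg` alone pays nothing.
print(pkg.heavy.TOKEN)
```

In a real package the hook lives in `__init__.py` as a top-level `__getattr__` function, and the heavy submodules stay out of `__init__`'s eager imports entirely.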
- LiteLLM Issue performance · #19921 · 15 👍 · 41 comments
Bisecting the LiteLLM 1.80 → 1.81 performance regression
A walkthrough of how to bisect a performance regression in a release like LiteLLM 1.81.x, the likely culprits in this specific case, and how to verify a fix.
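The workflow in that walkthrough rests on standard `git bisect run` mechanics. A command sketch — the tag names match the releases discussed, but `bench.sh` is a hypothetical benchmark script you would write yourself (exit 0 if performance is acceptable, non-zero if regressed):

```shell
git bisect start
git bisect bad v1.81.0       # first release where the regression appears
git bisect good v1.80.0      # last release with acceptable latency
# git bisect run drives the binary search automatically, checking out each
# candidate commit and classifying it by bench.sh's exit code.
git bisect run ./bench.sh
git bisect reset             # return to the original checkout when done
```

The benchmark script must be deterministic enough that a good commit never exits non-zero, or the bisection converges on the wrong commit.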
- LiteLLM Issue concurrency · #13251 · 14 👍 · 12 comments
The aiohttp `Unclosed client session` warnings in LiteLLM, explained
Why LiteLLM's concurrent `acompletion` calls leak aiohttp sessions, what the warning actually means, why some warnings are false alarms, and how to fix the real ones.
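The leak-and-fix pattern at the heart of that analysis can be shown without aiohttp at all. In this sketch the `Session` class is a stand-in for `aiohttp.ClientSession` (the URL is also made up): creating one per call and never awaiting `.close()` is the leak pattern behind the warning; sharing one session under `async with` is the usual fix.

```python
import asyncio

class Session:
    """Stand-in for aiohttp.ClientSession: must be closed exactly once."""
    def __init__(self):
        self.closed = False

    async def get(self, url):
        return f"response from {url}"

    async def close(self):
        self.closed = True

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc):
        await self.close()

async def main():
    # Leak pattern (don't do this): Session() inside each call, with
    # .close() never awaited -> "Unclosed client session" in real aiohttp.
    # Fix: one shared session for all concurrent calls, closed once on exit.
    async with Session() as session:
        return await asyncio.gather(
            *(session.get(f"https://api.example/{i}") for i in range(3))
        )

results = asyncio.run(main())
print(len(results))
```

With real aiohttp the shape is identical: hold one `ClientSession` per event loop, pass it to every request, and let the `async with` block guarantee the close.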
- LiteLLM Issue dependencies · #20933 · 13 👍 · 3 comments
LiteLLM proxy on Python 3.14: an uvloop ABI break post-mortem
Why `litellm[proxy]` crashes on import with Python 3.14, the asyncio API removal that caused it, and what dependency-pinning patterns would have prevented it.
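The dependency-pinning pattern that post-mortem argues for can be sketched in `pyproject.toml` terms. The version numbers below are illustrative only, not a recommendation for any specific uvloop release:

```toml
[project]
dependencies = [
    # Cap compiled extensions to interpreter versions they have wheels and
    # test coverage for; on newer interpreters the marker simply excludes
    # the dependency and the app falls back to plain asyncio.
    'uvloop>=0.21,<0.22; python_version < "3.14"',
]
```

An environment marker like this fails soft: instead of crashing at import on an untested interpreter, the package is never installed there in the first place.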
- LiteLLM Issue correctness · #8842 · 13 👍 · 20 comments
The LiteLLM Router silently drops async callbacks — here's where
A trace through `Router.acompletion` explaining why `CustomLogger` async success/failure hooks aren't called, what the right fix looks like, and what to use until then.
- LiteLLM Issue provider · #18155 · 9 👍 · 4 comments
Why LiteLLM burns extra GitHub Copilot premium requests on agent flows
A deep dive into the X-Initiator header semantics, how Copilot's premium request accounting works, and why LiteLLM's transformation layer over-bills compared to Copilot CLI and OpenCode.
- LiteLLM Issue ops · #10595 · 0 👍 · 4 comments
Why your LiteLLM Prometheus metrics flicker under multiple workers
The Prometheus multiprocess problem in a nutshell — what `prometheus_client` actually does across forked workers, why LiteLLM's metrics are unusable with `--num_workers`, and how to fix it cleanly.
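The standard `prometheus_client` multiprocess setup, sketched for a multi-worker deployment — the directory path and worker count are illustrative. Each worker writes its samples to mmap-backed files in a shared directory, and a multiprocess collector aggregates them at scrape time instead of each worker answering with only its own counters:

```shell
# prometheus_client switches into multiprocess mode when this env var is
# set before the first metrics import in any worker.
export PROMETHEUS_MULTIPROC_DIR=/var/run/litellm-metrics
mkdir -p "$PROMETHEUS_MULTIPROC_DIR"
# Clear stale sample files left over from previous runs, or dead workers'
# counters keep contributing to every scrape.
rm -f "$PROMETHEUS_MULTIPROC_DIR"/*.db
litellm --num_workers 4
```

On the scrape side, the registry must use `prometheus_client.multiprocess.MultiProcessCollector` rather than the default per-process registry; whether LiteLLM wires that up for you is exactly what the analysis examines.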