Tag: #performance
6 pieces of content
Session Reuse Eliminates Hangs in Batch LLM Processing
Reducing LLM Token Usage by 44% with Selective Context Injection
Always Profile Before You Optimize
Optimizing a High-Throughput FastAPI Service
How I achieved 50% lower latency and 40% higher throughput on a critical FastAPI service through async pipelines, SQL optimization, and strategic caching.
Async Python in Production: What They Don't Tell You
Async improves throughput but introduces debugging complexity, connection-pool pitfalls, and error-handling surprises. Lessons from running async APIs at scale.
Why CPU-Based Autoscaling Fails for API Services
CPU utilization is the wrong signal for scaling API services. Here's why request-latency-based HPA produces better scaling behavior and how to implement it.