Tag

#python

23 pieces of content

I Evaluated Fine-Tuning Across 3 Projects — None of Them Needed It

Three projects, three evaluations, zero cases where fine-tuning was justified. Here's the decision framework, the cost math, and why simpler approaches won every time.

Mar 2026

article

How ReAct Agents Recover from Their Own Mistakes

ReAct agents recover from their own mistakes — not because the model is clever, but because of how tools return errors and how the loop is structured. Here's what that looks like in practice.

Mar 2026

article

Letting the Model Pick Its Own Tools: How Tool Use Inverts Control Flow

The model autonomously combined keyword search and vector search in the optimal sequence — without being told to. Then I ran experiments to measure what vague descriptions, over-calling, and temperature actually do to tool selection.

Mar 2026

article

When Embeddings Fail: Why Vector Search Can't Judge Capability

I added vector search to the screening pipeline and watched it rank a junior frontend developer above a Principal Engineer who processed 1B+ events/day. The embedding model matched vocabulary, not capability.

Mar 2026

article

My LLM Pipeline Passed Every Manual Check — Then 36 Tests Proved Otherwise

Five manual runs looked fine. Then 36 automated tests exposed non-deterministic sourcing, biased scoring, and a confidence threshold that fired randomly.

Mar 2026

article

Auditing My AI Systems: Patterns, Tradeoffs, and Gaps I Was Working Around

I catalogued every AI decision across three production systems and found a consistent pattern — along with five gaps I'd been working around instead of solving.

Mar 2026

case-study

Building a Local PII Privacy Gate for AI Coding Assistants

How I built a hybrid regex + on-device LLM scanner that blocks prompts containing PII before they reach cloud APIs — zero data leaves the device.

Mar 2026

article

Regex vs On-Device LLM for PII Detection: A 25-Case Benchmark

A comprehensive benchmark comparing regex pattern matching against Apple's on-device Foundation Models for PII detection — 52% F1 vs 100% F1, and why binary classification beats extraction.

Mar 2026

experiment

Privacy-First Git Commit Message Generator

An on-device tool that analyzes git diffs and generates structured conventional commit messages — zero data leaves your machine.

Mar 2026

note

Session Reuse Eliminates Hangs in Batch LLM Processing

Mar 2026

note

Building a Live Terminal Dashboard for AI Coding Sessions

Mar 2026

article

How to Build an LLM-Powered UI Generator

A technical deep dive into building a system where users type plain English and get live HTML mockups that match a specific design system — grounding, token budgets, streaming, and security.

Feb 2026

note

Reducing LLM Token Usage by 44% with Selective Context Injection

Feb 2026

article

Building FastAPI Services on Kubernetes

How I structure FastAPI applications for Kubernetes deployment — from project layout to health checks, pod templates, and CI/CD.

Mar 2025

note

Always Profile Before You Optimize

Feb 2025

case-study

Optimizing a High-Throughput FastAPI Service

How I achieved 50% lower latency and 40% higher throughput on a critical FastAPI service through async pipelines, SQL optimization, and strategic caching.

Feb 2025

article

Async Python in Production: What They Don't Tell You

Async improves throughput but introduces debugging complexity, connection pool pitfalls, and error handling surprises. Lessons from running async APIs at scale.

Jan 2025

case-study

Building a Modular Rule Engine for Credit Decisioning

How I designed a plugin-based rule engine that automated credit decisions, reduced manual review by 30%, and made rule changes deployable in hours instead of sprints.

Jan 2025

note

Correlation IDs Go in Middleware, Not App Code

Dec 2024

experiment

Kubernetes Pod Template Generator

A tool for generating production-ready Kubernetes pod templates with best practices baked in.

Nov 2024

note

Structured Logging Is a Library, Not a Guideline

Oct 2024

experiment

Restaurant Insights — NLP-Powered Dining Recommendations

A tool that finds nearby restaurants, analyzes reviews with NLP, and checks for parking — built in 3 days with APIs, then rebuilt in 3 minutes with AI prompts.

Sep 2024

experiment

MindInChess — Chess Analysis App

A chess analysis app using Stockfish for blunder detection, accuracy scoring, and color-coded move insights with PGN upload and Chess.com/Lichess import support.

Aug 2024