#llm
11 pieces of content
I Evaluated Fine-Tuning Across 3 Projects — None of Them Needed It
Three projects, three evaluations, zero cases where fine-tuning was justified. Here's the decision framework, the cost math, and why simpler approaches won every time.
How ReAct Agents Recover from Their Own Mistakes
ReAct agents recover from their own mistakes — not because the model is clever, but because of how tools return errors and how the loop is structured. Here's what that looks like in practice.
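The loop structure this piece describes can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not the article's actual code; the tool, the stand-in model, and all names are made up. The key move is that tool errors are caught and returned as observations instead of crashing the loop, so the model sees its mistake and can retry.

```python
# Minimal sketch of a ReAct-style loop where tool errors become
# observations instead of exceptions, so the model can self-correct.
# All names here are illustrative, not from the article.

def search_tool(query: str) -> str:
    """A toy tool that rejects empty queries."""
    if not query:
        raise ValueError("empty query")
    return f"results for {query!r}"

def run_tool(tool, arg: str) -> str:
    # Design choice: never let a tool crash the loop. The error text
    # is fed back to the model as the next observation.
    try:
        return tool(arg)
    except Exception as e:
        return f"ERROR: {type(e).__name__}: {e}"

def fake_model(history: list[str]) -> str:
    # Stand-in for an LLM: after seeing an error observation,
    # it "recovers" by issuing a corrected query.
    if history and history[-1].startswith("ERROR"):
        return "retry query"
    return ""  # first attempt: a bad (empty) query

def react_loop(max_steps: int = 3) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        action = fake_model(history)
        observation = run_tool(search_tool, action)
        history.append(observation)
        if not observation.startswith("ERROR"):
            return observation  # success: stop looping
    return history[-1]
```

With a real model in place of `fake_model`, the same structure holds: recovery comes from the error-as-observation contract and the loop, not from the model being clever.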
Letting the Model Pick Its Own Tools: How Tool Use Inverts Control Flow
The model autonomously combined keyword search and vector search in the optimal sequence — without being told to. Then I ran experiments to measure what vague descriptions, over-calling, and temperature actually do to tool selection.
My LLM Pipeline Passed Every Manual Check — Then 36 Tests Proved Otherwise
Five manual runs looked fine. Then 36 automated tests exposed non-deterministic sourcing, biased scoring, and a confidence threshold that fired randomly.
Building a Local PII Privacy Gate for AI Coding Assistants
How I built a hybrid regex + on-device LLM scanner that blocks prompts containing PII before they reach cloud APIs — zero data leaves the device.
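The hybrid shape described here can be sketched as a two-stage gate: a cheap regex pass first, then an on-device model as a second opinion on anything the patterns miss. The patterns and the stub classifier below are my assumptions for illustration, not the article's actual implementation.

```python
import re

# Two-stage PII gate sketch. Patterns and the stub classifier are
# illustrative assumptions, not the article's real implementation.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def llm_says_pii(prompt: str) -> bool:
    # Placeholder for the on-device model's binary yes/no judgment
    # on fuzzier PII that regex cannot catch.
    return "my home address" in prompt.lower()

def gate(prompt: str) -> bool:
    """Return True only if the prompt is safe to send to a cloud API."""
    if any(p.search(prompt) for p in PII_PATTERNS):
        return False  # regex hit: block before any network call
    if llm_says_pii(prompt):
        return False  # second stage: on-device model flags it
    return True
```

Because both stages run locally, a blocked prompt never leaves the device; only prompts that pass both checks reach the cloud.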
Regex vs On-Device LLM for PII Detection: A 25-Case Benchmark
A 25-case benchmark comparing regex pattern matching against Apple's on-device Foundation Models for PII detection — 52% vs 100% F1, and why binary classification beats extraction.
On-Device vs Cloud LLM: A Practical Benchmark
Benchmarking Apple's on-device Foundation Models against cloud LLMs across commit message generation, code review, and text classification — latency, quality, cost, and privacy tradeoffs.
Privacy-First Git Commit Message Generator
An on-device tool that analyzes git diffs and generates structured conventional commit messages — zero data leaves your machine.
Session Reuse Eliminates Hangs in Batch LLM Processing
How to Build an LLM-Powered UI Generator
A technical deep dive into building a system where users type plain English and get live HTML mockups that match a specific design system — grounding, token budgets, streaming, and security.
Reducing LLM Token Usage by 44% with Selective Context Injection