Engineering Notes & Write-ups
A technical walkthrough of building sivabuilds.com — the stack decisions, content pipeline, knowledge graph architecture, and the gotchas worth documenting.
A comprehensive benchmark comparing regex pattern matching against Apple's on-device Foundation Models for PII detection: 52% F1 for regex vs 100% F1 for the on-device model, and why binary classification beats extraction.
Benchmarking Apple's on-device Foundation Models against cloud LLMs across commit message generation, code review, and text classification — latency, quality, cost, and privacy tradeoffs.
A technical deep dive into building a system where users type plain English and get live HTML mockups that match a specific design system — grounding, token budgets, streaming, and security.
How AI workflows evolve from simple prompts into structured systems of skills, agents, and collaborative teams — patterns observed while experimenting with Claude Code.
How I structure FastAPI applications for Kubernetes deployment — from project layout to health checks, pod templates, and CI/CD.
Async improves throughput but introduces debugging complexity, connection-pool pitfalls, and error-handling surprises. Lessons from running async APIs at scale.
CPU utilization is the wrong signal for scaling API services. Here's why a Horizontal Pod Autoscaler (HPA) driven by request latency produces better scaling behavior, and how to implement it.