Writing
28 posts on machine learning, AI agents, and software.
2026
The Resumption Benchmark v0: Measuring Whether the Next Agent Continues Correctly
Before I Run the Next Three Experiments, I'm Pre-Registering Them
Brilliant but Amnesiac: The Coherence Cliff in Long-Horizon AI Agents
Google 2026–2056: A 30-Year Game-Theory Forecast
Organizations as Code: The Company Becomes a Repo
I Got 91.55% on LongMemEval. Here's Why I'm Building a Different Benchmark.
Benchmarking Brain on LongMemEval: 81.6% and the Product Lesson
How I Built a Local-First Second Brain for Claude Code, OpenClaw, QMD, and MCP
Context Engineering Is All I've Been Writing About
I Built an AI Skill That Started Improving Itself
Claude Code vs Gemini CLI: Two Philosophies of the Agentic Terminal
Your MCP Servers Are Costing You 10 Seconds Per Session
The Terminal Was the First Agent Harness
The 14K Token Debt: System Prompt Architecture for Agentic AI
My AI Agent's Memory Paid for Itself: How 590 Sessions Made Claude Code 1.8x Cheaper
2025
2024
The Dance of AI Agents: How Multi-Agent Systems Actually Work
Building a Salesforce Slack Assistant: Lessons in AI-Powered Enterprise Tools
Building AI-first Multiplayer Writing Tool for Creative Teams
How Midjourney's Discord Became a Phenomenon: A Lesson in Community-Driven Growth
OpenAI API vs Azure OpenAI: Choosing the Right Platform for Your AI Projects
Universal Power of Computation
2023
2022