Posts

Showing posts from March, 2026

Building Private AI: How to Keep Your Data Local with OpenClaw

Cloud AI means your data goes to cloud providers. What if it didn't have to? Last week, I watched a developer paste an entire customer database into ChatGPT to "analyze patterns." The data left their computer, went to OpenAI's servers, got processed, and theoretically got deleted. Theoretically. That's not acceptable for most businesses.

The Problem With Cloud AI: When you use ChatGPT, Claude, or any cloud API, your data leaves your control; it gets transmitted over the internet; a third-party company stores and processes it; they might train on it (check the terms); it's subject to their privacy policies and government data requests; and you lose all compliance guarantees. For casual use? Maybe fine. For healthcare, finance, legal, or sensitive business data? Absolutely not.

Why Private AI is Actually Better: Local AI isn't a step backward. It's a step forward.

Security: Your data never leaves your servers. Period. No internet tr...

Building Trustworthy AI: Beyond Benchmarks

Last month I was evaluating three frontier models for a client workflow at Publicis Sapient. One of them scored highest on every benchmark we checked. It was also the one that fell apart in production within two weeks. That experience pushed me to write this down, because I think the industry has a benchmark problem it isn't talking about honestly enough.

Benchmarks Are Saturated and Getting Gamed: MMLU and MMLU-Pro, two of the most cited evaluation benchmarks, are now functionally saturated above 88% for frontier models. The score differences between the top models are statistically meaningless at that level. Meanwhile, data contamination and annotation error rates above 50% undermine what these scores even measure in the first place.

It gets worse. Most teams building internal benchmarks overestimate how well their models perform by 30% or more, because they test on clean inputs, cooperative conditions, and scenarios where the model's known strengths are on display. Tha...

From Single API to Network Intelligence: How Request Scout Changed My Debugging Workflow

A story about building smarter dev tools with your own AI.

The Moment It Clicked: Last week, I was debugging a performance issue on a client's site. I had Chrome DevTools open, watching the Network tab with hundreds of requests flying by. I had Gemini in another window, pasting individual API responses, asking "What's this endpoint doing?" and "Are these headers correct?" Then it hit me: why am I talking to Gemini about one API at a time, when I should be talking to it about my entire network? I opened my notebook and sketched something radical: what if I could ask my network tab questions directly? "Show me all failed API calls." "Which domains took longest?" "Find any requests with leaked auth tokens." "What's the pattern here?" Three days later, Request Scout was born.

The Problem Nobody Talks About: Network debugging is fragmented. 🕵️ You stare at the Network tab (human-scale = ~100 requests, max) 🔍...

250,000 AI Agent Instances Exposed on the Internet — Is Yours One of Them?

If You're Running OpenClaw, You May Want to Read This: A public watchboard has surfaced listing over 250,000 OpenClaw instances that are directly reachable from the internet. Some of these instances have leaked credentials. Many are running on infrastructure already flagged for known CVEs and threat actor activity. This isn't theoretical. It's happening right now. You can check the exposure list yourself at openclaw.allegro.earth.

Why This Is a Big Deal: OpenClaw is a powerful AI agent framework. That power comes with serious responsibility. A typical OpenClaw deployment runs with personal API keys (OpenAI, Anthropic, Google, cloud provider credentials); broad system permissions (file access, shell execution, network requests); autonomous execution capabilities (the agent can act without human approval); and complex codebases (large attack surfaces that haven't been fully audited). When one of these instances is publicly reachable without authentication...