OpenAI just announced JalapeƱo , its first custom inference processor, built in partnership with Broadcom and taped out in just nine months. If the cost numbers hold, this is a structural shift in how OpenAI runs its models, and it eventually affects what builders pay to call the API. What JalapeƱo Actually Is JalapeƱo is an inference-only ASIC (application-specific integrated circuit). Not a training chip. Inference is what runs every time you call gpt-4o or o3 . That's where the compute costs actually land at scale. The chip is built on TSMC's 3nm process node, the same manufacturing tier Apple uses for its A18 Pro. It's a reticle-sized die, meaning it's about as large as a chip can physically be before yield becomes a serious problem at that node. The package includes one large compute chiplet surrounded by eight HBM (high-bandwidth memory) stacks. HBM is what you need for LLM inference: huge memory bandwidth, physically close to the compute. GPUs do this too, b...
GitHub Copilot switched to usage-based billing on June 1, 2026. If your team didn't notice until the invoice arrived, you're not alone. This is the biggest change to Copilot's pricing model since launch, and the developer community's response was clear: over 900 downvotes and 400 comments on GitHub's own announcement thread in the first week. Here's what actually changed, who got hurt, and what to do before next month's bill lands. What the Old Model Was The previous system used Premium Request Units (PRUs). Your plan came with a fixed monthly allotment. When you burned through it, Copilot didn't cut you off. It quietly fell back to a lighter base model. You kept working. You just didn't know you'd dropped to a less capable model. That was a reasonable trade-off for predictability. That safety net is gone. The New Model: AI Credits Every plan now ships with a monthly AI Credits allowance. One credit costs $0.01. The plan prices didn't...