This is the first issue of Weekly AI Signals. The goal is simple: each Monday, a short, dense read on what actually changed in AI — across research, products, and infrastructure — written so an executive can skim the top, a builder can lift a takeaway, and a practitioner can dive deep.
If you find this useful, the best thing you can do is forward it to one person who'd appreciate it.
For builders
OpenAI
First-class durable execution, structured handoffs between agents, and tracing that maps cleanly to OpenTelemetry. If you're building multi-step agents, the ergonomics here are worth a serious look — especially the handoff primitive, which removes a lot of glue code.
Latent Space
Walks through building task-completion evals for agents that use tools, with concrete code. The framing — score the trajectory, not just the final answer — is the right mental model for anyone shipping agents to production.
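To make "score the trajectory" concrete, here is a minimal Python sketch. It is not the code from the post: the ToolCall record, the three components, and the 0.5 / 0.3 / 0.2 weights are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ToolCall:
    """One step in an agent trajectory (hypothetical record shape)."""
    tool: str
    ok: bool  # True if the tool call succeeded


def score_trajectory(
    steps: List[ToolCall],
    required_tools: Iterable[str],
    final_correct: bool,
    max_steps: int = 10,
) -> float:
    """Return a score in [0, 1] that rewards the path as well as the outcome.

    The weights below are arbitrary illustrative choices, not a standard.
    """
    required = set(required_tools)
    succeeded = {s.tool for s in steps if s.ok}

    # Fraction of the required tools that were called successfully.
    coverage = len(succeeded & required) / len(required) if required else 1.0

    # Fraction of tool calls that failed along the way.
    error_rate = sum(1 for s in steps if not s.ok) / len(steps) if steps else 0.0

    # Did the agent stay within its step budget?
    within_budget = 1.0 if len(steps) <= max_steps else 0.0

    outcome = 1.0 if final_correct else 0.0
    path_quality = coverage * (1.0 - error_rate)
    return 0.5 * outcome + 0.3 * path_quality + 0.2 * within_budget


# Two runs with the same correct final answer but different paths.
clean = score_trajectory(
    [ToolCall("search", True), ToolCall("calculator", True)],
    required_tools=["search", "calculator"],
    final_correct=True,
)
messy = score_trajectory(
    [ToolCall("search", False), ToolCall("search", True)],
    required_tools=["search", "calculator"],
    final_correct=True,
)
print(clean > messy)
```

A final-answer-only eval would rate both runs the same. The trajectory score separates them because the second run had a failed call and skipped a required tool.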
Hugging Face
Closes most of the gap with the leading closed reasoning models on math and code benchmarks, ships under an MIT license, and runs on a single H100 with quantization. The cost-per-token math now favors self-hosting for many reasoning workloads.
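If you want to run the cost-per-token comparison for your own workload, the arithmetic can be sketched as below. Every number in the example is a placeholder, not a real price or a measured throughput; plug in your own GPU rate, sustained tokens per second, and API price.

```python
def self_hosted_cost_per_million(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    utilization: float,
) -> float:
    """USD per one million generated tokens on a GPU you rent or own.

    utilization is the fraction of each hour the GPU spends serving traffic.
    """
    if gpu_hourly_usd < 0 or tokens_per_second <= 0:
        raise ValueError("cost must be >= 0 and throughput must be > 0")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


def break_even_utilization(
    gpu_hourly_usd: float,
    tokens_per_second: float,
    api_usd_per_million: float,
) -> float:
    """Utilization above which self-hosting is cheaper than the API price.

    A result greater than 1 means self-hosting never breaks even at this
    throughput.
    """
    if api_usd_per_million <= 0:
        raise ValueError("API price must be > 0")
    full_load_cost = self_hosted_cost_per_million(
        gpu_hourly_usd, tokens_per_second, 1.0
    )
    return full_load_cost / api_usd_per_million


# Placeholder inputs (assumptions for illustration only):
# 3.00 USD per GPU-hour, 1500 tokens/s with batching, 2.00 USD per 1M API tokens.
cost = self_hosted_cost_per_million(3.00, 1500, utilization=0.5)
threshold = break_even_utilization(3.00, 1500, api_usd_per_million=2.00)
print(f"self-hosted: {cost:.2f} USD per 1M tokens")
print(f"break-even utilization: {threshold:.0%}")
```

The sketch ignores engineering time, idle capacity during off-hours, and input-token pricing, so treat the result as a first-pass filter and not a full total-cost estimate.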
Cloudflare
Durable, step-based execution at the edge, with first-class support for long-running LLM calls and human-in-the-loop steps. Worth comparing against Inngest and Temporal if you're picking infra for an agent product this quarter.
Anthropic
Engineering post on the metrics that matter once an agent is live: task completion rate, intervention rate, time-to-recovery from a stuck state. Pragmatic and refreshingly honest about how often agents get stuck.
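As a rough illustration of how the three metrics could be computed from run logs, here is a small Python sketch. The Run record and the exact definitions are assumptions, not taken from the post. In particular, intervention rate is defined here as the share of runs that needed at least one human intervention.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Run:
    """One production agent run (hypothetical log record)."""
    completed: bool
    human_interventions: int
    stuck_at: Optional[float] = None      # seconds since run start
    recovered_at: Optional[float] = None  # seconds since run start


def agent_metrics(runs: List[Run]) -> Dict[str, Optional[float]]:
    """Aggregate completion, intervention, and recovery metrics over runs."""
    if not runs:
        raise ValueError("need at least one run")
    n = len(runs)

    # Only runs that got stuck AND recovered contribute a recovery time.
    recoveries = [
        r.recovered_at - r.stuck_at
        for r in runs
        if r.stuck_at is not None and r.recovered_at is not None
    ]
    return {
        "task_completion_rate": sum(r.completed for r in runs) / n,
        "intervention_rate": sum(r.human_interventions > 0 for r in runs) / n,
        "mean_time_to_recovery_s": mean(recoveries) if recoveries else None,
    }


# Made-up example data, for illustration only.
runs = [
    Run(completed=True, human_interventions=0),
    Run(completed=True, human_interventions=1, stuck_at=10.0, recovered_at=40.0),
    Run(completed=False, human_interventions=2, stuck_at=5.0),
    Run(completed=True, human_interventions=0, stuck_at=20.0, recovered_at=30.0),
]
print(agent_metrics(runs))
```

Runs that got stuck and never recovered are excluded from the recovery average here, so it is worth tracking their count separately; otherwise the mean looks better than reality.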