The coding agent wars entered a new phase this week as survey data crowned Claude Code the most-used AI coding tool, while Cursor 2.0 and Windsurf Wave 13 shipped multi-agent and model-comparison features. NVIDIA GTC 2026 kicks off today with the Vera Rubin platform reveal, DeepSeek V4 details continue to surface, and a growing wave of AI-linked layoffs (45K jobs in 2026 so far) is forcing the industry to confront the gap between “AI as productivity tool” and “AI as headcount replacement.” Meanwhile, the open-source ecosystem is racing to build the connective tissue for agents: skills catalogs, memory plugins, hook systems, and extraction frameworks.

Highlight of the week

Claude Code takes the crown, but the real story is the toolchain

The Pragmatic Engineer’s March 2026 AI tooling survey (906 respondents) delivered a headline number: Claude Code is now the most-loved AI coding tool at 46%, far ahead of Cursor (19%) and GitHub Copilot (9%). In just eight months since its May 2025 launch, Claude Code has reached the adoption levels Copilot held three years ago. But the more interesting signal is what practitioners are actually doing. 95% of respondents use AI tools weekly, 56% report doing 70%+ of their engineering work with AI, and 55% regularly use AI agents. Most engineers chain several tools rather than picking one: Copilot for inline completions, Cursor or Windsurf for multi-file agentic work, and Claude Code for terminal automation and git workflows. Staff+ engineers lead agent adoption at 63.5%, which points to a real shift in senior engineering practice rather than hype. The survey also found that Anthropic’s Opus and Sonnet models dominate coding tasks by a wide margin, with more mentions than all other models combined. At small companies, 75% use Claude Code; large enterprises default to Copilot, likely driven by procurement inertia and Microsoft’s enterprise marketing.

Models and research

**DeepSeek V4 remains half-launched.** The trillion-parameter MoE model (roughly 32B active parameters per token) has been trickling out details since early March. It features a tiered KV-cache architecture that cuts memory needs by about 40%, sparse FP8 decoding for a 1.8x inference speedup, and 1M+ token context. It is natively multimodal (text, image, video, audio) and optimized for Huawei Ascend chips. As of March 16, the full public release has not materialized, though Chinese tech media reported a website update with expanded context handling that some are calling “V4 Lite.” The delay, intentional or not, is keeping attention on DeepSeek while competitors ship.

**Long-context is the new baseline.** Claude Opus 4.6 sits at roughly 1M tokens, DeepSeek V4 targets 1M+, and GPT-5.3 offers around 400K. This enables workflows that were impractical a year ago: entire-repo analysis, book-length document handling, and large multi-document RAG pipelines in a single session. The practical impact is most visible in coding agents, where whole-repo context via embeddings plus large context windows has replaced single-file prompts as the standard.

**“Humanity’s Last Exam” exposes reasoning gaps.** A Texas A&M-led group published what they call the hardest AI benchmark in Nature: 2,500 questions spanning mathematics, humanities, natural sciences, and ancient languages, sourced from nearly 1,000 researchers. GPT-4o scored 2.7%, Claude 3.5 Sonnet hit 4.1%, and o1 reached 8%; GPT-5 scored around 25% after the benchmark was published. The results are a useful corrective to benchmark-driven hype: frontier models are powerful coding and reasoning tools, but robust general reasoning remains unsolved.

**Reasoning efficiency over raw size.** The frontier-model conversation has shifted from parameter counts to output efficiency. Reports indicate GPT-5.x and Claude 4.6 variants are solving complex tasks with far fewer output tokens than their predecessors, prioritizing reliability and reduced hallucination over brute-force scaling.
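As a back-of-the-envelope check on those whole-repo workflows, you can estimate whether a codebase fits a given context window before shipping it to a model. The sketch below is a rough heuristic (about four bytes of source text per token), not a real tokenizer; the extension list and headroom figure are arbitrary choices of this example, not anything a vendor specifies.

```python
import os

# Assumption: ~4 bytes of source text per token. Real counts vary by
# model and by programming language; treat this as an order-of-magnitude guide.
BYTES_PER_TOKEN = 4

def estimate_repo_tokens(root, exts=(".py", ".ts", ".go", ".md")):
    """Walk a repo and roughly estimate total tokens for matching files."""
    total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip directories that would dominate the count without adding signal.
        dirnames[:] = [d for d in dirnames if d not in {".git", "node_modules"}]
        for name in filenames:
            if name.endswith(exts):
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return total_bytes // BYTES_PER_TOKEN

def fits_in_context(root, window=1_000_000, reserve=100_000):
    """Leave headroom for the system prompt and the model's own output."""
    return estimate_repo_tokens(root) <= window - reserve
```

A repo that comes in well under the window can be pasted wholesale; one that doesn't still needs embeddings-based retrieval to select relevant files.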

Coding agents and dev tools

**Cursor 2.0: parallel agents at scale.** Cursor’s redesign ships up to eight parallel agents on the same codebase, each isolated via git worktrees or remote machines. The new proprietary “Composer” model runs 4x faster than comparable alternatives. Internally, 35% of Cursor’s merged PRs are now created by autonomous agents in cloud VMs. Version 2.2 adds multi-agent judging, where Cursor automatically evaluates all parallel runs and recommends the best solution.

**Windsurf Wave 13 brings Arena Mode.** Windsurf’s latest release pits two models against each other side-by-side with hidden identities, letting you vote on which performs better for your specific codebase and tasks. It acknowledges something obvious that most tools ignore: benchmark rankings do not predict real-world performance on your code. The companion Plan Mode builds explicit task plans before editing, with a “megaplan” option for thorough interactive planning. Wave 13 also adds first-class multi-agent sessions, git worktrees, and side-by-side Cascade panes.

**Google Antigravity: free, for now.** Google’s agent-first IDE launched in public preview with Claude Opus 4.6 and Gemini 3 Pro at no cost, scoring 76.2% on SWE-bench with five parallel agents. But the sustainability question is already answering itself: reports of multi-day account lockouts, a 92% free-tier quota cut, and a $250/month paid plan suggest the free tier was a land grab, not a business model.

**OpenAI Skills catalog for Codex.** OpenAI launched an official Skills repository (8.2K stars) with 35+ curated skills organized in three tiers: system (bundled automatically), curated (install by name), and experimental (community contributions). This is OpenAI’s response to the grassroots plugin ecosystem forming around Claude Code. A separate community-built library (1.7K stars) offers 66 specialized skills for full-stack developers using Claude Code.

**The compound engineering pattern goes mainstream.** Every’s compound engineering plugin for Claude Code (8.7K stars) implements multi-agent orchestration, spawning sub-agents for parallel work. Combined with disler’s hooks mastery repo (3K stars), the Claude Code extension ecosystem is growing quickly. The pattern: treat agents as orchestrators that delegate to specialized sub-agents instead of running them as single-threaded assistants.

Web development and frameworks

**AI coding tools now handle web workflows well.** Nearly all leading assistants (Copilot, Cursor, Windsurf, Vercel v0, Bolt.new, Lovable, Gemini CLI, Claude Code) can generate React components, perform design-to-code conversion, and edit across multiple frontend files and routes. “Start from a Figma screenshot or prompt” is becoming a standard workflow for React and modern frontend stacks.

**What matters more.** Agentic IDEs now ingest the whole codebase (front and back end), adjusting routing, API handlers, and frontend components together, and can run tests or dev servers as part of their execution plans. Full-stack refactoring that previously required careful coordination across files is becoming a single-prompt operation.

**The Open Brain visual interface pattern,** covered in a video this week, demonstrates a practical architecture for agent-human shared data: one Supabase table as the single source of truth, MCP as the agent access layer, and a lightweight web app as the human interface. The design principle is simple: agent surfaces patterns, human decides, agent executes. It works because it keeps humans in the loop without making the agent useless.
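The single-table pattern is simple enough to sketch. The snippet below stands in sqlite for Supabase and uses an invented `findings` table whose `status` column drives the propose → approve → execute loop; all table, column, and function names are illustrative assumptions, not taken from the video.

```python
import sqlite3

def make_store():
    """One table as the shared source of truth (sqlite standing in for Supabase)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE findings (id INTEGER PRIMARY KEY, pattern TEXT, status TEXT)"
    )
    return db

def agent_propose(db, pattern):
    """Agent surfaces a pattern; it lands as 'proposed', never auto-applied."""
    db.execute(
        "INSERT INTO findings (pattern, status) VALUES (?, 'proposed')", (pattern,)
    )

def human_approve(db, finding_id):
    """The human interface flips a row to 'approved'."""
    db.execute("UPDATE findings SET status='approved' WHERE id=?", (finding_id,))

def agent_execute(db):
    """Agent acts only on rows the human approved; returns how many it handled."""
    approved = db.execute(
        "SELECT id FROM findings WHERE status='approved'"
    ).fetchall()
    for (fid,) in approved:
        db.execute("UPDATE findings SET status='done' WHERE id=?", (fid,))
    return len(approved)
```

The point of the design is that the status column is the entire human-in-the-loop protocol: the agent can read and write freely, but nothing executes without a row passing through 'approved'.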

Infrastructure

NVIDIA GTC 2026 opens today. Jensen Huang’s keynote (March 16, 11am PT) is expected to detail the Vera Rubin platform: 72-GPU NVL72 rack systems with HBM4 memory, 3.3x to 5x inference performance over Blackwell Ultra on FP4 workloads, 260 TB/s aggregate NVLink bandwidth, and a 10x reduction in inference token costs. Vera Rubin entered full-scale production in early 2026. NVIDIA will also emphasize its five-layer AI stack (energy, chips, infrastructure, models, applications) and the “AI factories” concept, plus sessions covering Isaac and Omniverse for physical AI. Future Rubin Ultra and post-Rubin “Feynman” architectures are expected to be teased. For practitioners, the key number is the 10x inference cost reduction. If that holds in production, it changes the economics of running agentic workloads (multi-agent coding, autonomous security testing, large-context retrieval) at scale.
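The economics claim is easy to sanity-check with rough numbers. The figures below (a hypothetical $15 per million output tokens dropping tenfold to $1.50, and eight parallel agents emitting ~500K tokens each) are illustrative assumptions for the arithmetic, not NVIDIA's or any provider's actual pricing.

```python
def run_cost(tokens, cost_per_million):
    """Dollar cost of generating `tokens` at a given per-million-token rate."""
    return tokens / 1_000_000 * cost_per_million

# Hypothetical multi-agent coding run: 8 parallel agents, ~500K output tokens each.
tokens = 8 * 500_000

before = run_cost(tokens, 15.0)  # assumed $15 / 1M output tokens today
after = run_cost(tokens, 1.5)    # same run if the claimed 10x reduction holds
```

Under these assumptions a run drops from $60 to $6, which is the difference between parallel agent fleets being an occasional splurge and a default way of working.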

Industry and business

**The layoff-to-fund-AI pattern accelerates.** 45,000 tech jobs have been cut in 2026 so far, with roughly 9,200 (20%) directly attributed to AI-driven restructuring. Block cut 4,000 roles (40% of its workforce) with CEO Jack Dorsey explicitly citing AI capabilities. Meta is reportedly planning to cut up to 20% of its workforce to offset AI infrastructure spending, a story that moved Meta stock up nearly 3%. Companies are simultaneously increasing AI spending and cutting headcount, and markets are rewarding both moves.

**OpenAI at $840B.** The [latest $110B funding round](https://news.crunchbase.com/venture/openai-raise-largest-ai-venture-deal-ever/) (SoftBank $30B, NVIDIA $30B, Amazon $50B) puts OpenAI at an $840B valuation, up from $500B just five months ago. Microsoft did not participate. OpenAI claims 900M+ weekly active users and 50M+ consumer subscribers. An IPO is expected this year. Whether these numbers justify the valuation depends entirely on whether you believe the current trajectory of model improvement continues.

**Morgan Stanley predicts “massive AI breakthrough” in H1 2026.** The investment bank’s report cites executives at major AI labs telling investors to brace for progress that will “shock” them, predicting AI will become a “powerful deflationary force.” They also project a net US power shortfall of 9 to 18 gigawatts through 2028 to run the required infrastructure. Take the timeline with salt, but the directional bet is notable.

Interesting GitHub repositories

**KeygraphHQ/shannon (21.2K stars, +16.8K this week).** Fully autonomous AI pentester for web-app security testing. It reads source code, maps attack surfaces, and executes real exploits, scoring 96.15% on a hint-free XBOW benchmark (100/104 exploits). Written in TypeScript. Gaining 16.8K stars in a single week, it is worth watching for where automated security testing in CI/CD pipelines is heading.

**thedotmack/claude-mem (27.6K stars).** Claude Code plugin that auto-captures everything Claude does during coding sessions, compresses it with AI, and injects relevant context into future sessions. Addresses the persistent-memory gap in agent workflows, where every session starts from zero.

**google/langextract (31.4K stars).** Google’s Python library for extracting structured information from unstructured text using LLMs, with precise source grounding and interactive visualization. Version 0.5.0 (February 2026) introduced enhanced semantic chunking and improved multi-pass extraction, increasing recall by 22%. Supports Gemini, Ollama, and other backends. Google open-sourcing production-grade extraction tooling matters if you’re building RAG or data pipelines.

**badlogic/pi-mono (11.3K stars).** AI agent toolkit from libGDX creator Mario Zechner: coding-agent CLI, unified LLM API, TUI and web UI libraries, Slack bot, vLLM pods. One of the more complete open-source agent infrastructure stacks currently trending.

**tobi/qmd (8.2K stars).** From Shopify founder Tobi Lutke: a mini CLI search engine for your docs, knowledge bases, and meeting notes. Fully local, and tracks current SOTA approaches. That Tobi is building and open-sourcing personal knowledge tooling says something about where high-signal builders see value.

**virattt/dexter (14.9K stars).** Autonomous agent for deep financial research, one of several domain-specific research agents gaining traction. Sustained growth suggests real utility beyond demo-ware.

**openai/skills (8.2K stars).** Official Skills Catalog for Codex with 35+ reusable skills in three tiers (system, curated, experimental). OpenAI’s structured approach to agent extensibility.
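A three-tier catalog naturally implies a precedence order when the same skill name appears in more than one tier. A minimal sketch, assuming system skills shadow curated ones and curated shadow experimental — an assumption about how such a catalog might resolve names, not OpenAI's documented behavior:

```python
# Tier precedence, highest first (assumed ordering, not from OpenAI's docs).
TIERS = ["system", "curated", "experimental"]

def resolve_skill(name, catalog):
    """Return (tier, skill) for the highest-precedence tier containing `name`.

    `catalog` maps tier name -> {skill name -> skill definition}.
    """
    for tier in TIERS:
        if name in catalog.get(tier, {}):
            return tier, catalog[tier][name]
    raise KeyError(f"no skill named {name!r} in any tier")
```

The useful property is that an experimental community skill can never silently override a bundled system skill of the same name, while still being discoverable when nothing shadows it.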

Quick bits

  • AI-assisted design-to-code is table stakes. Every major coding tool now advertises automatic component generation and UI scaffolding from screenshots or prompts. Differentiation has shifted to agent orchestration and codebase safety. (LogRocket)
  • Dense wave of US state AI bills. 78 chatbot bills alive in 27 states covering provenance metadata, deepfake protections, AI-only decision bans in workers’ comp, restrictions on AI in psychotherapy, and environmental accountability for data centers. Utah’s Digital Content Provenance Standards Act has passed the legislature. (Transparency Coalition)
  • DOJ creates AI Litigation Task Force. The FTC is preparing a policy statement on how unfair-and-deceptive-practices authority applies to AI. The White House is circulating draft language requiring AI firms selling to federal markets to support “any lawful use.” (JD Supra)
  • prompt-optimizer (20.8K stars): Chinese-language tool for optimizing prompts. That a prompt-optimization tool has passed 20K stars tells you something about the appetite for meta-tooling. (GitHub)
  • grok2api (1.4K stars): FastAPI-based Grok adapter with streaming, image generation, and pool concurrency. Part of the ongoing trend of open-source API adapters unifying access to different LLM providers. (GitHub)
  • SaaS platforms quietly adopting frontier models. Writing and marketing platforms (Jasper, Copy.ai) have shipped Claude Sonnet 4.6 and GPT-5.3 support, evidence of how fast downstream products integrate new APIs. (blog.mean.ceo)


Last modified on April 14, 2026