<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Engineer&apos;s Codex</title><description>Tech news, distilled from the signal.</description><link>https://engineerscodex.com/</link><item><title>LLMs Can Improve at Code by Training on Their Own Wrong Answers</title><link>https://engineerscodex.com/llm-self-distillation-code/</link><guid isPermaLink="true">https://engineerscodex.com/llm-self-distillation-code/</guid><description>Simple Self-Distillation (SSD) lets LLMs improve at code generation by training on their own unverified outputs, no correctness labels or execution environment needed. Qwen3-30B jumps 12.9 points on LiveCodeBench v6.</description><pubDate>Sun, 05 Apr 2026 00:00:00 GMT</pubDate><category>ai</category><category>research</category><category>coding</category><category>training</category><category>llms</category></item><item><title>Most Coding Agents Break 75%+ of Their Own Fixes Over Time</title><link>https://engineerscodex.com/swe-ci-coding-agent-benchmark/</link><guid isPermaLink="true">https://engineerscodex.com/swe-ci-coding-agent-benchmark/</guid><description>SWE-CI is a new benchmark that evaluates coding agents on long-term codebase maintenance via continuous integration loops — not one-shot bug fixes. Most models introduced regressions on 75%+ of tasks. 
Only Claude Opus exceeded a 50% zero-regression rate.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>research</category><category>agents</category><category>benchmarks</category><category>coding</category></item><item><title>Claude Just Solved an Open Math Problem That Had Stumped Researchers for Weeks</title><link>https://engineerscodex.com/knuth-claude-cycles-open-problem/</link><guid isPermaLink="true">https://engineerscodex.com/knuth-claude-cycles-open-problem/</guid><description>Don Knuth published a paper describing how Claude Opus 4.6 solved an open combinatorics problem he&apos;d been working on for weeks: finding a general decomposition of a 3D digraph&apos;s arcs into three directed Hamiltonian cycles. Claude found it in about one hour across 31 explorations.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>research</category><category>agents</category><category>reasoning</category></item><item><title>Claude Adds Ability to Import Memory From Other AI Providers</title><link>https://engineerscodex.com/claude-import-memory-from-providers/</link><guid isPermaLink="true">https://engineerscodex.com/claude-import-memory-from-providers/</guid><description>Anthropic added a memory import tool that lets you copy your context and preferences from ChatGPT, Gemini, or any other AI provider into Claude in under a minute.</description><pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>claude</category><category>anthropic</category><category>memory</category></item><item><title>The Answer Key Trick That Cuts Reasoning LLM Training Time in Half</title><link>https://engineerscodex.com/a-star-po-faster-rl-llm-training/</link><guid isPermaLink="true">https://engineerscodex.com/a-star-po-faster-rl-llm-training/</guid><description>A*-PO is a new RL training algorithm for LLMs that precomputes an &apos;optimal
value&apos; offline, then trains with just one sample per prompt instead of many. It matches or beats PPO and GRPO at up to 2x faster speed and 30%+ lower memory.</description><pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>research</category><category>llm</category><category>training</category><category>reinforcement-learning</category></item><item><title>LLMs Can Now Figure Out Who&apos;s Behind Any Pseudonym — For Just $4</title><link>https://engineerscodex.com/llm-deanonymization-pseudonymity/</link><guid isPermaLink="true">https://engineerscodex.com/llm-deanonymization-pseudonymity/</guid><description>Researchers from ETH Zurich and Anthropic show that LLM agents can re-identify pseudonymous online accounts at scale — achieving up to 68% recall at 90% precision compared to near 0% for the best classical methods. The assumption that posting under a pseudonym is safe no longer holds.</description><pubDate>Sun, 01 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>privacy</category><category>security</category><category>research</category><category>llm</category><category>agents</category></item><item><title>Block Announces Layoffs of 4,000 People, Over 40% Cut</title><link>https://engineerscodex.com/block-layoffs-4000-ai/</link><guid isPermaLink="true">https://engineerscodex.com/block-layoffs-4000-ai/</guid><description>Jack Dorsey announces Block is cutting over 4,000 employees — nearly half its workforce — citing AI-driven changes to how companies operate. 
The stock jumped almost 25% after hours.</description><pubDate>Fri, 27 Feb 2026 00:00:00 GMT</pubDate><category>industry</category><category>block</category><category>layoffs</category><category>ai</category><category>jack-dorsey</category></item><item><title>Claude Offers Free 6-Month Claude Max Memberships for Open-Source Maintainers</title><link>https://engineerscodex.com/claude-free-max-open-source/</link><guid isPermaLink="true">https://engineerscodex.com/claude-free-max-open-source/</guid><description>Anthropic is giving up to 10,000 open-source maintainers free Claude Max 20x subscriptions for six months.</description><pubDate>Fri, 27 Feb 2026 00:00:00 GMT</pubDate><category>devtools</category><category>anthropic</category><category>claude</category><category>open-source</category><category>developer-tools</category></item><item><title>Claude Code Now Remembers What It Learns Across Sessions</title><link>https://engineerscodex.com/claude-code-auto-memory/</link><guid isPermaLink="true">https://engineerscodex.com/claude-code-auto-memory/</guid><description>Anthropic shipped auto-memory for Claude Code. Claude now persists project context, debugging patterns, and preferences across sessions without manual setup.</description><pubDate>Thu, 26 Feb 2026 00:00:00 GMT</pubDate><category>devtools</category><category>claude-code</category><category>ai</category><category>developer-tools</category><category>anthropic</category></item><item><title>Google Restricts AI Ultra Accounts Over OpenClaw OAuth</title><link>https://engineerscodex.com/google-restricts-ultra-accounts-openclaw/</link><guid isPermaLink="true">https://engineerscodex.com/google-restricts-ultra-accounts-openclaw/</guid><description>Google locked AI Ultra subscribers out of Gemini models for using OpenClaw OAuth, with no warning or explanation.
Anthropic banned third-party access two days earlier.</description><pubDate>Sun, 22 Feb 2026 00:00:00 GMT</pubDate><category>devtools</category><category>ai</category><category>oauth</category><category>google</category><category>anthropic</category><category>security</category></item><item><title>The Truth About AI and Skill Retention</title><link>https://engineerscodex.com/ai-skill-retention/</link><guid isPermaLink="true">https://engineerscodex.com/ai-skill-retention/</guid><description>A randomized trial found that developers using AI assistance scored 17% lower on a skills test without gaining any speed advantage. The finding matters, but the study design limits how far you can take it.</description><pubDate>Thu, 19 Feb 2026 00:00:00 GMT</pubDate><category>ai</category><category>research</category><category>learning</category><category>productivity</category></item><item><title>New Study: Businesses Are Replacing Freelancers with AI at a 97% Cost Savings</title><link>https://engineerscodex.com/ramp-ai-replacing-freelancers/</link><guid isPermaLink="true">https://engineerscodex.com/ramp-ai-replacing-freelancers/</guid><description>A Ramp study using real firm-level spending data finds businesses are rapidly substituting AI for freelancers — with the heaviest spenders seeing $1 of AI replace $33 of freelance labor.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><category>ai</category><category>labor-market</category><category>research</category><category>freelancing</category></item><item><title>AI-Generated Agent Skills Are Pointless</title><link>https://engineerscodex.com/skillsbench-ai-generated-skills/</link><guid isPermaLink="true">https://engineerscodex.com/skillsbench-ai-generated-skills/</guid><description>SkillsBench tests whether structured knowledge packages improve LLM agents across 84 tasks. Curated Skills add 16pp.
Self-generated Skills add nothing, or make things worse.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><category>ai</category><category>agents</category><category>benchmarks</category><category>research</category><category>llm</category></item><item><title>Your Agents.md Might Be Making AI Worse</title><link>https://engineerscodex.com/agents-md-making-ai-worse/</link><guid isPermaLink="true">https://engineerscodex.com/agents-md-making-ai-worse/</guid><description>An ETH Zurich study tests whether AGENTS.md and CLAUDE.md files actually help coding agents. LLM-generated context files reduce success rates while adding 20%+ to costs. Human-written ones barely help.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><category>devtools</category><category>benchmarks</category><category>research</category><category>coding-agents</category></item><item><title>An LLM Benchmark Idea: Earnings Forecasting</title><link>https://engineerscodex.com/llm-benchmark-earnings-forecasting/</link><guid isPermaLink="true">https://engineerscodex.com/llm-benchmark-earnings-forecasting/</guid><description>A proposed LLM benchmark: feed a model pre-earnings data, have it forecast the results, compare to actual. Here&apos;s why it&apos;s worth building — and why today&apos;s models make it more interesting than ever.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><category>ai</category><category>benchmarks</category><category>research</category><category>finance</category></item><item><title>Anthropic&apos;s Confusing Claude Subscription Policy, Explained</title><link>https://engineerscodex.com/anthropic-claude-subscription-switcharoo/</link><guid isPermaLink="true">https://engineerscodex.com/anthropic-claude-subscription-switcharoo/</guid><description>Anthropic updated its Claude Code docs to ban OAuth tokens from being used in third-party tools. The community exploded.
Then Anthropic said nothing was changing.</description><pubDate>Wed, 18 Feb 2026 00:00:00 GMT</pubDate><category>ai</category><category>anthropic</category><category>claude</category><category>policy</category><category>claude-code</category></item></channel></rss>