PLUS: Elon Musk's new AI handset, Meta's big cloud move, and a fix for GPU cold starts

Happy reading

Anthropic's Fable 5 is back online after a temporary shutdown by the US government. The model has been redeployed globally with new safety classifiers following discussions with regulators over national security concerns.

The event establishes a potential playbook for how AI labs and government agencies collaborate on frontier model releases. But does this signal a new era of required government oversight for all powerful AI systems?

In today’s Next in AI:

  • Anthropic’s Fable 5 model is back online

  • Elon Musk’s rumored AI handset prototype

  • Meta enters the AI cloud computing market

  • A major fix for slow GPU cold starts

Fable 5 is Back

Next in AI: Anthropic is bringing its Fable 5 model back online globally after productive talks with the US government. The redeployment includes new safety classifiers designed to prevent the model from being misused for cybersecurity tasks.

Explained:

  • The model was temporarily blocked by a US Department of Commerce export control directive over security concerns, which have now been addressed.

  • To prevent misuse, the new version includes specialized safety classifiers that block certain cybersecurity-related prompts. For now, some coding and debugging tasks fall back to Opus 4.8.

  • Anthropic is also spearheading a new consensus framework with partners like Amazon, Microsoft, and Google to standardize responses to AI jailbreaks.

Why It Matters: This event highlights the growing tension and necessary collaboration between AI developers and government regulators. It establishes a potential playbook for how future frontier models will be deployed amidst national security concerns.

Musk's AI Handset

Next in AI: SpaceX has secretly shown investors a prototype for an AI-powered handset, according to a new report. The device is designed to reshape how people interact with AI, moving beyond current smartphone capabilities.

Explained:

  • The prototype is described as a sleek device, slimmer than an iPhone, that runs a proprietary operating system and deeply integrates technology from xAI.

  • It would reportedly use a Qualcomm Snapdragon chipset, signaling a focus on high-performance, on-device AI processing.

  • The story took a sharp turn when Musk swiftly denied it, calling the well-sourced report “utterly false” in a post on X.

Why It Matters: If the report is true, the device would signal a major new player entering the dedicated AI hardware race. Regardless of the rumor's accuracy, it highlights the industry's intense focus on creating personal devices built from the ground up for artificial intelligence.

Meta's Cloud Play

Next in AI: Meta is jumping into the cloud computing market by selling its excess AI processing power. The move aims to generate revenue from its massive infrastructure investments and directly challenges established industry giants.

Explained:

  • To offset its massive AI spending, Meta plans to generate revenue from any computing capacity it’s not using.

  • The new business throws Meta into a fiercely competitive market, with competitors like CoreWeave and Nebius Group seeing their stocks drop over 12% on the news.

  • The company is still deciding whether to offer access to its hosted AI models or sell raw computing power directly to customers.

Why It Matters: This move could create a significant new source of AI compute, potentially increasing supply and lowering costs for developers. It also marks a strategic shift for Big Tech, turning massive internal infrastructure into a direct, revenue-generating product.

The 2-Second Cold Start

Next in AI: Serverless AI provider Cerebrium unveiled a memory snapshotting technique that slashes GPU cold start times by over 80%. This allows fully-warmed AI models to restore and become ready for inference in seconds, tackling a major bottleneck in scaling production AI.

Explained:

  • Instead of rebuilding a model's environment on every startup, the new feature captures a snapshot of the fully initialized container—including CPU memory, GPU memory, and compiled kernels.

  • The system is built on a customized version of Google's gVisor sandbox, which intercepts the container startup process to restore the saved state directly into memory.

  • Benchmarks show the technique reduces cold starts by an average of 71%, with some vLLM workloads starting up to 94% faster than competitors' cached solutions. The full benchmarks are available.
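
For readers who want a feel for the idea, here is a toy Python sketch of snapshot-and-restore. It is not Cerebrium's implementation: the real system reportedly works at the container level and covers GPU memory and compiled kernels, while this example only saves an in-process object's state. The `Model` class, the sleep that stands in for slow initialization, and the function names are all hypothetical.

```python
import pickle
import time


class Model:
    """Hypothetical stand-in for a model with a slow startup."""

    def __init__(self, init_seconds: float = 0.2):
        # Simulates expensive work such as loading weights or compiling kernels.
        time.sleep(init_seconds)
        self.weights = [i * 0.5 for i in range(1000)]
        self.ready = True

    def infer(self, x: float) -> float:
        return x * self.weights[2]


def cold_start(init_seconds: float = 0.2) -> Model:
    """Regular startup: pays the full initialization cost."""
    return Model(init_seconds)


def take_snapshot(model: Model) -> bytes:
    """Serialize the fully initialized state once, after warm-up."""
    return pickle.dumps(model.__dict__)


def restore(snapshot: bytes) -> Model:
    """Rebuild the object from saved state without running __init__."""
    model = Model.__new__(Model)  # allocate the object, skip initialization
    model.__dict__.update(pickle.loads(snapshot))
    return model


if __name__ == "__main__":
    start = time.perf_counter()
    warm = cold_start()
    print(f"cold start: {time.perf_counter() - start:.3f}s")

    snapshot = take_snapshot(warm)

    start = time.perf_counter()
    restored = restore(snapshot)
    print(f"restore:    {time.perf_counter() - start:.3f}s")
    print("same output:", restored.infer(2.0) == warm.infer(2.0))
```

The point of the sketch is the trade: the initialization cost is paid once when the snapshot is taken, and every later startup only pays the cost of loading saved state.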

Why It Matters: This approach allows companies to scale AI services more aggressively and cost-effectively, reducing the need to keep expensive GPUs warm just to avoid slow startup times. For developers, it means faster deployments and a better user experience, as applications can respond to demand spikes almost instantly.

The Shortlist

Cloudflare launched its new Attribution Business Insights dashboard, designed to help publishers track AI crawler activity, analyze crawl-to-referral ratios, and distinguish between different crawler intents.

Palantir published an AI sovereignty manifesto as CEO Alex Karp criticized the token-based business models of labs like OpenAI, arguing enterprises should own their models and data stack.

Meta capped internal employee spending on third-party AI tools after token consumption surged to a pace that would cost billions in 2026, highlighting the rising operational costs of AI adoption.

AnalystAIPack released an open-source library of 118 agent skills designed to give AI agents the working knowledge of a malware analyst for reverse engineering and threat hunting.
