PLUS: A breakthrough 224x model compression method and the covert race for banned Nvidia chips

Good morning.

AI computation is no longer bound to Earth. Nvidia-backed startup Starcloud successfully ran a Google AI model on a satellite, marking a significant first for high-performance computing in orbit.

The achievement kicks off a new commercial race for orbital data centers, aimed at easing the growing strain that AI workloads place on Earth's power grids. As this new frontier opens, what challenges will arise in building and maintaining computing infrastructure entirely off-planet?

In today’s Next in AI recap:

  • Starcloud runs Google's Gemma AI in space

  • A breakthrough 224x model compression method

  • The covert race for banned Nvidia chips

  • Wells Fargo to reduce workforce with AI

AI's Final Frontier

Next in AI: Nvidia-backed startup Starcloud has successfully trained and run a Google AI model on a satellite in orbit, proving that high-performance computing is possible in space. This event kicks off a new race for orbital data centers that could solve Earth’s growing energy and infrastructure crisis.

Decoded:

  • The core motivation is to move energy-intensive AI workloads off-planet, sidestepping the massive electricity and water consumption that is straining terrestrial power grids.

  • Starcloud equipped its satellite with an Nvidia H100 GPU, a chip 100 times more powerful than any GPU previously sent to space, enabling it to run Google's open-source Gemma model.

  • A new commercial space race is heating up, with competitors like Google's Project Suncatcher and Aetherflux also planning to deploy their own orbital data centers.
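To see why orbital solar is attractive for compute, here is a rough sketch estimating the panel area needed to power a single H100-class GPU. The figures are assumptions based on typical published values (solar constant of ~1361 W/m² above the atmosphere, ~30% efficiency for high-end space panels, ~700 W board power for an H100 SXM), not numbers from the article:

```python
# Back-of-the-envelope: solar panel area to power one H100-class GPU in orbit.
# All constants are assumed typical values, not figures from the article.
SOLAR_IRRADIANCE_W_PER_M2 = 1361   # solar constant above the atmosphere
PANEL_EFFICIENCY = 0.30            # high-end multi-junction space panels
GPU_BOARD_POWER_W = 700            # H100 SXM max board power (published spec)

def panel_area_m2(load_w: float,
                  irradiance: float = SOLAR_IRRADIANCE_W_PER_M2,
                  efficiency: float = PANEL_EFFICIENCY) -> float:
    """Panel area needed to supply `load_w` watts of continuous power."""
    return load_w / (irradiance * efficiency)

print(f"{panel_area_m2(GPU_BOARD_POWER_W):.2f} m^2")  # roughly 1.7 m^2
```

Under these assumptions, a couple of square meters of panel covers one GPU's board power — though cooling, radiation hardening, and eclipse periods are the harder engineering problems.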

Why It Matters: This successful test marks a major step toward making space-based computing a commercial reality. Orbital data centers could unlock the next generation of AI development by tapping into the limitless solar energy of space.

Beyond Transformers

Next in AI: A new method delivers a stunning breakthrough, compressing a Llama 70B model by 224x while simultaneously improving its accuracy. This research could pave the way for a "post-transformer" era of highly efficient AI inference.

Decoded:

  • The technique works by extracting a model's core understanding into a compact "meaning field," making the original transformer unnecessary for inference.

  • This lightweight compression achieves a 224x size reduction and delivers an average +1.81 percentage point accuracy gain on classification tasks.

  • The paper introduces Field Processing Units (FPUs), a new computing approach that could replace intensive matrix multiplication with faster, shallower operations.
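To put the 224x figure in perspective, here is a back-of-the-envelope sketch of the memory footprints involved. It assumes fp16 weights (2 bytes per parameter); the compression method itself is not described here, so this only illustrates the size ratio:

```python
# Rough memory-footprint arithmetic for the reported 224x compression.
# Assumes fp16 storage (2 bytes/parameter) -- an assumption, not a detail
# from the paper.
PARAMS = 70e9                 # Llama 70B parameter count
BYTES_PER_PARAM = 2           # fp16
COMPRESSION_RATIO = 224       # reported in the paper

original_gb = PARAMS * BYTES_PER_PARAM / 1e9
compressed_gb = original_gb / COMPRESSION_RATIO

print(f"original:   {original_gb:.0f} GB")    # 140 GB
print(f"compressed: {compressed_gb:.2f} GB")  # ~0.62 GB
```

A model that shrinks from roughly 140 GB to well under 1 GB would fit comfortably in the memory of a phone or single-board computer, which is what makes the edge-device claim plausible.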

Why It Matters: This development could dramatically lower the immense computational costs required to run top-tier AI models. It opens the door for powerful AI to run effectively on smaller, more accessible hardware like edge devices and consumer electronics.

The AI Chip Smugglers

Next in AI: Chinese AI startup DeepSeek is reportedly using banned Nvidia Blackwell chips to build its next-generation models. The move highlights the intense, covert race for top-tier AI hardware amid geopolitical restrictions.

Decoded:

  • The advanced chips were reportedly smuggled by installing them in data centers in other countries, then dismantling the servers and shipping the parts to China.

  • This shadow market exists because of strict US export bans on advanced semiconductors, prompting a cat-and-mouse game of enforcement and evasion.

  • DeepSeek was already well-positioned for a compute shortage, as its founder had amassed over 10,000 Nvidia GPUs in 2021 before the toughest restrictions were put in place.

Why It Matters: This situation shows the extreme measures companies will take to acquire critical AI hardware, making export controls incredibly difficult to enforce. Access to the most powerful chips is proving to be the key differentiator in the global race for AI leadership.

AI Comes for Wall Street

Next in AI: Wells Fargo's CEO, Charlie Scharf, confirmed the bank expects to have fewer employees next year, explicitly citing AI as a tool to drive significant efficiency improvements and further workforce reductions.

Decoded:

  • The bank's workforce has already shrunk by about 24% since Scharf took over in 2019, indicating this is an acceleration of an existing trend.

  • Generative AI tools have already increased code-writing efficiency by 30-35% inside the bank, showing tangible productivity gains from the technology.

  • Other financial leaders are following suit, with Truist's CEO calling AI a “propellant” for both efficiency and revenue, signaling a broad, strategic pivot across the sector.

Why It Matters: This move signals a fundamental reshaping of how a massive financial institution operates with AI at its core. Professionals across industries should watch closely, as this provides a blueprint for how AI will be used to automate complex white-collar jobs.

AI Pulse

OpenAI warns that its upcoming AI models are likely to pose a "high" cybersecurity risk as their capabilities for autonomous operation and carrying out brute-force attacks increase.

Amazon committed to investing over $35 billion in India’s cloud and AI space by 2030 to accelerate AI-driven digitization, export growth, and job creation in the country.

The Pentagon rolled out GenAI.mil, a custom interface for Google's Gemini chatbot designed to give military personnel AI tools for research, document formatting, and video analysis.

Sword Health released MindEval, an open-source framework designed to evaluate the clinical competence and safety of LLMs in realistic, multi-turn mental health conversations.

Keep Reading