PLUS: The verdict on AI copyright, Microsoft's Copilot struggles, and NVIDIA's intelligent document agents

Good morning

Mistral has released a new family of speech-to-text models designed for real-time voice applications. With ultra-low latency and high accuracy, the new release aims to power more natural and responsive AI voice agents.

By making a key model open-source and offering competitive pricing, Mistral is significantly lowering the barrier for developers to build advanced audio tools. Will this move accelerate the development of truly conversational AI agents across the industry?

In today’s Next in AI:

  • Mistral's new real-time audio AI

  • The verdict on AI copyright

  • Microsoft's Copilot struggles

  • NVIDIA's intelligent document agents

Mistral's New Audio AI

Next in AI: Mistral just released Voxtral Transcribe 2, a new family of speech-to-text models designed for ultra-low latency, high accuracy, and real-time voice applications.

Decoded:

  • Voxtral Realtime delivers transcription with latency that can drop below 200 ms, making it well suited to voice agents, and is available as open weights under Apache 2.0 for developers to build with.

  • The batch processing model, Voxtral Mini Transcribe V2, offers industry-leading accuracy at a highly competitive price of $0.003 per minute, along with features like speaker diarization and context biasing.

  • You can test the new models instantly in Mistral Studio's new audio playground, which lets you upload files and experiment with features like diarization and timestamps.

Why It Matters: This release provides developers with a powerful, open-source tool for creating truly responsive voice-first applications. The combination of high performance and low cost significantly lowers the barrier for businesses to integrate advanced audio processing into their workflows.
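
To put the $0.003 per minute batch price in perspective, here is a back-of-the-envelope sketch in Python. The only figure taken from the announcement is the per-minute rate; the helper name and the example audio volumes are illustrative assumptions, and an actual bill may differ because of rounding rules, minimum charges, or taxes.

```python
from decimal import Decimal

# Per-minute batch price quoted above for Voxtral Mini Transcribe V2 (USD).
PRICE_PER_MINUTE = Decimal("0.003")


def transcription_cost(audio_minutes: int) -> Decimal:
    """Estimate the cost in USD of transcribing `audio_minutes` of audio.

    Hypothetical helper for illustration: it simply multiplies minutes by
    the quoted rate and ignores rounding rules, minimums, and taxes.
    """
    return PRICE_PER_MINUTE * audio_minutes


if __name__ == "__main__":
    # Example volumes are made up for illustration.
    examples = [("1 hour", 60), ("100 hours", 6_000), ("10,000 hours", 600_000)]
    for label, minutes in examples:
        print(f"{label}: ${transcription_cost(minutes):.2f}")
```

At the quoted rate, an hour of audio works out to about $0.18 and 100 hours to about $18.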

AI Can't Own Its Art

Next in AI: The U.S. Justice Department is backing the Copyright Office's position that works generated entirely by AI cannot be copyrighted. It has officially asked the Supreme Court to reject a case challenging this long-standing requirement for human authorship.

Decoded:

  • The case stems from computer scientist Stephen Thaler's attempt to copyright an artwork, listing his AI system, the "Creativity Machine," as the sole author.

  • Government lawyers argue that core concepts of the Copyright Act—like an author's lifespan, heirs, and signature—are inherently human and cannot apply to machines.

  • This doesn't block all AI-assisted art; the Copyright Office has registered hundreds of works with AI elements where a human provided sufficient creative contribution.

Why It Matters: The government's position reinforces the "human in the loop" standard for intellectual property, at least for now. As AI systems become more autonomous, the legal line between a tool and a creator will come under growing pressure.

Microsoft's Copilot Stumbles

Next in AI: Microsoft's flagship AI product, Copilot, is reportedly struggling with confusing branding and losing ground with users to rivals like Google's Gemini. A new report from The Wall Street Journal suggests the effort to build a true ChatGPT alternative has been a tough road.

Decoded:

  • Users are reportedly frustrated by confusing brand positioning and interoperability problems across Microsoft's suite of tools.

  • Data cited in the report shows that only a small portion of Microsoft's enterprise subscribers actively use Copilot, and that its popularity has declined against competitors in recent months.

  • The struggles come as CEO Satya Nadella aims to move the industry from AI “spectacle” to “substance,” signaling a push to make Copilot a more integral and proven tool.

Why It Matters: These challenges demonstrate the difficulty of translating a powerful AI partnership into a cohesive, user-friendly product. Microsoft's massive enterprise footprint remains its key advantage, giving it a ready-made distribution channel if it can fix these core issues.

NVIDIA's Document Agents

Next in AI: NVIDIA is showcasing how its open Nemotron models power AI agents that process complex documents, turning unstructured data from PDFs, presentations, and spreadsheets into actionable business intelligence.

Decoded:

  • The system moves beyond simple text scraping to understand rich content like charts, tables, and figures by using retrieval-augmented generation (RAG) techniques.

  • Major companies like Docusign are using the technology to interpret complex contracts at scale, transforming agreements into structured data for intelligent document processing.

  • For developers, NVIDIA provides these capabilities as open models and microservices, available on platforms like Hugging Face and GitHub for building custom RAG pipelines.

Why It Matters: This makes vast amounts of previously locked unstructured data available for AI-driven analysis. Documents are shifting from static archives into living knowledge systems that power real-time business decisions.

AI Pulse

Anthropic announced that its AI assistant Claude will remain ad-free, stating that an advertising-based model would introduce incentives that could work against its goal of being genuinely helpful to users.

Anthropic warned against the US government's decision to allow the sale of advanced Nvidia H200 chips to China, with CEO Dario Amodei calling the move "crazy" and comparing it to selling nuclear weapons.

Qodo released a new benchmark for AI code review tools, claiming its system achieves a superior F1 score by evaluating models on real-world pull requests injected with both functional bugs and best-practice violations.

OpenAI experienced a major outage on Tuesday, causing widespread login issues and service disruptions for ChatGPT users across the globe for several hours.
