Luis Poveda's AI Newsletter
Posts
Luis Poveda's AI Newsletter: May 5, 2025

Luis Poveda's AI Newsletter: May 5, 2025

Models Get Smaller & Smarter, APIs Open Up, and AI Writes the Code

Luis Poveda
May 05, 2025

In partnership with

Find out why 1M+ professionals read Superhuman AI daily.

In 2 years you will be working for AI

Or an AI will be working for you

Here's how you can future-proof yourself:

Join the Superhuman AI newsletter – read by 1M+ people at top companies
Master AI tools, tutorials, and news in just 3 minutes a day
Become 10X more productive using AI

Join 1,000,000+ pros at companies like Google, Meta, and Amazon that are using AI to get ahead.

Executive Summary

This week marked significant AI advancements, from Microsoft's powerful yet compact Phi-4 reasoning models to Meta opening its Llama API, intensifying platform competition. Open-source momentum continued with Alibaba's multilingual Qwen3, Xiaomi's debut MiMo model, and DeepSeek's specialized Prover V2 for math. AI safety was also prominent, highlighted by Meta's new LlamaFirewall tool unveiled at LlamaCon. Simultaneously, AI's integration into work and learning deepened: Microsoft's CEO noted AI now writes nearly 30% of their code, Duolingo used AI to launch 148 new language courses, FutureHouse introduced AI agents for scientific research, and Google enhanced NotebookLM with broader language support.

Major Model Releases & Platforms

Microsoft's Phi-4 Models Pack Powerful Reasoning into Small Packages

Microsoft released three new open-weight Phi-4 small language models (SLMs): the 14B parameter Phi-4-reasoning, its reinforcement-learning-tuned variant Phi-4-reasoning-plus, and the compact 3.8B parameter Phi-4-mini-reasoning. These models introduce advanced reasoning or "thinking" capabilities, enabling them to efficiently break down and solve complex queries despite their smaller size, with benchmarks showing performance rivaling or exceeding larger models on complex math and science tasks. The Phi-4-mini is specifically optimized for on-device use, particularly in education, having been trained on over a million diverse math problems using web data, curated OpenAI demos, and synthetic data from Deepseek-R1.

Microsoft

"Phi-reasoning models introduce a new category of small language models... small enough for low-latency environments yet maintain strong reasoning capabilities that rival much bigger models".

Microsoft Team Blog Post

Meta Challenges OpenAI with New Llama API for Developers

At its first LlamaCon event on April 29, 2025, Meta launched the Llama API, granting developers access to its AI models, currently available in a limited free preview. This API supports recent models like Llama 4 Scout and Maverick and allows fine-tuning on the new Llama 3.3 8B model. Meta highlighted developer flexibility, ensuring users retain full control over models and weights without data retention for training, positioning it as an open alternative to closed ecosystems like OpenAI's. The API aims to accelerate Llama adoption by including conveniences such as interactive model playgrounds and SDKs for Python and Typescript, facilitating the creation of custom AI applications.

Alibaba Unveils Qwen3: Open Source AI with Hybrid Reasoning and Global Reach

Alibaba launched Qwen3 on April 30, 2025, the newest generation of its open-source LLM family, comprising six dense models (0.6B to 32B parameters) and two large Mixture-of-Experts (MoE) models (up to 235B parameters). Qwen3 introduces "hybrid reasoning," enabling models to dynamically switch between a complex 'thinking mode' for tasks like math and coding, and a faster 'non-thinking mode' for general responses. Trained on 36 trillion tokens, these models demonstrate significantly advanced multilingual skills across 119 languages and dialects and achieve top-tier results on reasoning, coding, and tool-use benchmarks. All Qwen3 models are now globally available on platforms such as Hugging Face, GitHub, and ModelScope.

Qwen

"Qwen3 marks Alibaba's debut of hybrid reasoning models, combining traditional LLM capabilities with advanced, dynamic reasoning".

Zawya

Zawya: Alibaba introduces Qwen3, setting new benchmark in open-source AI with hybrid reasoning
SambaNova Blog: Qwen3 Is Here - Now Live on SambaNova Cloud

Xiaomi Enters AI Race with Open Source MiMo Model for Smart Devices

Chinese tech giant Xiaomi entered the competitive AI arena recently by unveiling its first open-source LLM, MiMo. This 7-billion-parameter model focuses on advanced reasoning, particularly in math and coding, with initial reports suggesting it outperformed comparable models from OpenAI and Alibaba in some tests. Xiaomi intends to deeply integrate MiMo across its ecosystem of smartphones, EVs, and smart home devices, a move supported by prior reports of significant investments in GPU computing power for AI training.

Xiaomi

"The year 2025 may feel like the second half of the AI model race, but we strongly believe that the road to AGI is still long".

Xiaomi Statement

Tech Edition: Xiaomi enters China's AI race with new model to power smart devices
GuruFocus: Xiaomi (1810) Unveils MiMo AI Model, Shares Jump Over 5%

DeepSeek Quietly Updates Math-Focused Prover V2 Model

Chinese AI startup DeepSeek quietly released Prover-V2 on Hugging Face recently, updating its specialized model series designed for mathematical reasoning and theorem proving. Built upon DeepSeek's powerful V3 model, which employs an efficient 671-billion-parameter Mixture-of-Experts (MoE) architecture, Prover-V2 underscores the industry trend towards developing highly specialized AI for complex domains. Although full details remain scarce, its release timing positions it alongside other recent reasoning-focused models like Qwen3 and Phi-4.

DeepSeek

Prover-V2 is part of the Prover series, focused on solving math-related problems".

Tech in Asia

Tech in Asia: DeepSeek quietly updates open-source model for math proofing
NEWS.am TECH: DeepSeek quietly updates math proof model, Prover-V2

AI Safety & Governance

LlamaCon Highlights: Meta Boosts AI Security with New Tools and Initiatives

During its LlamaCon event on April 29, 2025, Meta unveiled several new tools and initiatives aimed at enhancing the security and responsible use of its open-source Llama models. Key releases included LlamaFirewall, a guardrail tool against risks like prompt injection; Llama Guard 4, an updated safety classifier for text and images; and Prompt Guard 2. Additionally, Meta introduced CyberSecEval 4, a benchmark suite developed with CrowdStrike to evaluate LLM security in cybersecurity contexts, and launched the Llama Defenders Program to provide partners with AI security solutions for threat detection, highlighting the growing focus on safety within the open-source AI community.

AI Transforming Work & Learning

lA Writes the Code: Microsoft CEO Nadella Says AI Generates Nearly 30% of Microsoft's Code

Microsoft CEO Satya Nadella recently revealed that AI tools like GitHub Copilot are now generating nearly 30% of the code within some projects at Microsoft, a figure he noted is "significantly going up." This statement underscores the rapid adoption and substantial impact of AI code generation on developer productivity within major tech firms. Echoing this trend, Google CEO Sundar Pichai shared that AI writes over 25% of new code at Google, while Meta CEO Mark Zuckerberg predicted AI could assist with potentially half of all development work within the next year.

ASSOCIATED PRESS

"I'd say maybe 20%, 30% of the code that is inside of our repos today and some of our projects are probably all written by software [AI]".

Satya Nadella (via Times of India)

Times of India: Satya Nadella says 30% of Microsoft's code is AI-generated
Microsoft Copilot Blog: Release Notes: May 2, 2025 (Context on Copilot updates)

Duolingo Doubles Down on AI, Launches 148 New AI-Generated Language Courses

Language learning platform Duolingo recently announced its largest course expansion ever, adding 148 new courses created using generative AI, effectively doubling its offerings. This expansion makes its seven most popular non-English languages available across all 28 supported interface languages. Enabled by AI, this rapid development contrasts sharply with the years required for previous manual course creation and aligns with Duolingo's "AI-First" strategy, which involves automating content creation and reducing reliance on human contractors. These new courses initially target beginner levels, with more advanced features anticipated later.

Duolingo

"Developing our first 100 courses took about 12 years, and now, in about a year, we're able to create and launch nearly 150 new courses".

Luis von Ahn, CEO Duolingo

FutureHouse Launches Platform with AI Agents to Assist Scientific Discovery

FutureHouse, an Eric Schmidt-backed nonprofit focused on building an "AI scientist," launched its platform and API recently. This platform provides AI agents designed to assist researchers with various scientific tasks, including hypothesis generation, literature review, and experimental design, featuring initial tools like "Crow" for short answers and "Falcon" for deep research. Accessible via the web and an API, the platform aims to accelerate scientific discovery by offering powerful AI tools that can integrate with researchers' existing code and leverage resources like open access literature and clinical trial databases.

"Today, we are launching the first publicly available AI Scientist, via the FutureHouse Platform. Our AI Scientist agents can perform a wide variety of scientific tasks better than humans".

Sam Rodriques, FutureHouse (via X/Twitter)

Techmeme (citing TechCrunch & X/Twitter): Eric Schmidt-backed nonprofit FutureHouse... launches a platform and API...
Morningstar (Related Trend - Causaly): Causaly Announces Agentic AI for Scientific Discovery

Google NotebookLM Breaks Language Barriers with Audio Overviews in 50+ Languages

Google significantly expanded language support for the Audio Overviews feature within its AI research tool, NotebookLM, in a recent announcement. This feature, which transforms notes and source materials into conversational, podcast-style audio summaries, now functions in over 50 languages, including Hindi, Marathi, Spanish, French, and Arabic. Users can select their preferred output language in the settings, and both chat responses and audio summaries will be generated accordingly, leveraging the native audio capabilities of Google's Gemini AI model to make complex information more accessible, particularly in multilingual contexts.

NotebookLM

"This capability breaks down language barriers and makes the information more accessible to everyone".

Google (via 9to5Google)

Business Today: Google expands Audio Overviews in NotebookLM to 50+ languages, including Hindi, Marathi
9to5Google: NotebookLM Audio Overviews coming to 50+ languages

Conclusion

This week underscored the rapid pace of AI development, particularly in model capabilities and accessibility. The rise of powerful small models like Phi-4, the opening of platforms like the Llama API, and major open-source contributions from players like Alibaba signal increasing competition and democratization. Simultaneously, the growing role of AI in practical applications, from writing code at Microsoft to generating language courses at Duolingo and aiding scientific research via FutureHouse, demonstrates its accelerating integration into various aspects of work and knowledge creation. Expect continued focus on reasoning, efficiency, safety, and real-world impact in the weeks ahead.

The Author

Luis Poveda’s AI Newsletter

Luis Poveda is a technology optimist and passionate innovator, constantly exploring and researching the latest trends. Based in Barcelona, he is currently focused on AI and developing a modern AI-driven IT network observability tool. He is also the creator and maintainer of Luis Poveda's AI Newsletter, where he curates and shares key insights on the evolving AI landscape.