Apple released a comprehensive technical report detailing how it built its latest artificial intelligence models, offering rare insights into the company’s approach to competing in the increasingly crowded AI landscape. The 2025 Apple Intelligence Foundation Language Models Tech Report reveals significant architectural innovations and training improvements that could help close the gap with competitors like OpenAI and Google.
Apple Intelligence, the company’s suite of AI-powered features launched in 2024, has faced criticism for limited language support and perceived lag behind rivals. However, this technical deep-dive demonstrates Apple’s continued investment in both on-device processing and cloud-based AI capabilities, with particular emphasis on privacy-conscious design choices that differentiate it from competitors.
The report covers everything from model architecture to data sourcing, revealing four particularly noteworthy developments that signal Apple’s evolving AI strategy.
Apple’s on-device AI model, which contains approximately 3 billion parameters (the mathematical weights that determine how an AI system processes information), uses an innovative two-block architecture that significantly improves performance on memory-constrained devices like iPhones and iPads.
The first block contains 62.5% of the model's transformer layers, the core computational units that process language, while the second block contains the remaining 37.5% with the attention layers' key and value projections removed to cut memory requirements. According to the report, layers in the second block instead reuse the keys and values already computed and cached by the first block. Think of it like a high-performance engine in which certain parts can be streamlined without sacrificing overall power.
This architectural split delivers measurable gains. The key-value cache, the temporary store of attention state that grows with every generated token, requires 37.5% less memory, and the time to produce the first token of a response dropped by approximately the same proportion. These improvements are crucial for keeping AI features responsive on devices with far less processing power than cloud-based systems.
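To make the idea concrete, here is a minimal PyTorch sketch of the pattern the report describes. This is an illustration of key-value cache sharing, not Apple's implementation: a second-block layer keeps a query projection but has no key or value projections of its own, attending instead over keys and values cached by the first block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKVAttention(nn.Module):
    """Attention for a second-block layer: no key/value projections,
    so the layer contributes nothing to the KV cache (a sketch, not
    Apple's actual code)."""
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.out_proj = nn.Linear(dim, dim, bias=False)

    def forward(self, x, cached_k, cached_v):
        b, t, d = x.shape
        hd = d // self.n_heads
        q = self.q_proj(x).view(b, t, self.n_heads, hd).transpose(1, 2)
        # Keys and values come from the first block's cache, so this
        # layer stores no attention state of its own.
        out = F.scaled_dot_product_attention(q, cached_k, cached_v)
        return self.out_proj(out.transpose(1, 2).reshape(b, t, d))

# With a hypothetical 16-layer model, the 62.5 / 37.5 split is 10 and 6:
n_layers = 16
block1_layers = int(n_layers * 0.625)      # 10 layers with full attention
block2_layers = n_layers - block1_layers   # 6 layers reusing block 1's cache
```

Because only the first block's layers write keys and values, the cache shrinks in proportion to the second block's share of layers, which is where the 37.5% memory figure comes from.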
The design reflects Apple’s broader philosophy of bringing sophisticated AI capabilities directly to user devices rather than relying entirely on cloud processing, which offers privacy benefits but requires more efficient use of hardware resources.
For tasks requiring more computational power, Apple developed an entirely new cloud-based architecture called Parallel-Track Mixture-of-Experts (PT-MoE), designed specifically for its Private Cloud Compute platform—Apple’s privacy-focused server infrastructure.
A conventional transformer pushes every token through one deep stack of layers in sequence, like reading a book page by page. Apple's innovation breaks that stack into multiple parallel tracks that work simultaneously, similar to having several readers tackle different sections of the same document at once, then combining their insights.
The system also employs “mixture of experts” technology, where specialized sub-networks activate only when needed. If you ask about cooking, only culinary-focused components engage, while others remain dormant. This modular approach reduces processing bottlenecks and improves response speed without sacrificing accuracy.
Each parallel track contains its own specialized experts, preventing the coordination delays that typically occur when multiple components must synchronize across an entire system. The result is a more efficient architecture that can scale up processing power while maintaining responsiveness.
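A stripped-down sketch of the two ideas together, parallel tracks and per-track experts, might look like the following in PyTorch. Everything here (the track count, expert count, top-1 routing, and averaging at the merge point) is an illustrative assumption rather than a description of PT-MoE's internals.

```python
import torch
import torch.nn as nn

class TrackMoE(nn.Module):
    """One track: a feed-forward mixture-of-experts where a router
    activates only the single best-scoring expert per token."""
    def __init__(self, dim: int, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                          nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, x):
        top = self.router(x).argmax(dim=-1)   # top-1 routing per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top == i
            if mask.any():
                out[mask] = expert(x[mask])   # only the chosen expert runs
        return out

class ParallelTracks(nn.Module):
    """Tracks process the same hidden states independently and are
    merged only at the track boundary, limiting synchronization."""
    def __init__(self, dim: int, n_tracks: int = 2):
        super().__init__()
        self.tracks = nn.ModuleList([TrackMoE(dim) for _ in range(n_tracks)])

    def forward(self, x):
        # Each track has its own experts; outputs are combined once at
        # the end rather than coordinated layer by layer.
        return torch.stack([track(x) for track in self.tracks]).mean(dim=0)

# Usage: y = ParallelTracks(dim=64)(torch.randn(2, 8, 64))
```

The design point the sketch captures is that routing decisions stay local to a track, so no global synchronization is needed until the tracks merge.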
Apple significantly enhanced language support beyond English, addressing one of the most common criticisms of Apple Intelligence's initial rollout. The company raised multilingual training data from 8% to 30% of the total dataset, a 275% relative increase that includes both naturally occurring content and artificially generated examples.
The model’s vocabulary also expanded by 50%, growing from 100,000 to 150,000 distinct tokens (the basic units of language that AI systems recognize and process). This expansion helps the system better understand linguistic nuances across different languages and cultural contexts.
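Because "a 275% increase" can read oddly next to "from 8% to 30%," the arithmetic is worth spelling out: both figures are relative increases over the old values, not percentage-point changes.

```python
# Relative increases computed from the report's before/after figures.
old_share, new_share = 0.08, 0.30
print(f"multilingual data share: +{new_share / old_share - 1:.0%}")  # +275%

old_vocab, new_vocab = 100_000, 150_000
print(f"tokenizer vocabulary:    +{new_vocab / old_vocab - 1:.0%}")  # +50%
```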
Crucially, Apple tested these improvements using prompts written by native speakers rather than translations, ensuring the AI performs well with authentic language use rather than artificially translated text. The company evaluated both accuracy and how natural responses sound in local contexts—important factors for user adoption in international markets.
These enhancements should improve the reliability of features like Writing Tools across supported languages, potentially expanding Apple Intelligence’s global appeal and competitive position in non-English-speaking markets.
Apple’s training data strategy demonstrates its commitment to user privacy while building competitive AI capabilities. The company sources information from four primary categories, each reflecting different aspects of its approach to responsible AI development.
Web crawling supplies the largest portion of training data, collected through Applebot, Apple's web crawler, which respects robots.txt files, the technical instructions websites use to control automated access. This means publishers who don't want Apple using their content can opt out, unlike with some competitors that scrape web content regardless of such preferences.
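Publishers can verify their own opt-out with the robots.txt parser in Python's standard library. "Applebot" is the crawler's real user-agent token; the domain and path below are hypothetical placeholders.

```python
from urllib.robotparser import RobotFileParser

# Fetch and parse a site's robots.txt, then ask whether Applebot may
# crawl a given page. example.com stands in for a real site.
parser = RobotFileParser("https://example.com/robots.txt")
parser.read()
allowed = parser.can_fetch("Applebot", "https://example.com/articles/some-story")
print("Applebot allowed:", allowed)
```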
Licensed content from established publishers provides high-quality training material through negotiated agreements. While Apple doesn’t specify exact partnerships, previous reports indicated discussions with Condé Nast (publisher of The New Yorker, Vogue, and WIRED), NBC News, and IAC (People Magazine, The Daily Beast).
Synthetic data—artificially generated examples created by smaller AI models—plays a particularly important role in specialized areas like mathematics, coding, and multilingual support. This isn’t simply “made-up” information but carefully constructed examples that help AI systems learn specific skills and patterns.
Visual training data includes over 10 billion image-caption pairs, incorporating screenshots with text recognition and handwritten notes. Apple also used its own models to generate more detailed descriptions of images, improving the system’s ability to understand and describe visual content.
Apple’s technical innovations address several competitive challenges while maintaining its privacy-focused positioning. The architectural improvements enable more sophisticated on-device processing, reducing reliance on cloud services that competitors use extensively. This approach aligns with Apple’s broader privacy messaging while potentially reducing operational costs associated with cloud-based AI processing.
The multilingual expansion directly addresses market limitations that have hindered international adoption of Apple Intelligence. By significantly increasing non-English capabilities, Apple positions itself to compete more effectively in global markets where competitors like Google have established advantages.
However, these technical advances must still translate into user-facing improvements that can differentiate Apple’s AI offerings in an increasingly crowded market. The success of these innovations will ultimately depend on whether they enable compelling features that users value over alternatives from OpenAI, Google, and other AI providers.
The privacy-conscious data sourcing approach may also prove advantageous as regulatory scrutiny of AI training practices intensifies globally. Apple’s opt-out respect and licensing agreements could provide competitive advantages if regulators impose stricter requirements on how companies collect training data.