DeepMind Blog · 3h ago · 8 · new model api update inference

Gemini 3.1 Flash TTS, Google's latest text-to-speech model, introduces granular audio tags for precise vocal control across 70+ languages with improved naturalness (Elo score 1,211 on benchmarks). Developers can now embed natural language commands directly in text to control style, pacing, and delivery, with all audio watermarked using SynthID, available in Google AI Studio, Vertex AI, and Google Vids.

OpenAI Blog · 9h ago · 8 · tool agent api update deployment

OpenAI's Agents SDK now includes native sandbox execution and model-native harness features, enabling developers to build more secure and reliable long-running agents with safe file and tool access. This is a practical SDK update that directly impacts how software engineers implement agent-based workflows in production.

Simon Willison · 22h ago · 6 · new model fine tuning api update

OpenAI released GPT-5.4-Cyber, a fine-tuned variant optimized for defensive cybersecurity use cases, along with a Trusted Access for Cyber program using identity verification for reduced-friction access. The announcement emphasizes OpenAI's existing cybersecurity work and self-service verification, though premium tools still require application approval similar to competing offerings.

Anthropic Blog · 1d ago · 10 · new model api update agent inference benchmark

Claude Opus 4.6 releases with major improvements for AI engineers: 1M token context window in beta, enhanced agentic task capabilities, state-of-the-art coding performance on Terminal-Bench 2.0, and new developer features including adaptive thinking, context compaction, and effort controls for managing cost/intelligence tradeoffs. Available immediately on API at same pricing ($5/$25 per million tokens) with new product integrations like Claude Code agent teams and PowerPoint support.

Anthropic Blog · 1d ago · 9 · new model api update inference

Claude Sonnet 4.6 is now available with significantly improved coding, reasoning, and computer-use capabilities (including 1M token context window in beta), matching or exceeding Opus 4.5 performance while maintaining Sonnet's pricing. The model shows major improvements in consistency, instruction following, and real-world task automation—particularly for computer vision/interaction tasks across legacy software without APIs.

DeepMind Blog · 2d ago · 9 · new model api update agent benchmark

Google released Gemini Robotics-ER 1.6, a specialized embodied reasoning model for robotic systems with enhanced spatial understanding, multi-view reasoning, and new instrument-reading capabilities like gauge interpretation. The model is now available via the Gemini API with improvements in pointing, counting, task planning, and success detection—critical for physical agent autonomy.

Simon Willison · 5d ago · 6 · api update benchmark workflow

Technical analysis of OpenAI's capability gap between voice mode (GPT-4o era, April 2024 cutoff) and advanced reasoning models, highlighting how different access points reveal disparate model capabilities. References Andrej Karpathy's observation on the disconnect between consumer-facing voice interfaces versus specialized paid models excelling at code analysis and complex reasoning tasks.

OpenAI Blog · 5d ago · 5 · api update workflow

General overview of OpenAI's existing product portfolio (ChatGPT, Codex, APIs) and their applications across work and development contexts. While relevant to AI engineers, this reads as introductory content without specific technical updates, new capabilities, or implementation guidance.

Simon Willison · 6d ago · 8 · new model api update agent tool benchmark

Meta released Muse Spark, a new hosted AI model with Instant and Thinking modes, accessible via meta.ai with a private API preview. The model includes integrated tools for web search, image generation, code execution, and Meta content search, making it relevant for understanding multi-tool agent systems and comparing reasoning capabilities against current SOTA models like GPT-5.4 and Gemini 3.1.

Latent Space · 12d ago · 9 · new model open source agent inference api update

Google DeepMind released Gemma 4, a family of open-weight models (31B dense, 26B MoE, edge variants) under Apache 2.0 license with native multimodal support (text/image/video/audio), 256K context, and function calling—positioning it as a top-tier open model for reasoning, agents, and edge deployment. The 31B variant achieves competitive performance with significantly fewer parameters than rivals, with strong benchmarks on GPQA and AIME, and rapid ecosystem adoption already underway.

GitHub Trending AI · 14d ago · 6 · tool agent open source api update

Vibe-Trading is an open-source multi-agent finance workspace that converts natural language into trading strategies and market analysis, with support for multiple LLM providers and free data sources. It offers API, CLI, and MCP plugin interfaces for integration into AI agent workflows, with backtesting capabilities and multi-platform export for TradingView/MT5.

HuggingFace Blog · 14d ago · 8 · tool workflow api update deployment

gradio.Server enables building custom frontends (React, Svelte, vanilla JS) while leveraging Gradio's backend infrastructure including queuing, concurrency management, ZeroGPU support, and gradio_client compatibility. The approach extends FastAPI to provide both traditional Gradio UI components and full custom frontend flexibility with the same backend power.

DeepMind Blog · 20d ago · 8 · new model api update agent

Google released Gemini 3.1 Flash Live, an improved real-time audio model with better precision, lower latency, and enhanced tonal understanding for voice-first applications. Available via Gemini Live API, it achieves 90.8% on ComplexFuncBench Audio and 36.1% on Scale AI's Audio MultiChallenge, enabling developers to build voice agents that handle complex tasks with natural dialogue in noisy environments.

GitHub Trending AI · 20d ago · 8 · tool open source agent api update

An open-source MCP (Model Context Protocol) server that connects AI agents (Claude, GPT, Copilot) to 41 Brazilian government APIs covering economics, legislation, transparency, judiciary, elections, and more—38 APIs require no authentication. This is a practical tool for engineers building AI applications that need access to structured public sector data with ready-made integrations and natural language query capabilities.

DeepMind Blog · 21d ago · 7 · new model api update tool

Google released Lyria 3 Pro, an advanced music generation model supporting 3-minute tracks with structural awareness (verses, choruses, bridges). The model is available across multiple platforms including Vertex AI, Gemini API, Google AI Studio, and consumer apps, enabling developers to integrate custom music generation at scale.

OpenAI Research · 21d ago · 6 · workflow api update

OpenAI published a Model Spec that documents expected behavior, safety constraints, and design principles for their AI models. This provides engineers with official guidance on model capabilities and limitations, useful for understanding how to work within OpenAI's systems and for designing similar frameworks in their own applications.

GitHub Trending AI · 22d ago · 8 · tool open source api update inference deployment

apfel is an open-source tool that exposes Apple's on-device foundation model through a CLI, OpenAI-compatible API server, and shell integration—enabling local LLM inference on Apple Silicon Macs with no cloud dependency, API keys, or per-token billing. It supports tool calling via Model Context Protocol (MCP), includes demo shell scripts for practical workflows, and manages a 4096-token context window automatically.

GitHub Trending AI · 25d ago · 7 · api update tool inference

A curated resource listing LLM APIs with permanent free tiers for text inference, including first-party APIs from model trainers and third-party platforms hosting open-weight models. Covers rate limits, available regions, and notable models—useful reference for engineers exploring cost-free inference options during development and experimentation.

DeepMind Blog · 43d ago · 9 · new model api update inference

Google released Gemini 3.1 Flash-Lite, a new lightweight model optimized for high-volume production workloads at $0.25/1M input tokens and $1.50/1M output tokens. It delivers 2.5X faster time-to-first-token and 45% faster output speeds than 2.5 Flash while maintaining quality, making it ideal for real-time applications like translation, content moderation, UI generation, and agentic workflows at scale.

DeepMind Blog · 48d ago · 7 · new model api update inference

Google DeepMind released Nano Banana 2 (Gemini 3.1 Flash Image), a new image generation model combining advanced reasoning and world knowledge with Flash-speed inference. The model is now available across Google products (Gemini app, Search) and offers improved subject consistency, photorealism, and instruction-following capabilities with reduced latency compared to the Pro version.