News Nug

Business

The Batch · 1d ago · 7 · tool tutorial inference open source

SGLang is a framework for efficient inference optimization that supports both text and image generation workloads. This course provides practical training on deploying and optimizing models, which is directly relevant for engineers looking to improve inference performance and reduce latency in production AI applications.

ML Research

The Batch · 1d ago · 7 · tool inference tutorial

SGLang is a framework for efficient inference optimization that handles both text and image generation workloads. This course provides practical training on reducing inference latency and computational costs, valuable for engineers deploying language and multimodal models in production.

Data Points

The Batch · 1d ago · 7 · tool tutorial inference open source

SGLang is an open-source framework for efficient inference that supports both text and image generation with optimized serving capabilities. This course provides practical guidance on using SGLang to accelerate model inference, which is directly applicable for engineers building production AI systems.

Andrew's Letter

The Batch · 1d ago · 7 · tool inference tutorial

SGLang is a framework for efficient inference optimization in both text and image generation tasks. The course covers practical techniques for reducing latency and resource consumption in LLM deployments, directly applicable to production AI systems.

AI Newsletter

The Batch · 1d ago · 7 · tool tutorial inference

New course on SGLang covering efficient inference techniques for both text and image generation. SGLang is a practical tool for optimizing LLM inference performance, making this relevant for engineers building production AI applications.

Gemma 4 audio with MLX

Simon Willison · 2d ago · 7 · tutorial inference open source tool

Practical walkthrough of running local audio transcription using Gemma 4 E2B model with MLX framework on macOS via uv run. Demonstrates real-world inference with a 10GB model and shows actual transcription output with accuracy notes, useful for developers building local AI audio pipelines.

Using custom GPTs

OpenAI Blog · 5d ago · 7 · tutorial workflow prompt engineering

Practical guide on building custom GPTs for workflow automation and maintaining consistent outputs through purpose-built AI assistants. Covers the technical process of creating and deploying specialized GPT configurations for specific use cases.

Writing with ChatGPT

OpenAI Blog · 5d ago · 5 · prompt engineering workflow tutorial

A guide on using ChatGPT as a writing assistant for content development through drafting, revision, and refinement workflows. While practical for daily writing tasks, it covers general LLM usage patterns rather than novel technical insights or advanced engineering techniques.

ChatGPT for research

OpenAI Blog · 5d ago · 6 · tutorial workflow prompt engineering

A tutorial on leveraging ChatGPT as a research assistant for source gathering, information analysis, and citation management. Covers practical workflows for using LLMs to structure research tasks, though the specific techniques may be familiar to those already working with prompt engineering and RAG patterns.

Analyzing data with ChatGPT

OpenAI Blog · 5d ago · 6 · tutorial workflow

A practical guide on using ChatGPT for data analysis workflows, covering dataset exploration, insight generation, and visualization creation. While useful for engineers integrating AI into analytics pipelines, it's general-purpose instruction rather than a new tool or technical breakthrough.

Research with ChatGPT

OpenAI Blog · 5d ago · 6 · tutorial workflow prompt engineering

Guide on leveraging ChatGPT's search and deep research capabilities to find current information, evaluate source credibility, and organize findings into structured outputs. Practical for engineers building research-heavy applications or integrating search features into AI workflows.

Creating images with ChatGPT

OpenAI Blog · 5d ago · 6 · prompt engineering workflow tutorial

Guide on using ChatGPT's image generation capabilities (DALL-E integration) with practical techniques for prompt engineering and iterative refinement. Covers workflow for creating visuals through the ChatGPT interface, useful for engineers building AI applications that need visual generation features.

Using projects in ChatGPT

OpenAI Blog · 5d ago · 6 · workflow tutorial

ChatGPT's Projects feature enables organizing related conversations, files, and custom instructions in a single workspace, improving workflow management and team collaboration. This is useful for engineers managing multiple AI-assisted tasks, though it's primarily a UI/UX feature rather than a technical capability advancement.

Multimodal Embedding & Reranker Models with Sentence Transformers

HuggingFace Blog · 6d ago · 8 · tutorial rag library inference

Practical guide to multimodal embedding and reranker models that extend traditional RAG pipelines to handle text, images, and other modalities in a shared embedding space. Covers model loading, encoding mixed-modality inputs, and computing cross-modal similarities with concrete code examples and performance considerations.

Components of A Coding Agent

Ahead of AI · 11d ago · 8 · agent workflow tutorial

Comprehensive reference on coding agent architecture covering six main building blocks of agentic systems (tool use, context management, memory, prompt caching, etc.) and how they differ from raw LLMs and reasoning models. Explains why systems like Claude Code outperform standalone models through their surrounding harness design rather than model capability alone.

claude-code-book — 《御舆：解码 Agent Harness》42万字拆解 AI Agent 的Harness骨架与神经 —— Claude Code 架构深度剖析，15 章从对话循环到构建你自己的 Agent Harness。在线阅读网站：

GitHub Trending AI · 15d ago · 7 · agent architecture tutorial workflow

A comprehensive Chinese technical guide ("御舆") that deconstructs AI Agent architecture, specifically analyzing Claude Code's design patterns including conversation loops, tool permission pipelines, context compression, and the Agent Harness runtime framework. Provides a transferable mental model for building production-grade agent systems across different frameworks without relying on prompt engineering tutorials.

how-claude-code-works — Deep dive into Claude Code internals — architecture, agent loop, context engineering, and more. / 深入解析 Claude Code 源码：架构、Agent 循环、上下文工程、工具系统等

GitHub Trending AI · 15d ago · 9 · agent architecture tutorial open source workflow

In-depth technical analysis of Claude Code's source architecture, covering the agent loop, context engineering, tool system, and production-grade error recovery strategies. Includes a companion project (Claude Code From Scratch) with ~4000 lines of TypeScript/Python and 11-chapter tutorial for building your own AI programming agent from scratch.

A Visual Guide to Attention Variants in Modern LLMs

Ahead of AI · 24d ago · 8 · research tutorial open source

Comprehensive reference guide organizing 45+ LLM architectures with visual model cards and detailed explanations of attention variants (MHA, GQA, sliding window, etc.) used in modern models. Includes both a web gallery and printable poster, serving as a practical learning resource for understanding contemporary transformer architectures.

ai-engineering-from-scratch — Learn it. Build it. Ship it for others.

GitHub Trending AI · 28d ago · 7 · tutorial workflow agent open source

A comprehensive AI engineering curriculum spanning 260+ lessons across 20 phases (~290 hours) covering fundamentals from linear algebra to autonomous agent swarms in Python, TypeScript, Rust, and Julia. Each lesson produces reusable artifacts (prompts, skills, agents, MCP servers) that can be immediately integrated into AI coding workflows, with personalized learning paths based on existing ML/DL knowledge.

Categories of Inference-Time Scaling for Improved LLM Reasoning

Ahead of AI · 81d ago · 8 · inference prompt engineering tutorial research

Comprehensive overview of inference-time scaling techniques for LLMs, covering methods like chain-of-thought prompting, self-consistency, best-of-N ranking, and rejection sampling with verifiers. The author shares practical experimentation results (achieving 15% to 52% accuracy improvement) and categorizes approaches from both academic literature and proprietary LLM implementations, making it directly applicable to deployed systems.