r/MachineLearning · 16h ago · 7 · fine tuning open source tool research

Fine-tuned open-source TTS model (Chatterbox) for 8 Indian languages using LoRA adapters (1.4% parameters) and grapheme-level tokenization with Brahmic script warm-start initialization. Achieves sub-0.25 CER for most languages except Malayalam (0.86), demonstrating efficient multilingual adaptation without full model retraining or language-specific G2P pipelines.

r/MachineLearning · 22h ago · 7 · tool inference open source research

LARQL introduces a novel approach to decomposing LLM weight matrices into graph databases, enabling k-NN traversal as a mathematically equivalent alternative to matrix multiplication. This enables in-context knowledge updates without retraining and reduces memory footprint by replacing dense matrices with sparse graph structures, offering practical efficiency gains for model deployment and knowledge management.

The Batch · 1d ago · 7 · tool tutorial inference open source

SGLang is a framework for efficient inference optimization that supports both text and image generation workloads. This course provides practical training on deploying and optimizing models, which is directly relevant for engineers looking to improve inference performance and reduce latency in production AI applications.

The Batch · 1d ago · 7 · tool tutorial inference open source

SGLang is an open-source framework for efficient inference that supports both text and image generation with optimized serving capabilities. This course provides practical guidance on using SGLang to accelerate model inference, which is directly applicable for engineers building production AI systems.

r/MachineLearning · 1d ago · 8 · benchmark agent open source research

ClawBench is a new benchmark evaluating AI browser agents on 153 real-world tasks across live websites, revealing that even the best models (Claude Sonnet, GLM-5) achieve only 33% success rates. The benchmark provides comprehensive evaluation infrastructure with multi-layer behavioral data collection, request interception for safe testing, and an interactive leaderboard—offering practical insights for building and improving web-capable AI agents.

r/LocalLLaMA · 1d ago · 8 · new model open source inference tool

Baidu released ERNIE-Image, an 8B-parameter open-weight text-to-image diffusion model with strong instruction-following and text-rendering capabilities, alongside ERNIE-Image-Turbo optimized for fast inference (8 steps). The model is available via Hugging Face with practical examples for integration into workflows.

r/MachineLearning · 1d ago · 8 · dataset rag research open source benchmark

A software engineer has built a structured 20M+ Indian court case dataset with citation graphs, dense/sparse embeddings, and extracted metadata (judges, parties, sections, acts). The resource includes heuristic + LLM-based NER extraction pipeline, cross-referenced legislation, and serves as a novel evaluation benchmark for legal RAG systems and graph neural networks on low-resource legal domain data.

Latent Space · 1d ago · 6 · open source deployment benchmark

Community survey of popular open-weight models across local deployment use cases, highlighting Qwen 3.5, Gemma 4, DeepSeek V3.2, and others based on actual Reddit recommendations rather than benchmarks. Focuses on practical model selection for engineers building local inference systems, with specific callouts for coding (Qwen3-Coder-Next) and agentic workloads (MiniMax M2.5/M2.7).

r/MachineLearning · 1d ago · 8 · library research open source benchmark deployment

HALO-Loss is an open-source drop-in replacement for Cross-Entropy that uses euclidean distance instead of dot products to bound model confidence, enabling native out-of-distribution detection without sacrificing base accuracy. The method addresses a fundamental neural network problem where models hallucinate on unfamiliar data by mathematically constraining confidence to finite distances and providing an implicit "abstain class" at the origin of the latent space. Testing shows zero accuracy drop, improved calibration (ECE down to 1.5%), and significantly reduced false positives on far OOD detection compared to standard approaches.

r/MachineLearning · 1d ago · 7 · research open source inference benchmark

An indie developer trained a 1B parameter Spiking Neural Network (SNN) from random initialization for language modeling, achieving 93% sparsity and spontaneous cross-lingual emergence, challenging the conventional wisdom that direct SNN training requires ANN conversion or distillation. While early-stage (4.4 loss, 27k steps), this demonstrates a viable pathway for neuromorphic computing and inference efficiency, with code and checkpoint shared for community feedback.

Simon Willison · 2d ago · 7 · tool library open source

Servo browser engine is now available on crates.io as an embeddable library, enabling Rust developers to integrate it into applications. The post demonstrates practical usage including a CLI screenshot tool and explores WebAssembly compilation possibilities, though full Servo WebAssembly compilation isn't feasible due to threading and dependency constraints.

Simon Willison · 2d ago · 7 · tutorial inference open source tool

Practical walkthrough of running local audio transcription using Gemma 4 E2B model with MLX framework on macOS via uv run. Demonstrates real-world inference with a 10GB model and shows actual transcription output with accuracy notes, useful for developers building local AI audio pipelines.

r/LocalLLaMA · 3d ago · 7 · open source inference tool benchmark

This PR adds audio processing support to Gemma 4 models in llama.cpp using a USM-style Conformer encoder, with key fixes for CUDA/Vulkan/Metal backend compatibility. The implementation includes optimizations like replacing unsupported ops (ggml_roll → view+concat) and fixing contiguity issues that caused CPU fallbacks, achieving strong audio transcription results across different quantization levels and backends.

r/LocalLLaMA · 3d ago · 9 · new model open source agent deployment benchmark

MiniMax-M2.7 is a new open-source model with strong programming and agent capabilities, featuring self-evolving optimization during training and native multi-agent collaboration support. The model demonstrates exceptional performance on code tasks (SWE-Pro 56.22%, Terminal Bench 57.0%), system-level reasoning for SRE work, and achieves competitive benchmarks against GPT-5.3 and Claude variants while supporting deployment via SGLang, vLLM, and Transformers.

Simon Willison · 3d ago · 5 · tool open source

SQLite 3.53.0 release includes result formatting improvements via a new Query Results Formatter library, with a WebAssembly playground built using Claude Code. While SQLite is foundational infrastructure, this release focuses on general database improvements rather than AI-specific tooling or capabilities.

HuggingFace Blog · 6d ago · 8 · new model inference open source

Waypoint-1.5 is Overworld's improved real-time video world model now optimized for consumer hardware, running up to 720p/60fps on RTX 3090+ and 360p on broader gaming laptops/Apple Silicon. The model was trained on 100x more data than v1 with more efficient video modeling techniques, prioritizing interactive responsiveness and local deployment over pure visual fidelity.

HuggingFace Blog · 7d ago · 7 · agent workflow open source tool

ALTK-Evolve is a long-term episodic memory system for AI agents that distills interaction traces into reusable guidelines rather than storing raw transcripts, enabling agents to generalize principles across tasks. The framework shows significant improvements on multi-step API tasks (AppWorld benchmark) and integrates as a Claude Code plugin or with existing tools like Arize Phoenix and Codex without major stack changes.

HuggingFace Blog · 7d ago · 7 · tool open source deployment

Safetensors, the secure model weight format that replaced pickle-based serialization, is moving to PyTorch Foundation governance to become truly community-owned while remaining the de facto standard for model distribution across Hugging Face Hub. The move enables vendor-neutral stewardship and potential integration into PyTorch core, with no breaking changes for existing users but clearer paths for community contributors.

Simon Willison · 7d ago · 7 · new model open source benchmark

GLM-5.1, a 754B parameter open-weights model from Z.ai, demonstrates strong capabilities in multimodal generation and instruction-following, particularly for SVG/HTML creation tasks. The model can self-correct technical issues (CSS animations breaking SVG positioning) and generate well-structured code with detailed comments, making it worth testing for creative code generation workflows.

Latent Space · 8d ago · 7 · agent workflow prompt engineering open source

OpenAI's Ryan Lopopolo discusses 'Harness Engineering'—a methodology for building AI-native software where agents operate autonomously with zero human-written code, using >1B tokens/day and extensive prompt engineering via Symphony (a multi-agent orchestration system). The approach shifts focus from prompt optimization to building proper context, structure, and observability for agents to function as full teammates rather than copilots.