HALO-Loss is an open-source, drop-in replacement for cross-entropy loss that computes logits from Euclidean distances rather than dot products, bounding model confidence and enabling native out-of-distribution (OOD) detection without sacrificing base accuracy. The method targets a fundamental failure mode: neural networks hallucinate confident predictions on unfamiliar data. It constrains confidence via the finite distance to class representatives and provides an implicit "abstain class" at the origin of the latent space. Testing shows zero accuracy drop, improved calibration (ECE down to 1.5%), and significantly fewer false positives on far-OOD detection than standard approaches.
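The geometric idea can be sketched in a few lines: score each class by the negative Euclidean distance to a class centroid, and add one extra logit for the distance to the origin so near-origin features fall into the abstain class. This is an illustrative NumPy rendering of the distance-vs-dot-product idea, not HALO-Loss's actual implementation; the centroid values and inputs below are made up.

```python
import numpy as np

def distance_logits(z, centroids):
    """Logits as negative Euclidean distances to class centroids.

    An extra "abstain" logit is the negative distance to the origin,
    so features near the origin win no real class with confidence.
    """
    dists = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=-1)
    abstain = np.linalg.norm(z, axis=-1, keepdims=True)
    return -np.concatenate([dists, abstain], axis=1)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Two classes whose centroids sit away from the origin (illustrative values).
centroids = np.array([[3.0, 0.0],
                      [0.0, 3.0]])

in_dist = np.array([[2.9, 0.1]])   # feature near the class-0 centroid
far_ood = np.array([[0.1, 0.1]])   # feature near the origin

p_in = softmax(distance_logits(in_dist, centroids))    # class 0 dominates
p_ood = softmax(distance_logits(far_ood, centroids))   # abstain logit wins
```

The contrast with dot-product logits is that scaling a feature vector up scales dot-product confidence without bound, while distance-based logits stay bounded by the geometry of the latent space, which is what makes the abstain behavior possible.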
The Servo browser engine is now available on crates.io as an embeddable library, so Rust developers can integrate it into their applications. The post demonstrates practical usage, including a CLI screenshot tool, and explores compiling to WebAssembly, though fully compiling Servo itself to WebAssembly isn't feasible due to threading and dependency constraints.
Practical guide to multimodal embedding and reranker models that extend traditional RAG pipelines to handle text, images, and other modalities in a shared embedding space. Covers model loading, encoding mixed-modality inputs, and computing cross-modal similarities with concrete code examples and performance considerations.
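The retrieval step in such a shared embedding space reduces to cosine similarity between unit-normalized vectors, regardless of which modality produced them. The sketch below uses random vectors as stand-ins for a real multimodal encoder's outputs (the encoder call itself is omitted; the dimension is an assumption):

```python
import numpy as np

def normalize(x):
    """L2-normalize rows so cosine similarity becomes a plain dot product."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
dim = 512  # assumed embedding width; a real encoder fixes this

# Stand-ins for encoder outputs: one text query, a mixed text/image corpus.
# In a real pipeline both would come from the same multimodal model.
query = normalize(rng.normal(size=(1, dim)))
corpus = normalize(rng.normal(size=(5, dim)))

scores = query @ corpus.T            # cross-modal similarities, shape (1, 5)
ranking = np.argsort(-scores[0])     # corpus indices, best match first
```

A reranker then refines this first-stage ranking by scoring each (query, candidate) pair jointly rather than comparing precomputed vectors, trading latency for accuracy on the top-k results.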
MemPalace is an open-source local AI memory system that stores raw conversation transcripts in ChromaDB without summarization, achieving 96.6% on the LongMemEval benchmark. It organizes conversations hierarchically (wings/halls/rooms) for semantic searchability and includes an experimental AAAK compression dialect for handling repeated entities at scale, though the developers transparently document current limitations (84.2% recall with AAAK vs 96.6% with raw storage).
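The wings/halls/rooms idea can be approximated as path-style metadata on each raw transcript chunk, so semantic search can be scoped to one branch of the hierarchy. The sketch below is a toy in-memory version under assumed names, not MemPalace's actual API or ChromaDB usage; the stand-in `embed` just derives a deterministic vector from the text.

```python
import numpy as np

store = []  # list of (embedding, hierarchy_path, raw_text)

def embed(text, dim=64):
    # Stand-in embedding: deterministic per-text random vector.
    # A real system would call a sentence encoder here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

def add(text, path):
    """Store a raw, unsummarized chunk tagged with its wing/hall/room path."""
    store.append((embed(text), path, text))

def search(query, wing=None, top_k=2):
    """Rank chunks by cosine similarity, optionally scoped to one wing."""
    q = embed(query)
    hits = [(float(q @ e), p, t) for e, p, t in store
            if wing is None or p.startswith(wing + "/")]
    return sorted(hits, reverse=True)[:top_k]

add("user: my flight to Oslo is on May 3", "travel/flights/booking")
add("user: prefers aisle seats", "travel/flights/preferences")
add("user: works at Acme Corp", "work/employer/facts")

results = search("my flight to Oslo is on May 3", wing="travel")
```

Keeping the raw text (rather than a summary) as the stored payload is the design choice the benchmark numbers above reward: nothing is lost at write time, and the hierarchy only narrows where search looks.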
open-multi-agent is a lightweight TypeScript multi-agent orchestration framework with only three runtime dependencies, designed for goal-driven agent coordination in Node.js environments. It offers a simpler alternative to LangGraph's declarative graph approach and the Python-based CrewAI, with built-in support for structured output, task retries, and human-in-the-loop workflows.
TRL v1.0 introduces architectural lessons for building stable post-training libraries that can adapt as methods evolve from PPO to DPO to RLVR approaches. The library design prioritizes flexibility over fixed abstractions, recognizing that core concepts like reward models shift between being fundamental, optional, or reimagined as verifiers across different training paradigms.
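The PPO-to-DPO shift is a concrete example of why the reward-model abstraction moves: DPO needs no separate reward model at all, only log-probability ratios between the policy and a frozen reference. A minimal NumPy rendering of the DPO objective (the per-sequence log-probabilities below are illustrative numbers, not TRL code):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO objective: -log sigmoid(beta * margin), where the margin is the
    policy-vs-reference log-ratio of the chosen completion minus that of
    the rejected one. The implicit reward is beta * (logp - ref_logp)."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# Policy already prefers the chosen completion relative to the reference:
loss_good = dpo_loss(logp_chosen=-10.0, logp_rejected=-14.0,
                     ref_chosen=-12.0, ref_rejected=-12.0)
# Policy prefers the rejected completion: the loss is larger.
loss_bad = dpo_loss(logp_chosen=-14.0, logp_rejected=-10.0,
                    ref_chosen=-12.0, ref_rejected=-12.0)
```

Because the "reward" here is implicit in the log-ratio, a library built around an explicit reward-model class would have nothing to plug it into, which is exactly the kind of abstraction drift the post argues a post-training library must survive.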
A curated directory of production-ready open-source AI tools and libraries organized by category (core frameworks, models, inference, agents, RAG, training, deployment, benchmarks, safety). Highlights practical CLI tools like PR-Agent, Gemini CLI, LLM, and Repomix that directly integrate AI into developer workflows.
llm.c is a high-performance C/CUDA implementation of LLM pretraining that eliminates heavy dependencies like PyTorch and Python while training about 7% faster than PyTorch Nightly. It provides clean reference implementations for reproducing GPT-2/GPT-3 models, with both GPU (CUDA) and CPU code paths, making it valuable for understanding model-training mechanics and CUDA optimization.