Tencent’s Hunyuan team has introduced Hunyuan-A13B, a new open-source large language model built on a sparse Mixture-of-Experts (MoE) architecture. While the model consists of 80 billion total parameters, only 13 billion are…
Category: AI
-
OpenAI Loses Four Key Researchers to Meta
Four OpenAI researchers are leaving the company to go to Meta, two sources confirm to WIRED.
Shengjia Zhao, Shuchao Bi, Jiahui Yu, and Hongyu Ren have joined Meta’s superintelligence team. Their OpenAI Slack profiles have been deactivated. The…
-
Getting started with Gemini Command Line Interface (CLI)
Google recently released the Gemini CLI, a powerful command-line tool designed to supercharge developer workflows with AI. Whether you’re working across massive codebases, automating tedious tasks, or generating new apps from…
-
The AI Backlash Keeps Growing Stronger
Before Duolingo wiped its videos from TikTok and Instagram in mid-May, social media engagement was one of the language-learning app’s most recognizable qualities. Its green owl mascot had gone viral multiple times and was well known to younger…
-
Alibaba Qwen Team Releases Qwen-VLo: A Unified Multimodal Understanding and Generation Model
The Alibaba Qwen team has introduced Qwen-VLo, a new addition to its Qwen model family, designed to unify multimodal understanding and generation within a single framework. Positioned as a powerful creative engine, Qwen-VLo enables…
-
LayerNorm and RMS Norm in Transformer Models
This post is divided into five parts: • Why Normalization is Needed in Transformers • LayerNorm and Its Implementation • Adaptive LayerNorm • RMS Norm and Its Implementation • Using PyTorch’s Built-in Normalization…
-
Getting Started with MLflow for LLM Evaluation
MLflow is a powerful open-source platform for managing the machine learning lifecycle. While it’s traditionally used for tracking model experiments, logging parameters, and managing deployments, MLflow has recently introduced…
-
Unbabel Introduces TOWER+: A Unified Framework for High-Fidelity Translation and Instruction-Following in Multilingual LLMs
Large language models have driven progress in machine translation, leveraging massive training corpora to translate dozens of languages and dialects while capturing subtle linguistic nuances. Yet, fine-tuning these models for…
-
OpenAI’s Unreleased AGI Paper Could Complicate Microsoft Negotiations
A small clause inside OpenAI’s contract with Microsoft, once considered a distant hypothetical, has now become a flashpoint in one of the biggest partnerships in tech.
The clause states that if OpenAI’s board ever declares it has developed…
-
MIT and Mass General Brigham launch joint seed program to accelerate innovations in health | MIT News
Leveraging the strengths of two world-class research institutions, MIT and Mass General Brigham (MGB) recently celebrated the launch of the MIT-MGB Seed Program. The new initiative, which is supported by Analog Devices…