Optimizing LLM Pre-Training: Muon, Latent Attention, and MoE in Practice
Muon is a geometry-aware optimizer that halves training time for large language models. It uses polar decomposition and spectral n...
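For readers curious what the polar-decomposition step looks like in practice, here is a minimal NumPy sketch of the Newton-Schulz orthogonalization commonly associated with Muon. The quintic coefficients and step count below are assumptions taken from the public reference implementation, not from this article; treat it as an illustration, not a production optimizer.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=5):
    """Approximate the orthogonal factor of the polar decomposition of G.

    Uses a quintic Newton-Schulz iteration that drives all singular
    values of G toward 1 while preserving its singular vectors.
    Coefficients (3.4445, -4.7750, 2.0315) follow the public Muon
    reference implementation (an assumption here).
    """
    a, b, c = 3.4445, -4.7750, 2.0315
    # Scale by the Frobenius norm so every singular value is <= 1,
    # which is required for the iteration to converge.
    X = G / (np.linalg.norm(G) + 1e-7)
    transpose = G.shape[0] > G.shape[1]
    if transpose:
        X = X.T  # work with the wide orientation for a smaller X @ X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    if transpose:
        X = X.T
    return X
```

In a Muon-style update, this routine would be applied to the momentum buffer of each 2D weight matrix before the learning-rate step, so the update has roughly uniform spectral scale in every direction.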