4 hours ago

How Nvidia Made Its ASR Models 3x Faster Than the Competition

Nvidia's Parakeet models sit about 3x ahead of the rest of the Hugging Face Open ASR Leaderboard on throughput while keeping accuracy competitive. The reason is the Token-and-Duration Transducer (TDT), a small modification to RNN-T that adds a second output head predicting how many encoder frames each token covers. Instead of advancing one frame at a time, the decoder skips ahead by the predicted duration. The result is up to 2.82x faster inference at a comparable or better word error rate.
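To make the frame-skipping idea concrete, here is a minimal toy sketch of greedy decoding with and without a duration head. It is not Nvidia's implementation: the `joint` callable, the `BLANK` id, the `max_symbols_per_frame` guard, and the `toy_joint` alignment (each token spanning four frames) are all assumptions made up for illustration. It only counts how many times the joint network would be called in each mode.

```python
from typing import Callable, List, Tuple

BLANK = 0  # hypothetical blank token id
TOY_TOKENS = [5, 7, 9]  # hypothetical token ids for the toy utterance


def greedy_decode(
    num_frames: int,
    joint: Callable[[int, List[int]], Tuple[int, int]],
    use_durations: bool = True,
    max_symbols_per_frame: int = 10,
) -> Tuple[List[int], int]:
    """Return (emitted tokens, number of joint-network calls)."""
    tokens: List[int] = []
    t = 0
    calls = 0
    symbols_at_t = 0
    while t < num_frames:
        token, duration = joint(t, tokens)
        calls += 1
        if not use_durations:
            # RNN-T style: a blank moves one frame on, a token stays on the frame.
            duration = 1 if token == BLANK else 0
        if token != BLANK:
            tokens.append(token)
            symbols_at_t += 1
        # Guard against stalling: a blank, or too many tokens on one frame,
        # must advance by at least one frame.
        if duration == 0 and (token == BLANK or symbols_at_t >= max_symbols_per_frame):
            duration = 1
        if duration > 0:
            t += duration
            symbols_at_t = 0
    return tokens, calls


def toy_joint(t: int, tokens: List[int]) -> Tuple[int, int]:
    # Hypothetical alignment: token k starts at frame 4*k and spans 4 frames.
    k = len(tokens)
    if k < len(TOY_TOKENS) and t == 4 * k:
        return TOY_TOKENS[k], 4
    return BLANK, 1


if __name__ == "__main__":
    print(greedy_decode(12, toy_joint, use_durations=True))
    print(greedy_decode(12, toy_joint, use_durations=False))
```

On this 12-frame toy input both modes emit the same three tokens, but the duration-aware loop calls the joint 3 times while the frame-by-frame loop calls it 15 times. The ratio is an artifact of the made-up alignment and says nothing about real-world speedups; the point is only that fewer decoder steps are needed when each step can jump several frames.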

Source: HackerNoon

