DiffusionGemma 26B Pushes the GH200 to Its Performance Limit

Dev.to•Fri, Jun 19, 2026, 08:02 AM•2 min read

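What does a record-setting 1180 tok/s look like in practice? For a 256-token output, generation finishes in just 0.22 seconds, which means DiffusionGemma 26B running under vLLM on an NVIDIA GH200 is a full 80 times faster than on the M2 Max. This continues the first article in the series, which used an M2 Max 96GB (MLX) to explore "unlimited token freedom" for on-premises agents. There, Standard 4-bit reached a respectable peak of 31.6 tok/s, but with long contexts and concurrent requests from multiple users, the Mac's request queueing and memory bandwidth still fell short. In pursuit of production-grade deploy...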
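The arithmetic behind these figures can be checked with a short sketch. It assumes steady-state decoding and ignores prefill and queueing time; the helper name `generation_seconds` is illustrative and does not come from the original article. Against the 31.6 tok/s peak the ratio works out to roughly 37x, so the 80x figure presumably refers to a lower M2 Max baseline (about 14.75 tok/s) that the truncated summary does not state.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a steady throughput (prefill and queueing ignored)."""
    return num_tokens / tokens_per_second


# Figures quoted in the summary above.
gh200_tps = 1180.0    # GH200 + vLLM throughput
m2_peak_tps = 31.6    # M2 Max (MLX) Standard 4-bit peak
output_tokens = 256

gh200_latency = generation_seconds(output_tokens, gh200_tps)
m2_peak_latency = generation_seconds(output_tokens, m2_peak_tps)

# Ratio against the quoted M2 Max peak, and the baseline an 80x speedup would imply.
peak_ratio = gh200_tps / m2_peak_tps
implied_baseline_tps = gh200_tps / 80

print(f"GH200: {gh200_latency:.2f} s for {output_tokens} tokens")
print(f"M2 Max at peak: {m2_peak_latency:.2f} s for {output_tokens} tokens")
print(f"Speedup vs. M2 Max peak: {peak_ratio:.1f}x")
print(f"Baseline implied by 80x: {implied_baseline_tps:.2f} tok/s")
```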

Source: [Dev.to](https://dev.to/jh5_pulse/diffusiongemma-26b-tiao-zhan-gh200-xiao-neng-ji-xian-1b24)

This is an aggregated headline summary. For the complete report, visit the original publisher.
