Show HN: Running Gemma-4 26B at 124 tokens/sec on a CPU, no GPU

Hacker News•Tue, Jun 30, 2026, 12:17 PM•2 min read

I wanted to know how fast a 26B mixture-of-experts model could run on a desktop CPU with no GPU. I got ~40 tok/s single-stream (lossless) and ~124 tok/s batched. The surprising part was the byte budget: for this model, the component worth compressing is the output head (32% of per-token bytes), not the experts (16%).
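To see why the byte budget points at the output head, here is a rough back-of-the-envelope sketch. It assumes single-stream decoding is limited by memory bandwidth, so throughput scales inversely with the bytes read per token. Only the 32% and 16% shares come from the post. The function name and the 2x compression ratio are hypothetical, chosen for illustration, and do not describe the author's actual method.

```python
def speedup_from_compression(share: float, ratio: float) -> float:
    """Amdahl-style estimate of decode speedup.

    share: fraction of per-token bytes taken by the component being compressed.
    ratio: how many times smaller that component becomes (hypothetical).
    Assumes throughput is inversely proportional to bytes read per token.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    # Bytes left per token, as a fraction of the original budget.
    remaining = (1.0 - share) + share / ratio
    return 1.0 / remaining


HEAD_SHARE = 0.32    # output head share of per-token bytes (from the post)
EXPERT_SHARE = 0.16  # expert share of per-token bytes (from the post)
RATIO = 2.0          # hypothetical compression ratio, for illustration only

for name, share in (("output head", HEAD_SHARE), ("experts", EXPERT_SHARE)):
    gain = speedup_from_compression(share, RATIO)
    print(f"compress {name} {RATIO:.0f}x: about {gain:.2f}x tokens/sec")
```

Under these assumptions, the same compression ratio buys roughly twice the gain when applied to the head as when applied to the experts, simply because the head accounts for twice as many bytes per token.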

Source: [Hacker News](https://apeg.dev/writing/running-gemma4-26b-on-a-cpu/)
