"It runs on my own GPU, so it's basically free. " I believed that until I put a meter on it. So I ran a controlled benchmark on one box — an openSUSE machine with a single RTX 3090 — driving three local models through ollama under an identical fixed workload (256-token generations in a loop for ...
Source: [Dev.to](https://dev.to/sikamikanikobg/how-much-does-it-actually-cost-to-run-a-local-llm-eu-per-million-tokens-measured-jih)
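
The arithmetic behind a "EUR per million tokens" figure can be sketched as below. This is a minimal illustration, not the author's method: the function names and the example inputs are hypothetical, and the sketch assumes the only cost counted is electricity at a flat tariff, with average wall power and throughput already measured. The throughput helper assumes a token count and a generation duration in nanoseconds, which is the shape of the `eval_count` and `eval_duration` fields that Ollama's generate response is documented to return.

```python
def tokens_per_second(token_count: int, duration_ns: int) -> float:
    """Throughput from a token count and a duration in nanoseconds.

    Assumption: the inputs look like Ollama's `eval_count` and
    `eval_duration` response fields (duration in nanoseconds).
    """
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    return token_count / (duration_ns / 1e9)


def eur_per_million_tokens(
    avg_power_watts: float,
    throughput_tok_per_s: float,
    eur_per_kwh: float,
) -> float:
    """Electricity-only cost of generating one million tokens.

    Hypothetical helper: ignores hardware depreciation, cooling,
    and idle power outside the measured run.
    """
    if throughput_tok_per_s <= 0:
        raise ValueError("throughput_tok_per_s must be positive")
    # Seconds needed to produce one million tokens at this throughput.
    seconds = 1_000_000 / throughput_tok_per_s
    # Energy in kWh: watts * seconds = joules; 3.6e6 joules per kWh.
    energy_kwh = avg_power_watts * seconds / 3_600_000
    return energy_kwh * eur_per_kwh


if __name__ == "__main__":
    # Illustrative inputs only; these are NOT measurements from the article.
    tps = tokens_per_second(token_count=256, duration_ns=2_560_000_000)
    cost = eur_per_million_tokens(
        avg_power_watts=360.0,
        throughput_tok_per_s=tps,
        eur_per_kwh=0.25,
    )
    print(f"{tps:.1f} tok/s, {cost:.3f} EUR per million tokens")
```

Power should be measured at the wall rather than read from the GPU alone, since the CPU, fans, and power supply losses also end up on the bill.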