When the GPU is 99% overhead: measuring WebGPU dispatch cost in a tiny physics engine
June 10, 2026
Our soft-body engine ran 176 GPU dispatches a frame to simulate 192 particles. We measured where the time went: frame cost fit a fixed ~1.7 ms plus ~12 µs per dispatch, independent of the workload. We ended up shipping a CPU solver in WebAssembly that runs the same frame 5–15× faster.
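To make the fit concrete, here is a minimal sketch that plugs the engine's 176 dispatches into the linear model. The two coefficients are the approximate values quoted above; the constant names and the `estimateFrameCostMs` helper are hypothetical, introduced only for illustration.

```typescript
// Approximate fit quoted in the post: frame cost ≈ fixed cost + per-dispatch cost.
// Both coefficients are rounded measurements, not exact constants.
const FIXED_COST_MS = 1.7;      // ~1.7 ms per frame, regardless of dispatch count
const PER_DISPATCH_MS = 0.012;  // ~12 µs per dispatch, expressed in milliseconds

// Hypothetical helper: estimated GPU frame cost in milliseconds.
function estimateFrameCostMs(dispatches: number): number {
  return FIXED_COST_MS + PER_DISPATCH_MS * dispatches;
}

const dispatchesPerFrame = 176;
const totalMs = estimateFrameCostMs(dispatchesPerFrame);
const dispatchShareMs = PER_DISPATCH_MS * dispatchesPerFrame;

console.log(`estimated frame cost: ${totalMs.toFixed(2)} ms`);
console.log(`per-dispatch portion: ${dispatchShareMs.toFixed(2)} ms`);
```

Under this model the 176-dispatch frame comes to roughly 3.8 ms, of which about 2.1 ms scales with the dispatch count and the rest is paid every frame no matter how little work is submitted.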