Jelly48

We picked Rust for the platforms. The surprise payoff was the AI collaboration.

June 11, 2026

Two years of received wisdom says you pick Rust for performance and pay for it in development speed. We picked it for a narrower reason: we wanted one physics engine — a 2D soft-body XPBD solver — to ship inside a browser game, a native iOS app, and a desktop dev build without maintaining three engines. That bet paid off about how we hoped. The bet we didn't know we were making is on how much day-to-day development would happen — as it now does for most developers — in collaboration with AI coding tools. Rust turned out to be a quietly enormous advantage there, for reasons that have nothing to do with performance.

The multiplatform bet, with receipts

The stack is wgpu (GPU abstraction), winit (windowing), and our own engine crate. The same Rust compiles to:

Web: wasm via wasm-bindgen, rendering through WebGPU — this is Jelly48, live today.
iOS: a static library inside a thin Xcode shell, rendering through Metal. Rust owns main and the render loop; Swift exists only as a bridge for platform services (ads, StoreKit, haptics, iCloud, audio session).
macOS desktop: the development build, same winit window, same Metal.

The honest headline: when we first pointed our second game (a Suika-like called Jellygon, coming to iOS) at the iPhone simulator, it ran with zero engine or game-code changes. Not "minor porting" — zero. The day-one gotcha list was entirely build plumbing: the iOS simulator SDK has to be on SDKROOT, a static library links where a cdylib won't, and one machine's conda environment exported an LD that broke Apple's linker. Annoying, Googleable, and all outside the language.

The architecture that makes this stay true is worth stating because it's boring: platform services live behind one bridge module of extern "C" functions, cargo-feature-gated, with no-op fallbacks. The Swift side is ~600 lines for a full commercial surface (AdMob with consent flow, StoreKit 2, iCloud key-value sync, an AVAudioEngine port of the web's audio synth). Game logic never branches on platform; it calls haptic(0.6) and somewhere that's a Taptic Engine or nothing.

Two costs, honestly: WebGPU-in-the-browser is the weakest leg (one mobile Safari rendering stall we ultimately accepted as the platform's floor and solved by going native — which was the plan anyway), and wasm/Xcode build glue is real ongoing friction that a Unity developer never sees. We'd take the trade again.

The collaboration payoff nobody sold us

Here's the part we didn't plan. Like most developers now, we work with AI agents throughout — features, refactors, test infrastructure, debugging. The question that actually determines whether that collaboration is productive isn't "can the AI write code?" — it's where your attention goes while it does. In permissive languages, the human becomes the linter: reviewing for null slips, type confusions, half-remembered APIs — mechanical vigilance, the least valuable use of a developer's judgment. After months of building this way, our strong opinion: Rust restructures the division of labor, and the reasons are structural, not vibes.

1. The compiler is the agent's first reviewer — so you don't have to be. An agent's failure mode isn't typos; it's confident code that's subtly wrong: a use-after-move, a forgotten platform branch, an API misremembered from training data. Rust converts a remarkable share of those from runtime surprises into compile errors. When we widened a platform module to a new target, the compiler enumerated every call site that assumption broke — the fix was exactly that list and nothing else. In a dynamic language those would have been six production incidents on a slow fuse — or six things a human reviewer had to catch by being careful. The borrow checker that famously fights human beginners is, in a collaborative loop, a tireless first-pass reviewer that never lets attention drift. Agents don't get frustrated by it; they fix the error and move on — and the human review that follows gets to be about design and taste instead of defect-hunting.

2. Everything is greppable, because everything is explicit. Agents navigate codebases by search, not memory. Rust rewards that: no monkey patching, no runtime registration, no spooky action at a distance. The public surface of a module is literally the list of pub fns. Options force the "what if it's missing" conversation at the call site, in text an agent can see, instead of in a null check someone forgot.

3. Cargo means the agent already knows your project layout. Build is cargo build, tests are cargo test, harnesses live in examples/, features gate platforms. Our headless tools — a deterministic repro harness, a bot playtester, a GPU benchmark — are all just cargo run --example …. There's no bespoke build system for the agent to misunderstand, which in practice is a bigger deal than it sounds: agents waste most of their errors on project-shape guesses, not logic.

4. Determinism lets debugging and verification happen autonomously. This is the deepest one. The collaboration ceiling isn't set by how well an agent writes code — it's by how much of the verify-debug-retry loop it can run without you. An agent's claim that "this change is safe" is worth nothing; a bit-identical replay is worth everything. Because the sim is deterministic and seedable, we could replay an identical scripted game on the engine as last deployed versus current head and diff the output — byte for byte — to prove a batch of engine changes had zero effect on the shipped game. The same property powers the rest of our verification pyramid, all of it agent-runnable: physics soak tests, a bot playtest harness that found a latent production bug a 25-second regression suite never could, and browser-level probes driving the real wasm bundle in headless Chrome. The agent doesn't ask to be trusted; it runs the gauntlet and reports green — and the human enters the loop at "review the findings," not "reproduce the bug."

A fair objection: couldn't you build deterministic harnesses in any language? Yes — but Rust's culture and constraints mean you actually do. No global mutable state sneaking in, Send/Sync policing the threading, one toolchain to wire it all together. Much of that harness code came out of AI-assisted sessions too, which is the loop closing: the language makes it cheap to build the tools that make the collaboration trustworthy.

5. Where it cuts the other way. Compile times tax the edit-verify loop (ours run 5–15 s incremental — fine; a big workspace would hurt). Generated code sometimes over-clones to satisfy the borrow checker, and an agent will happily ship the clone where a human would restructure — taste remains your job. And training data for wgpu/winit moves fast enough that agents reach for last year's API and need the compiler to walk them forward. Note what all three have in common: the failure is visible at build time or in review, not latent in production.

The combined thesis

These two payoffs compound. Multiplatform-from-one-codebase means there is one codebase for the agent to learn, one test suite that vouches for every target, one deterministic sim that answers "did anything change?" for web, iOS, and desktop simultaneously. When an afternoon of AI-assisted engine work is provably bit-identical for the shipped game and soak-tested for the unshipped one, a small team working with agents stops being a compromise and starts being a structural advantage: the marginal cost of the second platform — and the second game — collapses, and the humans spend their hours on the parts that actually need a human.

We didn't choose Rust because it's a good language to share with machines. But if you're starting a project today expecting AI collaboration to be part of the workflow — and statistically, you are — that property deserves a slot on your decision matrix right next to performance. Possibly above it.

The engine all of this describes is running in your browser tab away — soft bodies, WebAssembly, and all.

Play Jelly48 →