
Some optimization problems are not slow because the math is hard. They are slow because every evaluation is expensive — a radar shot, an ultrasonic ping, a quantum circuit execution, a power-flow solve. When measurements are the bottleneck, the question stops being “how good is my optimizer in the limit” and becomes “how good is it after twenty measurements.”
Most general-purpose optimizers spend measurements learning the shape of the landscape. Finite-difference methods, SPSA, and parameter-shift estimators all perturb the system and read it back, several times, to approximate a single gradient step. When each read is cheap, that overhead is invisible. When each read is a slice of instrument time, a shot on shared quantum hardware, or a destructive test, the overhead is the whole story.
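To make that overhead concrete, here is a minimal sketch that counts device reads for one gradient estimate. The `CountingDevice` class and its quadratic cost are hypothetical stand-ins for a real instrument, not anything from a specific library. The two estimators are the textbook central-difference and two-read SPSA forms.

```python
import numpy as np

class CountingDevice:
    """Hypothetical stand-in for an instrument: every call to measure() is one read."""
    def __init__(self, target):
        self.target = np.asarray(target, dtype=float)
        self.reads = 0

    def measure(self, x):
        self.reads += 1
        # Assumed toy cost: squared distance to an unknown target configuration.
        return float(np.sum((np.asarray(x, dtype=float) - self.target) ** 2))

def central_difference_gradient(device, x, eps=1e-3):
    """Perturb one parameter at a time: two reads per parameter."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (device.measure(x + step) - device.measure(x - step)) / (2 * eps)
    return grad

def spsa_gradient(device, x, rng, eps=1e-3):
    """Perturb all parameters at once along a random +/-1 direction: two reads total."""
    delta = rng.choice([-1.0, 1.0], size=len(x))
    diff = device.measure(x + eps * delta) - device.measure(x - eps * delta)
    return diff / (2 * eps * delta)

n = 8
x0 = np.zeros(n)

fd_device = CountingDevice(np.ones(n))
fd_grad = central_difference_gradient(fd_device, x0)

spsa_device = CountingDevice(np.ones(n))
spsa_grad = spsa_gradient(spsa_device, x0, np.random.default_rng(0))

print(fd_device.reads, spsa_device.reads)  # 16 2
```

With eight parameters, the finite-difference estimate costs sixteen reads before the optimizer has moved at all. SPSA cuts that to two, at the price of a noisier estimate. Either way, the reads go toward estimating a direction rather than toward moving.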
In those settings the bottleneck is not compute and it is not convergence theory. It is the number of times you are allowed to touch the device before you have to act.
If you can only afford a few hundred measurements, the asymptotic guarantee of an optimizer is irrelevant. What matters is anytime quality: how good is the best configuration you hold at the budget you can actually pay? A method that converges beautifully after a million evaluations and a method that is usable after fifty are answering different questions — and most real instruments live closer to fifty.
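Anytime quality is easy to state in code. The sketch below assumes noiseless cost readings, lower being better, and uses two made-up cost traces purely for illustration. The function names are hypothetical.

```python
from itertools import accumulate

def best_so_far(costs):
    """Running minimum: the best cost held after each measurement."""
    return list(accumulate(costs, min))

def quality_at_budget(costs, budget):
    """Best cost available if the run must stop after `budget` measurements."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    return min(costs[:budget])

# Hypothetical traces: one method finishes well, the other starts well.
slow_but_sure = [9.0, 8.5, 8.0, 7.5, 7.0, 0.1]
fast_start = [9.0, 3.0, 2.0, 1.5, 1.2, 1.0]

print(quality_at_budget(slow_but_sure, 4), quality_at_budget(fast_start, 4))  # 7.5 1.5
print(quality_at_budget(slow_but_sure, 6), quality_at_budget(fast_start, 6))  # 0.1 1.0
```

The ranking flips with the budget. At four measurements the fast starter wins, and at six the slow finisher does, which is why the budget has to be fixed before the methods are compared.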
SWC keeps the device configuration as retained state and moves it using the residual it actually measures, rather than spending extra measurements to approximate a gradient first. Each measurement is turned directly into the next step, so the earliest rounds are already heading toward the answer instead of mapping the terrain. That is the entire source of the efficiency: fewer measurements spent on bookkeeping, more spent on arriving.
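The text does not spell out SWC's update rule, so the following is not SWC. It is a generic residual-feedback loop, included only to show the general shape of "one measurement, one move" with retained state. The device model, the `step` gain, and the function name are all assumptions, and the loop relies on knowing the sign and rough size of the device's response.

```python
def track_setpoint(read_device, setpoint, x0=0.0, step=0.25, rounds=20):
    """Generic residual-driven loop (illustrative only, not SWC's actual rule).

    The configuration x is retained between rounds, and each round spends
    exactly one measurement, whose residual is used directly as the move.
    """
    x = x0
    residuals = []
    for _ in range(rounds):
        y = read_device(x)           # the only measurement this round
        residual = setpoint - y      # what was actually measured vs. what is wanted
        x += step * residual         # no extra reads to estimate a gradient
        residuals.append(abs(residual))
    return x, residuals

# Hypothetical device: output = 2*x + 1, so the setpoint 5.0 is reached at x = 2.0.
x_final, residuals = track_setpoint(lambda x: 2.0 * x + 1.0, setpoint=5.0)
print(residuals[:4])  # [4.0, 2.0, 1.0, 0.5]
```

On this toy device the residual halves every round starting from the first measurement. No reads are spent on perturbations.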
In simulation, the advantage shows up exactly where you would expect: tight measurement budgets, drifting targets, and delay-bound loops. When holding a measured output at a setpoint under drift, at one measurement per round, SWC tracks 3 to 10 times tighter than tuned PI control and one-measurement SPSA. On related-task streams, retained state cuts re-convergence by up to fivefold. Its convergence cost scales as roughly n^1.3 in problem size n, which is sub-quadratic, so the gap over sequential digital search widens as problems grow. Past the budget crossover, on static problems with measurements to spare, classical methods catch up, and that is fine.
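For a rough sense of what the scaling claim implies, the sketch below compares n^1.3 against a quadratic baseline. The quadratic exponent for the baseline and the equal constant factors are assumptions made here for illustration. The text only says the cost is sub-quadratic.

```python
def cost_ratio(n, baseline_exponent=2.0, reported_exponent=1.3):
    """Ratio of an assumed n**2 baseline cost to the reported n**1.3 cost.

    Constant factors are assumed equal, so only the trend is meaningful.
    """
    return n ** (baseline_exponent - reported_exponent)

for n in (10, 100, 1000):
    print(n, round(cost_ratio(n), 1))
# 10 5.0
# 100 25.1
# 1000 125.9
```

Under those assumptions the gap grows as n^0.7: about 5 times at ten parameters and about 126 times at a thousand.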
The honest framing is the useful one: measurement-efficient optimization is not a universal speedup. It is a decisive advantage in the regime where measurements are the scarce resource — and a tie everywhere else.