Fast closed-loop controllers on the MCU?

IMO the way to do closed loop in Klipper is to modify stepcompress to output encoder positions at the position loop’s constant update rate, instead of step positions at a time-varying rate. Then each MCU is responsible for maintaining a bound on (basically) the I term of the position loop PID, and for throwing a shutdown or “motion stop” trsync signal if that bound is ever exceeded.
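
To make the MCU side of that concrete, here's a rough sketch of the idea in Python (real MCU code would be C, and none of these names exist in Klipper: `PositionLoop`, `MotionStop`, `i_limit` and the gains are all made up for illustration). It assumes a plain PI loop run at a fixed period, with the integral of position error as the quantity being bounded:

```python
from dataclasses import dataclass


class MotionStop(Exception):
    """Hypothetical stand-in for a shutdown / trsync "motion stop"."""


@dataclass
class PositionLoop:
    kp: float             # proportional gain (illustrative value)
    ki: float             # integral gain (illustrative value)
    dt: float             # fixed loop period, seconds
    i_limit: float        # bound on |integral of error|, counts * seconds
    integral: float = 0.0

    def update(self, target_counts: float, encoder_counts: float) -> float:
        """Run one fixed-rate tick and return the drive command."""
        error = target_counts - encoder_counts
        self.integral += error * self.dt
        # The safety check: if accumulated error leaves the allowed band,
        # stop instead of continuing to push harder.
        if abs(self.integral) > self.i_limit:
            raise MotionStop(f"integral {self.integral:.3f} exceeds {self.i_limit}")
        return self.kp * error + self.ki * self.integral


def ticks_until_stop(loop: PositionLoop, target: float, encoder: float, max_ticks: int):
    """Feed a constant target/encoder pair (a stalled axis); return the trip tick."""
    for tick in range(1, max_ticks + 1):
        try:
            loop.update(target, encoder)
        except MotionStop:
            return tick
    return None
```

An axis that tracks its target never accumulates much integral and runs forever; a stalled axis (constant error) trips the bound after a predictable number of ticks, which is what would let each MCU decide on its own when to bail out.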

So high-level planning, kinematics, and everything else change minimally. As long as your system is stiff enough not to bounce around too badly, and you tune it well and set the right bounds for said bouncing, multiple MCUs should stay in sync.

In practice, if you want coordinated multi-axis moves, you’ll need to tune the “lag term” per axis at speed to get circles that are actually circular, etc., but that’s doable automagically, I think.
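
To illustrate why the lag matters, here's a toy model in Python. It assumes the simplest possible reading of “lag term”: each axis follows its commanded position with a fixed time delay, and compensation means advancing that axis's setpoint by the same amount. A real servo loop behaves more like a filter than a pure delay, and the function names, lag values, and circle parameters are all invented for the example.

```python
import math


def circle(t: float, radius: float, omega: float):
    """Ideal XY position on a circle at time t."""
    return radius * math.cos(omega * t), radius * math.sin(omega * t)


def max_radial_error(lag_x: float, lag_y: float,
                     comp_x: float = 0.0, comp_y: float = 0.0,
                     radius: float = 20.0, speed: float = 200.0,
                     samples: int = 1000) -> float:
    """Worst deviation from the ideal radius (mm) over one revolution.

    Assumed axis model: actual(t) = commanded(t - lag), and
    commanded(t) = ideal(t + comp), so actual(t) = ideal(t - lag + comp).
    """
    omega = speed / radius              # angular rate, rad/s
    period = 2.0 * math.pi / omega
    worst = 0.0
    for i in range(samples):
        t = i * period / samples
        x, _ = circle(t - lag_x + comp_x, radius, omega)
        _, y = circle(t - lag_y + comp_y, radius, omega)
        worst = max(worst, abs(math.hypot(x, y) - radius))
    return worst


# X lags by 2 ms, Y by 5 ms (made-up numbers).
uncompensated = max_radial_error(0.002, 0.005)
compensated = max_radial_error(0.002, 0.005, comp_x=0.002, comp_y=0.005)
```

With mismatched lags the "circle" comes out as an ellipse; once each axis's setpoint is advanced by its own lag, the radial error in this model drops to floating-point noise. Auto-tuning would then amount to measuring each axis's lag at speed and feeding it back in as the compensation.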