Error in syncemitter 'extruder' step generation

Basic Information:

Printer Model: Voron Trident 300
MCU / Printerboard: Octopus Pro v1
Host / SBC: RPI 4
klippy.log

klippy.zip (919.0 KB)

My printer was working fine yesterday, successfully finished a print, but today when trying to print I started to get:

b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b"Error in syncemitter 'extruder' step generation"
Exception in flush_handler
Traceback (most recent call last):
  File "/home/pi/klipper/klippy/extras/motion_queuing.py", line 198, in _flush_handler
    self._advance_flush_time(0., want_sg_time)
  File "/home/pi/klipper/klippy/extras/motion_queuing.py", line 156, in _advance_flush_time
    raise self.mcu.error("Internal error in stepcompress")
mcu.error: Internal error in stepcompress
Transition to shutdown state: Exception in flush_handler

very consistently (tried 8 times and it fails the same way every time).
Nothing has changed between yesterday and today.
Tried around 4 times and each time the error is the same and happens right when the toolhead starts extruding.

I replaced my EBB36 v1.2 board (I had a spare handy) and the print failed at the exact same place.
I’m at a loss here, since nothing has changed since yesterday.

Any help would be appreciated!

A quick look at your logs makes me feel your issue is in the Cartographer and/or Happyhare routines or hardware. Both are known to break on klipper upgrades.

You may want to reach out to them. Let us know if they are no help.

Good luck

Thanks for taking a look!
I haven’t upgraded anything in around a month and I’ve printed tens of hours with the current configuration and software versions.
It worked yesterday evening and stopped working today when trying to reprint the same file that worked yesterday.

What makes you think it’s cartographer or HH? Because the stack trace originates from extras folder?

Hi @Leonti ,

My guess is it’s HH since I’m not aware of Carto messing with the extruder. One way you could test would be to comment out the [include mmu/...] lines at the top of your printer.cfg, manually load filament to your toolhead, and run a test print.

If it succeeds, then the issue is likely in HH

If it fails, the issue is likely in Klipper but to be sure I would:

  • Download your mmu config folder somewhere to back it up
  • Run in SSH (this will uninstall HH):
cd ~/Happy-Hare
./install.sh -d
  • Run a test print again

I’ve had VERY mixed results using AI on klipper problems. I normally make it cite sources and half the time they are COMPLETE fabrications. Last time I tried it cited 2 git repos… both came back 404. Use the following at your own risk.

I asked AI (Claude) to summarize your shutdown logs. It barfed out the following:

Per-Shutdown Context

Log 1: klippy.shutdown03782 (~1167 seconds)

  • Print duration: ~1182 seconds (19.7 min)
  • Last activity: Cartographer home sequence ([cartographer] Touch 3: 0.1366)
  • Filament: At ~8.7g extruded
  • Temperatures: Bed 70°C (target), extruder 240°C, cartographer coil 33°C
  • Serial stats: 0 retransmits on main/MMU/Cartographer; 9 retransmits on main CAN
  • Notes: Stopped mid-skirt/brim phase after successful bed mesh calibration

Log 2: klippy.shutdown14907 (~490 seconds)

  • Print duration: ~501 seconds (8.35 min)
  • Last activity: TMC UART register read (extruder driver SPI query)
  • Filament: ~6.9g extruded
  • Temperatures: Bed 70°C, extruder 240°C, cartographer 52°C
  • Serial stats: 9 retransmits on main CAN; clean on others
  • Notes: Very short print, terminated early in skirt phase

Log 3: klippy.shutdown23686 (~469 seconds)

  • Print duration: ~479 seconds (7.98 min)
  • Last activity: Immediate; no TMC reads in final burst
  • Filament: ~6.9g extruded
  • Temperatures: Bed 70°C, extruder 240°C, cartographer 53°C
  • Serial stats: Clean on all buses except main (9 retransmits)
  • Notes: Earliest of the “short” failures; stopped at ~0.2mm layer height

Log 4: klippy.shutdown34894 (~538 seconds)

  • Print duration: ~548 seconds (9.13 min)
  • Last activity: get_clock exchange across all MCUs
  • Filament: ~8.6g extruded
  • Temperatures: Bed 80°C, extruder 220°C, cartographer 55.5°C
  • Serial stats: 0 retransmits (clean session)
  • Notes: Different bed temp (80°C) and lower extruder temp (220°C); still fails identically

Log 5: klippy.shutdown44127 (~839 seconds)

  • Print duration: ~849 seconds (14.15 min)
  • Last activity: TMC UART exchanges (extensive motor driver diagnostics)
  • Filament: ~3.7g extruded
  • Temperatures: Bed 80°C, extruder 220°C, cartographer 53.4°C
  • Serial stats: 9 retransmits on main CAN; otherwise clean
  • Notes: Longest run before failure; similar motion pattern to logs 2–3

Unified Analysis

Why It Fails

The error occurs in _advance_flush_time(), which manages the step compression buffer that queues motion commands to the MCU. When Klipper’s host-side motion planning gets ahead of (or falls behind) the MCU’s ability to process step commands, the buffer state becomes inconsistent. The raise self.mcu.error() is a safety fence—it stops everything rather than queue corrupted steps.

Common Precursors Across All 5

  1. All occur early in print (490–1167 seconds, mostly <600s)
  2. All involve active motion: Either homing/probing, skirt/brim, or both
  3. All show stable clock sync at shutdown (min_half_rtt ~0.000066–0.000164 s)
  4. All show clean I/O on non-CAN buses (zero retransmits on main MCU serial, except main reports 9)
  5. All have active Cartographer streaming (mcu_awake & mcu_task_avg show activity)

Hypotheses

Hypothesis 1: Motion Planner Starvation (Most Likely)

The motion queue is being drained faster than the host can refill it, causing _advance_flush_time() to detect a gap. Possible triggers:

  • Cartographer stream overload: The probe is consuming CPU cycles during homing, delaying motion refills.
  • Triple CAN bus contention: EBB + MMU + Cartographer all on the same CAN line, creating jitter in clock sync feedback loops.

Why This System Is Vulnerable

1. Cartographer Streaming Overhead (Primary)

The Cartographer probe streams sensor data continuously during homing. Unlike mechanical probes (silent), Cartographer generates CAN traffic:

[cartographer] Starting stream
[cartographer] Touch 1: 0.1285
[cartographer] Touch 2: 0.1285
[cartographer] Touch 3: 0.1366
[cartographer] Stopping stream

Each event = CAN bus interrupt. During the 900-point mesh, this creates 900 CAN events while the motion planner tries to transition to skirt/brim.

2. Single Extruder on CAN (Not Ideal)

Although there’s only one extruder, it’s on the EBB36 via CAN. The Octopus Pro v1 isn’t talking directly to the stepper driver; it’s going through CAN to EBB. This adds:

  • CAN latency (jitter in clock sync)
  • Serialization (EBB can only process one command at a time)
  • No priority buffering (Cartographer telemetry and extruder commands share bandwidth)

3. 30×30 Mesh Is Aggressive for This Setup

  • 900 probe cycles = 900 × ~500ms = 7.5 minutes of continuous Cartographer streaming
  • Each probe triggers Z-motor moves (stepper_z, z1, z2)
  • Post-probe, the mesh data is bicubic-interpolated (CPU-heavy on Pi4)
  • Then immediately skirt/brim starts (high motion frequency)

Perfect storm: Mesh post-processing still draining CPU when XY motion begins.

4. RPi 4 CPU Headroom

Klipper’s motion planner is single-threaded. When:

  • Cartographer driver processes telemetry
  • Bicubic mesh interpolation runs (matrix math)
  • MMU heartbeat polls
  • Moonraker/web UI runs in background

The motion planner gets context-switched out. Motion queue drains faster than it refills → stepcompress fails.

I think it is related to Happy Hare.

I was printing those prints with MMU_ENABLE = 0, once I went back to MMU_ENABLE = 1 I haven’t had this problem since (5 prints already).

I’m still confused though, since I did a print yesterday with MMU_ENABLE = 0 and it was successful. Same GCODE the next day didn’t work.
My only theory is that HH loads something it needs in memory with MMU_ENABLE = 1 so when I switched to MMU_ENABLE = 0 it was still there and the print worked.
On the next day I started my printer fresh, so whatever was loaded with MMU_ENABLE = 1 was now gone.

I just want to report I had the exact same issue as Leonti. No updates or changes made, mmu enable=0 was working fine one day. The next day I went to print something the same way and the same problems occurred.
After turning the MMU back on and using it, everything works fine. I have not tried going back to mmu enable=0 yet.

I understand this is probably something that should be brought up on the Happy Hare GitHub, but I figured I would at least post here so future troubleshooters can see this has happened to someone else.

Welcome Poisson,

It’s always a good idea to post a klippy.log after such a finding. From a klippy.log we may find reasons for that error.
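For anyone who wants to check quickly whether their own klippy.log shows the same signature, here is a minimal Python sketch. It only matches the two message formats quoted in this thread; the function name and the sample text are mine, not anything from Klipper itself.

```python
import re

# Patterns copied from the log excerpts quoted in this thread.
INVALID_SEQ = re.compile(
    r"stepcompress o=(\d+) i=(-?\d+) c=(\d+) a=(-?\d+): Invalid sequence"
)
EMITTER = re.compile(r"Error in syncemitter '([^']+)' step generation")


def find_stepcompress_errors(log_text):
    """Return (invalid_sequences, emitter_names) found in klippy.log text.

    Hypothetical helper for illustration; each invalid sequence is the
    (o, i, c, a) tuple of integers taken from the log line.
    """
    sequences = [
        tuple(int(group) for group in match.groups())
        for match in INVALID_SEQ.finditer(log_text)
    ]
    emitters = EMITTER.findall(log_text)
    return sequences, emitters


# Sample text taken from the first post in this thread.
sample = (
    "b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'\n"
    "b\"Error in syncemitter 'extruder' step generation\"\n"
)
seqs, names = find_stepcompress_errors(sample)
print(seqs, names)
```

To use it on a real log, read the file into a string (for example with `open(path, errors="replace").read()`) and pass that in instead of `sample`.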

Sure, here it is

klippy(12).zip (1.8 MB)

The only thing that I can say for now is that it seems that regardless of specific Klipper version, every time I see a similar error, there is an MMU defined in the config.

I guess the only similarity that I can pinpoint is:

b'stepcompress o=2 i=0 c=9 a=0: Invalid sequence'
...
stepcompress error info: {'queue_name': 'extruder', 'flush_time': 2105.6280557730765, 'step_gen_time': 2105.6780557730767, 'last_flush_time': 2105.3780557730765, 'last_step_gen_time': 2105.4280557730767, 'clear_history_time': 2074.046422175}
...
move 24: pt=2105.580410 mt=1.083333 sv=60.000000 a=0.000000 sp=(249.880000,141.259000,0.222402) ar=(0.000000,1.000000,0.000532)
move 25: pt=2106.663743 mt=0.052584 sv=60.000000 a=0.000000 sp=(249.880000,206.259000,0.256991) ar=(0.000000,0.999996,0.002988)
...
move 19: pt=2105.580410 mt=1.083333 sv=2.366362 a=0.000000 sp=(77.775630,0.000000,0.000000) ar=(1.000000,1.000000,0.000000)
move 20: pt=2106.663743 mt=0.052584 sv=2.366351 a=0.000000 sp=(80.339189,0.000000,0.000000) ar=(1.000000,1.000000,0.000000)

It seems to happen inside a move with constant velocity? Well.
So the open question is how itersolve ended up outputting infinite speed jumps.
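The "inside a move with constant velocity" observation can be checked with a bit of arithmetic on the numbers quoted above. This is only a sketch: I am assuming that `pt` is the move's start print time, `mt` its duration, `sv` its start velocity and `a` its acceleration, and the helper name is made up for illustration.

```python
# Values copied from the log excerpt above.
flush_time = 2105.6280557730765
step_gen_time = 2105.6780557730767

# (move number, pt, mt, sv, a) for the two toolhead moves in the excerpt.
# Field meanings are an assumption: pt = start print time, mt = duration,
# sv = start velocity, a = acceleration.
moves = [
    (24, 2105.580410, 1.083333, 60.0, 0.0),
    (25, 2106.663743, 0.052584, 60.0, 0.0),
]


def move_containing(t, move_list):
    """Return (move number, accel) of the move whose time span contains t."""
    for number, pt, mt, _sv, accel in move_list:
        if pt <= t < pt + mt:
            return number, accel
    return None


# Both timestamps fall inside move 24, which has zero acceleration.
print(move_containing(flush_time, moves))
print(move_containing(step_gen_time, moves))

# Move 24 ends where move 25 starts, so there is no gap between them.
gap = moves[1][1] - (moves[0][1] + moves[0][2])
print(abs(gap) < 1e-6)
```

Under those assumptions, the failing flush window sits well inside move 24 rather than at a junction between moves, which is what makes the zero-interval steps surprising.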

I can only guess that maybe there is an invisible RT distance change underneath?

-Timofey


Similar thread in klipper discord: Discord

Investigated and I have a suspicion:

Under the motion_queuing model that Klipper implemented late last year, I believe each stepper’s step generator has a scan window that looks ahead/behind print_time.

When the extruder (don’t forget, HH treats the MMU stepper as an “extruder” that is “attached” to the main extruder) is moved to a new trapq, which happens on sync / resync, I think that its scan window must be recomputed before any steps are emitted.

I believe HH skips that in the v3 branch. The extruder syncemitter emits its first steps against a stale scan window that overlaps the fence/history boundary, so the step clocks aren’t strictly increasing, and that is what produces the invalid sequence.
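To illustrate why step clocks that don't strictly increase show up as "Invalid sequence", here is a toy Python sketch. It assumes that `i`, `c` and `a` in the log line correspond to the interval, count and add of a queued step command (first step `interval` ticks after the previous one, with `add` applied to the interval after each step). It is not Klipper's stepcompress code, and the function names and the starting clock of 1000 are made up.

```python
def step_clocks(last_clock, interval, count, add):
    """Expand an (interval, count, add) sequence into absolute step clocks.

    Toy model for illustration only: each step lands `interval` ticks after
    the previous one, and `add` is applied to the interval after every step.
    """
    clocks = []
    clock = last_clock
    for _ in range(count):
        clock += interval
        clocks.append(clock)
        interval += add
    return clocks


def strictly_increasing(last_clock, clocks):
    """True if every step clock is later than the one before it."""
    previous = last_clock
    for clock in clocks:
        if clock <= previous:
            return False
        previous = clock
    return True


# The sequence from the log above: i=0 c=9 a=0.
bad = step_clocks(1000, 0, 9, 0)
# A made-up, well-formed sequence for comparison.
good = step_clocks(1000, 100, 9, -2)

print(bad, strictly_increasing(1000, bad))
print(good, strictly_increasing(1000, good))
```

In this model the `i=0 c=9 a=0` sequence asks for nine steps on the same clock tick, which matches the "infinite speed jump" reading earlier in the thread.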

Happy Hare v4 has reworked that area substantially and this issue is corrected. It has moved the MMU stepper model to a completely different approach that avoids swapping the extruder stepper between two toolheads’ trapqs. V4 doesn’t use the “two extruders” hack to keep the MMU synchronised with the extruder any more, so this should not materialise in the future.

A faster Pi helps mask this race condition in v3, btw.