Error in syncemitter 'extruder' step generation

Basic Information:

Printer Model: Voron Trident 300
MCU / Printerboard: Octopus Pro v1
Host / SBC: RPI 4
klippy.log

klippy.zip (919.0 KB)

My printer was working fine yesterday, successfully finished a print, but today when trying to print I started to get:

b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
b"Error in syncemitter 'extruder' step generation"
Exception in flush_handler
Traceback (most recent call last):
File "/home/pi/klipper/klippy/extras/motion_queuing.py", line 198, in _flush_handler
self._advance_flush_time(0., want_sg_time)
File "/home/pi/klipper/klippy/extras/motion_queuing.py", line 156, in _advance_flush_time
raise self.mcu.error("Internal error in stepcompress")
mcu.error: Internal error in stepcompress
Transition to shutdown state: Exception in flush_handler

very consistently (tried 8 times and it fails the same way every time).
Nothing has changed between yesterday and today.
Tried around 4 times and each time the error is the same and happens right when the toolhead starts extruding.
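
For anyone triaging a similar log, here is a rough Python sketch that pulls the stepcompress error lines out of klippy.log text and splits the fields. It assumes that o/i/c/a stand for oid, interval, count and add, which is my reading of Klipper's stepcompress code and not something confirmed in this thread. `parse_stepcompress_errors` is a made-up helper name, not part of Klipper.

```python
import re

# Matches lines such as:
#   b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'
STEPCOMPRESS_RE = re.compile(
    r"stepcompress o=(-?\d+) i=(-?\d+) c=(-?\d+) a=(-?\d+): (.+?)['\"]?\s*$"
)


def parse_stepcompress_errors(lines):
    """Return one dict per stepcompress error line found in `lines`.

    Field names (oid, interval, count, add) are an assumed mapping of
    the o/i/c/a keys in the log message.
    """
    errors = []
    for line in lines:
        match = STEPCOMPRESS_RE.search(line)
        if match is None:
            continue  # not a stepcompress error line
        oid, interval, count, add = (int(g) for g in match.groups()[:4])
        errors.append({
            "oid": oid,
            "interval": interval,
            "count": count,
            "add": add,
            "message": match.group(5),
        })
    return errors


# Sample lines copied from the log excerpt above.
sample = [
    "b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'",
    "b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'",
    "b'stepcompress o=0 i=0 c=10 a=0: Invalid sequence'",
    "b\"Error in syncemitter 'extruder' step generation\"",
]
errors = parse_stepcompress_errors(sample)
print(len(errors), errors[0])
```

To run it against a real log, pass the open file instead of `sample`, e.g. `parse_stepcompress_errors(open("klippy.log", errors="replace"))`.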

I replaced my EBB36 v1.2 board (I had a spare handy) and the print failed at the exact same place.
I’m at a loss here, since nothing has changed since yesterday.

Any help would be appreciated!

A quick look at your logs makes me think your issue is in the Cartographer and/or Happy Hare routines or hardware. Both are known to break on Klipper upgrades.

You may want to reach out to them. Let us know if they are no help.

Good luck

Thanks for taking a look!
I haven’t upgraded anything in around a month and I’ve printed tens of hours with the current configuration and software versions.
It worked yesterday evening and stopped working today when trying to reprint the same file that worked yesterday.

What makes you think it’s cartographer or HH? Because the stack trace originates from extras folder?

Hi @Leonti ,

My guess is it’s HH since I’m not aware of Carto messing with the extruder. One way you could test would be to comment out the [include mmu/...] lines at the top of your printer.cfg, manually load filament to your toolhead, and run a test print.

If it succeeds, then the issue is likely in HH

If it fails, the issue is likely in Klipper but to be sure I would:

  • Download your mmu config folder somewhere to back it up
  • Run in SSH (this will uninstall HH):
cd ~/Happy-Hare
./install.sh -d
  • Run a test print again
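
If you would rather script the backup step than copy the folder by hand, something like the sketch below should do it. The source path `~/printer_data/config/mmu` and the destination `~/mmu_backups` are assumptions about a typical install, so adjust them to wherever your mmu folder actually lives. `backup_config_dir` is just an illustrative helper name.

```python
import shutil
import time
from pathlib import Path


def backup_config_dir(src, dest_root):
    """Copy the directory `src` into `dest_root` under a timestamped name.

    Returns the path of the new backup directory.
    """
    src = Path(src).expanduser()
    dest_root = Path(dest_root).expanduser()
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_root / f"{src.name}-backup-{stamp}"
    # copytree creates dest (and any missing parents) and refuses to
    # overwrite an existing directory, so an old backup is never clobbered.
    shutil.copytree(src, dest)
    return dest
```

Example call, with both paths being assumptions: `backup_config_dir("~/printer_data/config/mmu", "~/mmu_backups")`. Run it before the uninstall step.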

I’ve had VERY mixed results using AI on klipper problems. I normally make it cite sources and half the time they are COMPLETE fabrications. Last time I tried it cited 2 git repos… both came back 404. Use the following at your own risk.

I asked AI (Claude) to summarize your shutdown logs. It barfed out the following:

Per-Shutdown Context

Log 1: klippy.shutdown03782 (~1167 seconds)

  • Print duration: ~1182 seconds (19.7 min)
  • Last activity: Cartographer home sequence ([cartographer] Touch 3: 0.1366)
  • Filament: At ~8.7g extruded
  • Temperatures: Bed 70°C (target), extruder 240°C, cartographer coil 33°C
  • Serial stats: 0 retransmits on main/MMU/Cartographer; 9 retransmits on main CAN
  • Notes: Stopped mid-skirt/brim phase after successful bed mesh calibration

Log 2: klippy.shutdown14907 (~490 seconds)

  • Print duration: ~501 seconds (8.35 min)
  • Last activity: TMC UART register read (extruder driver SPI query)
  • Filament: ~6.9g extruded
  • Temperatures: Bed 70°C, extruder 240°C, cartographer 52°C
  • Serial stats: 9 retransmits on main CAN; clean on others
  • Notes: Very short print, terminated early in skirt phase

Log 3: klippy.shutdown23686 (~469 seconds)

  • Print duration: ~479 seconds (7.98 min)
  • Last activity: Immediate; no TMC reads in final burst
  • Filament: ~6.9g extruded
  • Temperatures: Bed 70°C, extruder 240°C, cartographer 53°C
  • Serial stats: Clean on all buses except main (9 retransmits)
  • Notes: Earliest of the “short” failures; stopped at ~0.2mm layer height

Log 4: klippy.shutdown34894 (~538 seconds)

  • Print duration: ~548 seconds (9.13 min)
  • Last activity: get_clock exchange across all MCUs
  • Filament: ~8.6g extruded
  • Temperatures: Bed 80°C, extruder 220°C, cartographer 55.5°C
  • Serial stats: 0 retransmits (clean session)
  • Notes: Different bed temp (80°C) and lower extruder temp (220°C); still fails identically

Log 5: klippy.shutdown44127 (~839 seconds)

  • Print duration: ~849 seconds (14.15 min)
  • Last activity: TMC UART exchanges (extensive motor driver diagnostics)
  • Filament: ~3.7g extruded
  • Temperatures: Bed 80°C, extruder 220°C, cartographer 53.4°C
  • Serial stats: 9 retransmits on main CAN; otherwise clean
  • Notes: Longest run before failure; similar motion pattern to logs 2–3

Unified Analysis

Why It Fails

The error occurs in _advance_flush_time(), which manages the step compression buffer that queues motion commands to the MCU. When Klipper’s host-side motion planning gets ahead of (or falls behind) the MCU’s ability to process step commands, the buffer state becomes inconsistent. The raise self.mcu.error() is a safety fence—it stops everything rather than queue corrupted steps.

Common Precursors Across All 5

  1. All occur early in print (490–1167 seconds, mostly <600s)
  2. All involve active motion: Either homing/probing, skirt/brim, or both
  3. All show stable clock sync at shutdown (min_half_rtt ~0.000066–0.000164 s)
  4. All show clean I/O on non-CAN buses (zero retransmits on main MCU serial, except main reports 9)
  5. All have active Cartographer streaming (mcu_awake & mcu_task_avg show activity)

Hypotheses

Hypothesis 1: Motion Planner Starvation (Most Likely)

The motion queue is being drained faster than the host can refill it, causing _advance_flush_time() to detect a gap. Possible triggers:

  • Cartographer stream overload: The probe is consuming CPU cycles during homing, delaying motion refills.
  • Triple CAN bus contention: EBB + MMU + Cartographer all on the same CAN line, creating jitter in clock sync feedback loops.

Why This System Is Vulnerable

1. Cartographer Streaming Overhead (Primary)

The Cartographer probe streams sensor data continuously during homing. Unlike mechanical probes (silent), Cartographer generates CAN traffic:

[cartographer] Starting stream
[cartographer] Touch 1: 0.1285
[cartographer] Touch 2: 0.1285
[cartographer] Touch 3: 0.1366
[cartographer] Stopping stream

Each event = CAN bus interrupt. During the 900-point mesh, this creates 900 CAN events while the motion planner tries to transition to skirt/brim.

2. Single Extruder on CAN (Not Ideal)

Although there’s only one extruder, it’s on the EBB36 via CAN. The Octopus Pro v1 isn’t talking directly to the stepper driver; it’s going through CAN to EBB. This adds:

  • CAN latency (jitter in clock sync)
  • Serialization (EBB can only process one command at a time)
  • No priority buffering (Cartographer telemetry and extruder commands share bandwidth)

3. 30×30 Mesh Is Aggressive for This Setup

  • 900 probe cycles = 900 × ~500ms = 7.5 minutes of continuous Cartographer streaming
  • Each probe triggers Z-motor moves (stepper_z, z1, z2)
  • Post-probe, the mesh data is bicubic-interpolated (CPU-heavy on Pi4)
  • Then immediately skirt/brim starts (high motion frequency)

Perfect storm: Mesh post-processing still draining CPU when XY motion begins.
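
A quick check of the arithmetic in that mesh estimate. The 30 points per side and the ~500 ms per probe are the summary's own figures, not measurements, so this only confirms the multiplication and says nothing about whether the inputs are right.

```python
# Figures taken from the summary above; both are unverified assumptions.
points_per_side = 30
probe_time_s = 0.5  # ~500 ms per probe cycle

total_points = points_per_side ** 2
total_minutes = total_points * probe_time_s / 60

print(total_points, total_minutes)  # prints: 900 7.5
```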

4. RPi 4 CPU Headroom

Klipper’s motion planner is single-threaded. When:

  • Cartographer driver processes telemetry
  • Bicubic mesh interpolation runs (matrix math)
  • MMU heartbeat polls
  • Moonraker/web UI runs in background

The motion planner gets context-switched out. Motion queue drains faster than it refills → stepcompress fails.

I think it is related to Happy Hare.

I was printing those prints with MMU_ENABLE = 0, once I went back to MMU_ENABLE = 1 I haven’t had this problem since (5 prints already).

I’m still confused though, since I did a print yesterday with MMU_ENABLE = 0 and it was successful. Same GCODE the next day didn’t work.
My only theory is that HH loads something it needs in memory with MMU_ENABLE = 1 so when I switched to MMU_ENABLE = 0 it was still there and the print worked.
On the next day I started my printer fresh, so whatever was loaded with MMU_ENABLE = 1 was now gone.