Mcu timeout during servo move

I’ve just added a servo to my V-Core 3 (BTT Octopus v1.1, Huvud toolboard with CAN) and have started getting shutdowns of the Octopus board, where the servo’s attached. I’m typically able to control the servo as expected, by setting the angle, but I’ve observed instances – occasionally when attempting to set a negative angle, but not always – where the mcu shuts down while commanding the servo move. This evening while attempting a first print after adding @protoloft’s z_calibration extra I’ve consistently got the mcu shutting down in the last stages of my START_PRINT macro. In this case the servo is not stopping at the commanded position (4°) and is in fact shooting clockwise over 0° to more like -45°.

Log attached. The crash happens during the execution of CALIBRATE_Z in START_PRINT, while the DEPLOY_PROBE macro is being invoked by z_calibration.before_switch_gcode. I’ll play around with it more tomorrow.
klippy.log.zip (372.2 KB)

You need to check your log:

probe: TRIGGERED
Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)
Traceback (most recent call last):
  File "/home/pi/klipper/klippy/extras/gcode_macro.py", line 61, in render
    return str(self.template.render(context))
  File "/home/pi/klippy-env/local/lib/python2.7/site-packages/jinja2/environment.py", line 1090, in render
    self.environment.handle_exception()
  File "/home/pi/klippy-env/local/lib/python2.7/site-packages/jinja2/environment.py", line 832, in handle_exception
    reraise(*rewrite_traceback_stack(source=source))
  File "<template>", line 4, in top-level template code
  File "/home/pi/klipper/klippy/extras/gcode_macro.py", line 89, in _action_raise_error
    raise self.printer.command_error(msg)
CommandError: expected probe state to be deployed but is stowed (1)
Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)

Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)

Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)

Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)

Error evaluating 'gcode_macro _ASSERT_PROBE_STATE:gcode': CommandError: expected probe state to be deployed but is stowed (1)

Exiting SD card print (position 43344)
Stats 388917.2: gcodein=0 mcu: mcu_awake=0.006 mcu_task_avg=0.000007 mcu_task_stddev=0.000006 bytes_write=376612 bytes_read=320971 bytes_retransmit=9 bytes_invalid=0 send_seq=16643 receive_seq=16643 retransmit_seq=2 srtt=0.000 rttvar=0.000 rto=0.025 ready_bytes=0 stalled_bytes=264 freq=180001837 huvud: mcu_awake=0.001 mcu_task_avg=0.000009 mcu_task_stddev=0.000008 bytes_write=97859 bytes_read=183447 bytes_retransmit=0 bytes_invalid=0 send_seq=8203 receive_seq=8203 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 ready_bytes=0 stalled_bytes=0 freq=72000749 adj=71999957 Octopus: temp=31.9 raspberry_pi: temp=36.5  heater_bed: target=60 temp=59.9 pwm=0.246 Huvud: temp=33.7 chamber: temp=26.8 motor_l: temp=31.2 motor_r: temp=32.0 sysload=0.18 cputime=1316.115 memavail=1597848 print_time=297.673 buffer_time=0.000 print_stall=0 extruder: target=120 temp=119.9 pwm=0.251
Stats 388918.2: gcodein=0 mcu: mcu_awake=0.006 mcu_task_avg=0.000007 mcu_task_stddev=0.000006 bytes_write=376612 bytes_read=320971 bytes_retransmit=9 bytes_invalid=0 send_seq=16643 receive_seq=16643 retransmit_seq=2 srtt=0.000 rttvar=0.000 rto=0.025 ready_bytes=0 stalled_bytes=265 freq=180001837 huvud: mcu_awake=0.001 mcu_task_avg=0.000009 mcu_task_stddev=0.000008 bytes_write=97865 bytes_read=183568 bytes_retransmit=0 bytes_invalid=0 send_seq=8204 receive_seq=8204 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 ready_bytes=0 stalled_bytes=0 freq=72000748 adj=71999923 Octopus: temp=31.9 raspberry_pi: temp=36.0  heater_bed: target=60 temp=59.9 pwm=0.246 Huvud: temp=33.5 chamber: temp=26.8 motor_l: temp=31.2 motor_r: temp=32.0 sysload=0.18 cputime=1316.136 memavail=1597848 print_time=297.673 buffer_time=0.000 print_stall=0 extruder: target=120 temp=119.9 pwm=0.251
Timeout with MCU 'mcu' (eventtime=388919.191971)
Transition to shutdown state: Lost communication with MCU 'mcu'

Seems rather macro related than servo related.
Further diagnose would require the same error when using an unmodified Klipper version.

I have checked my log. That macro has been in use for months, long before I added the custom extra. The timeout occurs before the toolhead has even moved into the position where the assertion in the macro is checked, so the mcu has already crashed at that point.

I will reproduce without the extra.

My interpretation:

  • The macro is erroring out (for whatever reason). Can be clearly seen in the error message and the traceback
  • This error is killing the MCU
  • Klipper is reporting a MCU timeout as consequence
  • There is no indication whatsoever relating to a servo

action_raise_error is what’s causing the macro to error out in the previous case; should that be killing the mcu?

Just reproduced the shutdown merely by sending SET_SERVO commands. I removed the extra from the klipper source tree, and removed the config entry. Restarted klipper. Started sending SET_SERVO commands.

Note the console output is bottom-up. I can’t tell if the SET_SERVO SERVO=euclid width=0 command was processed before the error.

klippy.log (206.9 KB)

I don’t see anything in the logs that looks like an error.

Indeed, quite strange but log is still from a modified Klipper version.

Can you explain, please?

You are using extra modules which do not belong to the official Klipper main line.
No developer will consider looking at modified sources that have an unknown effect on overall stability / functionality.

I reproduced the mcu crash, although through brute force.

No extras, clean tree:

pi@vcore3:~/ksrc $ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
pi@vcore3:~/ksrc $ git rev-parse HEAD
24a1b50e512f3038a51060e3cc8ce2a847b9fafa
pi@vcore3:~/ksrc $ git ls-files --other | grep -E -v '^out/|\.pyc$'
.config
.config.old
klippy/chelper/c_helper.so
lib/hidflash/hid-flash
lib/hidflash/hid-libusb.o
lib/hidflash/main.o
lib/hidflash/rs232.o

macro to exercise the servo:

[gcode_macro servo_loop]
gcode:
    M117 start loop
    {% for _ in range(0, 10) %}
        M117 reset sweep
        SET_SERVO SERVO=euclid ANGLE=90
        G4 P750

        {% for a in range(85, 0, -5) %}
            M117 moving to { a }
            SET_SERVO SERVO=euclid ANGLE={ a }
            G4 P750
        {% endfor %}
    {% endfor %}
    M117 end loop

The log: klippy.log (204.8 KB)

The mcu just seems to die. Is there any debugging I can enable on the mcu side that will provide more info?

Strange indeed. Seems to work for a while and then error out. Have you tried a longer delay in between the steps?
Otherwise no further idea, how to debug this. Maybe @koconnor could jump in.

I’ve tried longer and shorter delays with this loop. Sometimes it’ll run for a few complete sweeps before crashing, as it did here, other times it crashes after 5-6 steps.

I’d be sure to follow the directions at: Frequently Asked Questions - Klipper documentation

Since this seems to be happening when moving the servo, be sure to look for frayed / crimped wires on, near, or in the servo. A good test would be to physically disconnect the servo and see if the commands still cause a failure.

-Kevin

Interestingly, with the servo disconnected I’m not able to reproduce the problem, yet. How could that be? The servo’s just PWM-controlled, right? Why would it being connected or not affect the MCU?

As soon as I reconnect the servo I’m able to reproduce the crash. Disconnecting it and the loop completes. If I increase the time between servo moves to 3000ms the loop will run to completion, but even as low as 2000ms and the mcu crashes. :thinking:

[time passes]

And switching to a different servo with a much longer cable (the one I was originally using when I discovered this problem), I can reproduce the crash with a delay as long as 5s. 6s seems to make it more reliable, but sometimes the servo doesn’t respond to a commanded move for some time (> 1s or more).

[more time passes]

Even with a 6s delay the mcu eventually crashed with the long-cabled servo attached.

Indeed interesting. The servo has no back-channel, so no information are transferred back to the MCU. Basically it is just an electrical load. Tried a different servo?

IIRC not really PWM. Servos operate with pulses between 1ms and 2ms with ~1.5ms is the neutral position. So the 90° working range are positioned with the length of the pulse. If I’m not mistaken, this does not need a PWM pin on the MCU, basically every GPIO can drive a servo.

I just edited my last post. Yes, I’ve tried two servos, with vastly different cable lengths. The longer one is more flaky, but even the short-cabled servo is problematic.