Working but absolutely unstable

In general I am really happy with Klipper, but I have not been able to make it reliable enough to be usable. I am new to Klipper, though I have built several DIY machines as a hobby: a 3D printer, drawing bots and CNC machines. I’ve used several boards and programs and even built my own.

I started with the simplest setup: an RPi 3B and an Arduino UNO with a Protoneer shield. I have used this hardware reliably with Grbl 0.9 and 1.1 for several months. I installed the 3 stepper drivers (TMC2208) and connected the motors.
I followed the installation instructions, installed a fresh OctoPrint, added my Klipper “test setup” configuration, and after a couple of changes it was up and running. Up to this point, just awesome…
I ran some tests sending G-code through the OctoPrint terminal… Impressive how fast and smoothly the motors move. No sound whatsoever. It can move at up to F15000 without issues! Wow!

Then… I tried to do a print… I have several tests that I use as base drawings for my drawing bot. The G-code contains only G0, G1, G2 and G3 commands, but the files are long and each one does a complete drawing. I started with a small one that takes around 15 minutes. And my problems started…

Note: Skip to the end if you like, because from here on it is just a list of issues and bla bla bla.

My original config only had the steppers, and it seems that when the input ports are left disconnected and unused, the print fails with the error:
a) MCU ‘mcu’ shutdown: Move queue overflow
So I searched around and changed all unused input pins that are not connected to ^!ar##
This enables the pull-up and inverts the logical state, so the input always reads a defined logical level (endstops, for example).
Then, in the OctoPrint terminal:
->restart → FIRMWARE_RESTART-> and wait until status is ready

b) after that:
Lost communication with MCU ‘mcu’
So I found the following in the forums:

These “Lost communication with MCU” errors are typically due to some kind of system level or hardware event. The host software stopped receiving any data from the micro-controller for an extended period of time (5+ seconds). Things to look at would be the usb cables, the voltage levels (in particular, make sure the rpi power adapter is powerful enough), electronic noise (eg, check for proper grounding), etc.

Another thing to look at is the system logs in /var/log/ - there may have been a system event that occurred at that time.

I have now tried 4 different USB cables and 3 different power supplies (capable of 2.1A). When I ran dmesg over ssh it did not show any issue, but in the middle of the print it seems to trigger the [ 9.357814] Under-voltage detected! (0x00050005) message.
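For reference, the hex value in that message can be broken down into individual flags. The sketch below assumes the value uses the same bit layout that the Raspberry Pi documentation gives for `vcgencmd get_throttled`; the function name `decode_throttled` is made up for this example.

```python
# Bit positions as documented for `vcgencmd get_throttled`.
# Assumption: the value printed in the kernel message uses the same layout.
THROTTLED_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(value):
    """Return the names of all flags that are set in `value`, lowest bit first."""
    return [
        name
        for bit, name in sorted(THROTTLED_FLAGS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # Value copied from the dmesg line quoted above.
    for flag in decode_throttled(0x00050005):
        print(flag)
```

Under that assumption, 0x00050005 means the RPi is both under-voltage and throttled at the moment of the message, and that both conditions have been seen since boot.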

I thought the Arduino was drawing its power from the RPi, so I separated the power supplies. The Arduino has an 18V power supply for the motors and an independent 9V power supply for the board, and the RPi was connected to the most powerful USB charger I had at home.

In the middle of the program the under-voltage appeared again, but this time I noticed it because OctoPrint showed it with an icon. So I connected an externally powered USB hub between the RPi and the Arduino.
Finally I used the hub to power the RPi and a short direct cable to the Arduino, and the under-voltage no longer appeared. But now each time I get an error, it has a different message.
I think they come from losing communication in the middle.

c) MCU ‘mcu’ shutdown: Rescheduled timer in the past
Recv: // This generally occurs when the micro-controller has been
Recv: // requested to step at a rate higher than it is capable of
Recv: // obtaining.
Recv: // Once the underlying issue is corrected, use the
Recv: // “FIRMWARE_RESTART” command to reset the firmware, reload the
Recv: // config, and restart the host software.
Recv: // Printer is shutdown
Recv: !! MCU ‘mcu’ shutdown: Rescheduled timer in the past

d) MCU ‘mcu’ shutdown: i2c timeout
I do not even have an i2c device connected! But apparently the error occurs because there are floating pins, and setting those pins as fixed outputs with value 1 should correct it.
I added this to my config:

[static_digital_output my_output_pins]
pins: ar13, analog0,analog1,analog2,analog4,analog5
#   A comma separated list of pins to be set as GPIO output pins. The
#   pin will be set to a high level unless the pin name is prefaced
#   with "!". This parameter must be provided.

e) MCU ‘mcu’ shutdown: Watchdog timer!
f) MCU ‘mcu’ shutdown: Missed scheduling of next digital out event
g) one with AVR thermal issue but I did not record the line in my log

printer.cfg (35.0 KB)

In conclusion: technically it is working, but…
-It keeps disconnecting, always after several minutes of running. (dmesg shows no under-voltage issue until the disconnection.)
-It gives a different error each time, most commonly Lost communication with MCU 'mcu'.
-So I have not been able to run any file longer than 15 min to completion, and this is too unreliable.

  • There is no recovery from any error, so you have to restart the print from the beginning, because neither OctoPrint nor Klipper reports the last state, line of code or position before the error, and the terminal keeps getting spammed with temperature responses, so if you are not there, the information is lost.

So far I’ve tried:
-Using different power supplies (4 different ones, including the USB hub).
-Using the hub between the Arduino and the RPi.
-Running only the Arduino and the RPi (no motors, no cables, nothing around).
-Changing the USB cable. I tried 4 different types, with the best result from the first one I had.
-Using a separate power supply for each component.
-Changing the configuration to set all pins static except the ones used for driving the motors.

Does anyone have an idea what else I can do, or why it might be losing communication?

I have other RPi 3, Zero and RPi 4 boards available if the issue could be the board (I doubt it), and other Arduino UNO boards too. I also have an SKRmini v2 and an SKR1.4, but they are fully functional and configured with Marlin, and it would take too much effort to bring them back to that state.

It’s very difficult to diagnose anything without a log file.

That said, looking at your printer.cfg file, your max_velocity settings are too high for that board, especially the Z axis. That’s likely the cause of the rescheduled timer errors.

Ouch. Given the wide variety of errors it seems the micro-controller code is not functioning correctly. What chip is on the board and what compile settings (in “make menuconfig”) did you use when flashing it?

As indicated, we really need the full Klipper log file attached here to provide meaningful assistance.

Cheers,
-Kevin

Thank you for the responses :slight_smile:
There were too many log files, so I deleted all the ones from before the system was running.
Here are the rest, in order of creation:

work_directory/klippy.log.config0001.cfg
work_directory/klippy.log.shutdown37234
work_directory/klippy.log.shutdown68151
work_directory/klippy.log.shutdown00582
work_directory/klippy.log.config0002.cfg
work_directory/klippy.log.config0003.cfg
work_directory/klippy.log.gcode37234
work_directory/klippy.log.gcode68151
work_directory/klippy.log.gcode29765
work_directory/klippy.log.shutdown39130
work_directory/klippy.log.shutdown65817
work_directory/klippy.log.gcode66571
work_directory/klippy.log.shutdown66571
work_directory/klippy.log.gcode00582
work_directory/klippy.log.gcode39130
work_directory/klippy.log.shutdown29765

klippy_logs.zip (106.4 KB)

I do not think the Z speed is too high, since it works for 15 minutes before shutting down. Anyway, in case it helps, what max velocity do you suggest I use instead?

The under-voltage symptoms appear while the system is running.
dmesg_last.zip (7.9 KB)

here the chip

Thanks in advance :slight_smile:

According to your dmesg output, you are suffering from two RPi hardware issues that are most likely big contributors to spoiling your fun:

Undervoltage
In my home I’m running 8 RPis from Gen 1 to Gen 3 for various tasks. Each and every time I had undervolt warnings it was related to either the power-supply or the cable between power-supply and RPi. You need to solve this.

ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped
This is an error from your Linux system pointing to a problem between RPi and the USB chip on your board. Mostly reported with FTDI Chips.

Possible causes:

  • Flaky USB cable or port
  • Driver / RPi firmware issue

Possible solutions:

  • Change USB cable and / or USB port
  • Update your system
sudo apt update
sudo apt upgrade -y
sudo reboot

after the reboot:

sudo rpi-eeprom-update -a
sudo reboot
  • Update RPi firmwares via sudo rpi-update. WARNING: This command is initially only intended for Raspberry Pi OS. On other distributions it can kill your system. In any case it will push you to the bleeding edge firmwares / kernel so be careful and have a backup.

I believe you have 200 in your config, and you appear to be using a leadscrew driven Z axis with a rotation_distance of 8. To achieve 200mm/s it would require 80k steps/s and the motor would need to run at 1500rpm. Neither of those is likely to be possible. I would limit the max_z_velocity to 25.
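The arithmetic behind those numbers can be sketched as follows. The 200 full steps per rotation and 16 microsteps are assumptions, not values taken from the posted config; they are chosen because they reproduce the 80k steps/s figure. The function name `z_requirements` is made up for this example.

```python
def z_requirements(velocity_mm_s, rotation_distance_mm,
                   full_steps_per_rotation=200, microsteps=16):
    """Return (steps per second, motor rpm) needed for a given axis speed.

    full_steps_per_rotation and microsteps are assumed defaults, not
    values read from the posted printer.cfg.
    """
    # rotation_distance is the travel in mm per full motor revolution.
    rev_per_s = velocity_mm_s / rotation_distance_mm
    steps_per_s = rev_per_s * full_steps_per_rotation * microsteps
    rpm = rev_per_s * 60
    return steps_per_s, rpm


if __name__ == "__main__":
    print(z_requirements(200, 8))  # (80000.0, 1500.0)
    print(z_requirements(25, 8))   # (10000.0, 187.5)
```

With the same assumptions, the suggested limit of 25 mm/s needs only about 10k steps/s and under 200 rpm.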

Thank you jakep_82. Somehow I was expecting that the max velocity would not override the maximum speed that follows from the processor’s clock speed. I thought that setting was specifically there to set a limit in case the motor ran too fast with too little torque, or skipped steps when stepping too fast. Anyway, the G-code sets the feed rate with the F parameter, and if it is higher than the specified maximum speed it will be limited to that. :slight_smile: So I was not expecting it to move that fast, but I did not want to limit it to a value until I had tried it on the motor to see how it responds.
I set it to 25, and had the same error. I will now try what Sineos suggested.
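As a rough illustration of the capping described here: the F parameter is in mm/min, while the velocity limits in the config are in mm/s. This is a simplified model that ignores acceleration and cornering limits, and the function name `effective_speed_mm_s` is made up for this example.

```python
def effective_speed_mm_s(feedrate_mm_min, max_velocity_mm_s):
    """Convert a G-code F value (mm/min) to mm/s and cap it at the limit.

    Simplified: acceleration and cornering limits are not modelled.
    """
    requested_mm_s = feedrate_mm_min / 60.0
    return min(requested_mm_s, max_velocity_mm_s)


if __name__ == "__main__":
    # F15000 is a request for 250 mm/s.
    print(effective_speed_mm_s(15000, 200))  # capped at 200
    print(effective_speed_mm_s(15000, 25))   # capped at 25
    print(effective_speed_mm_s(600, 25))     # 10.0, below the cap
```

So a limit of 200 on a leadscrew axis does not cap much: a move at F15000 would still be attempted at 200 mm/s.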

I did the update, but this command failed:

pi@octopipi:~ $ sudo rpi-eeprom-update -a
[sudo] password for pi:
This tool only works with a Raspberry Pi 4

Apologies, I was under the impression you are using an RPi4.
This is correct. RPi 3 and earlier do not have an EEPROM to store bootloader and other relevant boot data. You can skip this step with no adverse effect.

If you want to stress test your RPi to check for power supply / voltage problems you can use the following command on the console:

for i in 1 2 3 4 ; do nice -n 20 openssl speed >/dev/null 2>&1 & done

Monitor dmesg for Voltage errors. Stop the test with:

for pid in $( jobs -p ) ; do kill -9 $pid ; done
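Watching dmesg by hand during a long test is tedious, so here is a minimal sketch of a filter that picks out the two kinds of kernel messages discussed in this thread. The patterns are based only on the messages quoted above and may not match the wording used by other kernel versions; `scan_kernel_log` and the labels are made-up names.

```python
import re

# Patterns derived from the two messages quoted in this thread.
# Assumption: other kernel versions may word these messages differently.
PATTERNS = {
    "under-voltage": re.compile(r"under-?voltage", re.IGNORECASE),
    "usb-serial": re.compile(r"urb stopped", re.IGNORECASE),
}


def scan_kernel_log(lines):
    """Return (label, line) pairs for every line matching a known pattern."""
    hits = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((label, line.strip()))
    return hits


if __name__ == "__main__":
    sample = [
        "[ 9.357814] Under-voltage detected! (0x00050005)",
        "ttyUSB0: usb_serial_generic_read_bulk_callback - urb stopped",
    ]
    for label, line in scan_kernel_log(sample):
        print(label, "|", line)
```

To use it on a live system, the same function could be given the lines of saved dmesg output instead of the sample list.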