Klipper Continues to Shut Down Mid-Print *Tears Remaining Hair Out*

Hi All,

I have been STRUGGLING for weeks now and my rpi is constantly shutting down mid print. I feel like I’ve tried most obvious things and I’m all out of ideas so I felt now is an appropriate time to reach out for help.

My Setup

  • voron v0.1 with AC bed
  • pi 4 2gb

What have I tried so far?

  • reflashed mainsail 0.40 a couple times
  • 3 or 4 different usb cables for communication to SKR 2.0
    *currently communicating to SKR 2.0 over UART
  • prusa slicer and super slicer
  • upgraded printer PSU from 100w to 150w (meanwell)
  • multiple different charging bricks but currently running a buck converter
  • taped off the 5v power on the usb so the board is not powered by the USB and vice versa

I recently had a print failure so I quickly rebooted the pi and pulled the klippy logs, all pi logs and the gcode

If anyone has any ideas, I would love to hear them

After powering the pi back up, this is the error I see in Mainsail. I’m not sure whether this is to be expected or not.

mainsail after repowering pi

I see the following error in daemon.log, which makes me wonder if it starts at all:

Jun 24 01:41:55 mainsailos python[736]: [moonraker.py:_check_ready()] -
Jun 24 01:41:55 mainsailos python[736]: Option 'serial' in section 'mcu' must be specified
Jun 24 01:41:55 mainsailos python[736]: Once the underlying issue is corrected, use the "RESTART"
Jun 24 01:41:55 mainsailos python[736]: command to reload the config and restart the host software.
Jun 24 01:41:55 mainsailos python[736]: Printer is halted

When you say it stops “mid print”, can you add some more details about when you start it and when you see it shut down? Ideally a log of actions (“start print at x:xx:xx” etc.) so they can be correlated with any logs. If I read the logs correctly, they suggest the MCU and klippy don’t communicate at all, in which case the printer wouldn’t be able to actually start a job. Is that the case, or are these logs from after the print failure?

If it does start the print, does it shut down right away?
On switching power supplies, there’s sometimes a voltage selector. If you are in a roughly 120V region (e.g. North America) and your power supply is set to 230V (possibly the default setting on meanwell power supplies), then the printer seemingly works, right up until you try to draw serious power (e.g. heatbed, hotend heater).
Have you double checked your power supply voltage?

Do you have fresh firmware on the MCU? Run make menuconfig to double check your selection, then make clean and make, and then make sure the freshly built firmware is actually on your board (and not something that’s lagging behind the klipper version).

Update to config: about an hour ago I reconfigured the firmware to use UART5 via the rx/tx pins on the TFT header, then reflashed the SKR 2.0 and rewired the RPi. The SKR connects to the RPi4 and shows READY. I have tried multiple prints; one has finished.

Power Analysis

  • power supply is set to 115v mode
  • Vout to mainboard is 24.20v (while printer is idle)
  • Vout to pi is 5.09v (while printer is idle)

Detailed description of error
The print always begins without issue, but the RPi will usually shut down at a random point while printing. Upon powering back up, mainsail shows the error in Post #2 of this thread.

12:46 Powered up the Rpi
12:47 Started a new print
12:52 up to temp and Print began
1:27 Print finished
1:33 Started a second print
1:35 Print crashed
1:40 Power up RPi again to pull logs

LOGS

If possible I would start by not using buck converters and use individual power bricks. I know this isn’t as nice but this could help eliminate any ground shorts or issues with shielding.

Let’s assume you are able to complete prints after removing the buck converters, then add each one back in one at a time completing 2 or more prints after adding each one back into the system. Ideally, if it is a buck converter issue, then you will be able to isolate which one is causing the issue.

Another possibility is that the SKR 2 is overheating. It’s unlikely, but try opening the case and letting a fan blow directly on the board. Similarly for the drivers: they should have their heatsinks attached.

This suggests a reboot, not so much a “crash”:

Jun 28 13:34:58 mainsailos systemd[1]: Stopped target Bluetooth.
Jun 28 13:34:58 mainsailos bluetoothd[680]: Terminating
Jun 28 13:34:58 mainsailos systemd[1]: Stopping Bluetooth service...
Jun 28 13:34:58 mainsailos systemd[1]: Stopped target Sound Card.
Jun 28 13:34:58 mainsailos bluetoothd[680]: Stopping SDP server
Jun 28 13:34:58 mainsailos bluetoothd[680]: Exit
Jun 28 13:34:58 mainsailos systemd[1]: Stopped target Graphical Interface.
Jun 28 13:34:58 mainsailos systemd[1]: Stopped target Multi-User System.
Jun 28 13:34:58 mainsailos systemd[1]: Stopping LSB: Klipper daemon...

OS start sequence a few seconds later:

Jun 28 13:35:04 mainsailos fake-hwclock[109]: Mon 28 Jun 17:35:00 UTC 2021
Jun 28 13:35:04 mainsailos systemd-fsck[132]: e2fsck 1.44.5 (15-Dec-2018)
Jun 28 13:35:04 mainsailos systemd-fsck[132]: rootfs: clean, 72975/462384 files, 1029021/1852416 blocks
Jun 28 13:35:04 mainsailos systemd[1]: Started File System Check on Root Device.
Jun 28 13:35:04 mainsailos systemd[1]: Starting Remount Root and Kernel File Systems...
Jun 28 13:35:04 mainsailos systemd[1]: Started Set the console keyboard layout.
Jun 28 13:35:04 mainsailos systemd[1]: Started Remount Root and Kernel File Systems.
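
For anyone wanting to run the same check on their own logs, here is a rough sketch of the reasoning above: an orderly shutdown leaves systemd "Stopped target"/"Stopping" lines right before the next boot, whereas a sudden power loss usually leaves nothing. The marker strings are taken from the excerpts above; the function name and the 50-line look-back window are my own assumptions, and the very first boot in a file will always be reported as abrupt because nothing precedes it.

```python
# Markers systemd writes during an orderly shutdown (from the excerpt above).
SHUTDOWN_MARKERS = ("Stopped target", "Stopping ")
# A line seen early in the following boot (also from the excerpt above).
BOOT_MARKER = "Started File System Check on Root Device"


def classify_restarts(lines, window=50):
    """Return one label per boot found in the log lines.

    'clean shutdown' means shutdown markers were logged shortly before the
    boot; 'abrupt' means none were found (e.g. power loss or a hard crash).
    """
    results = []
    last_boot = -1  # index of the previous boot marker, if any
    for i, line in enumerate(lines):
        if BOOT_MARKER not in line:
            continue
        # Only look at lines since the previous boot, limited to the window.
        start = max(last_boot + 1, i - window)
        preceding = lines[start:i]
        clean = any(m in p for p in preceding for m in SHUTDOWN_MARKERS)
        results.append("clean shutdown" if clean else "abrupt")
        last_boot = i
    return results
```

It could be fed with something like `classify_restarts(open("daemon.log").read().splitlines())`, where the file name is whichever log you pulled from the pi.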

I see you have mjpegstreamer enabled but no camera; it shouldn’t do any harm, but I would rule that out either way.

Similarly, I would be surprised if a power problem left indications of a system shutdown in the logs, but is your power supply able to deliver at least 2.5A to the Pi, and are you powering any peripherals through it? Remove anything you don’t really need (LEDs, webcams, etc.). Make sure your SKR is powered through the 24V power input and not USB (check the jumper; it’s probably ok already).

Do you cool your pi? They can run hot, especially in enclosed situations and next to heaters. As an experiment, try just blowing a fan on it directly; overkill is good in this case. Temporarily take it out of the printer if you can.
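
One way to check both the power and the temperature question from the Pi itself is `vcgencmd get_throttled`, which reports under-voltage and throttling flags as a hex bitmask. Below is a small sketch that decodes that output; the function names are made up for illustration, and the bit meanings are the ones I know from the Raspberry Pi documentation, so double check them against your firmware. Note that the "since boot" flags are cleared by a reboot, so this is most useful when run while a print is in progress.

```python
import subprocess

# Bit positions reported by `vcgencmd get_throttled`.
FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active now",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Decode a line such as 'throttled=0x50005' into readable flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [text for bit, text in sorted(FLAGS.items()) if value & (1 << bit)]


def read_throttled():
    """Run vcgencmd on the Pi and return the decoded flags (Pi only)."""
    result = subprocess.run(
        ["vcgencmd", "get_throttled"],
        capture_output=True, text=True, check=True,
    )
    return decode_throttled(result.stdout)
```

An empty list from `read_throttled()` means no power or thermal flags are set at that moment.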

Lastly, when a pi reboots, it cleans out /tmp, which is where the klipper log would have resided, and that log might have a clue about what happened before the reboot. I don’t have a pi near me, but I think /etc/systemd/system/klipper.service specifies the location of the logfile with a -l /tmp/klippy.log or something like it. Temporarily change that location to something not on /tmp so that it survives a reboot, and then try to reproduce the problem.

I figured it out, thank you so much for your help. This problem has literally been plaguing me for months.

I had a realization this morning that I had this problem across two printers, both running a couple of custom scripts. Well, I have a script running which creates a power button. I disabled the script and I have been printing all morning… I haven’t confirmed it yet, but I suspect that EMI from the stepper motors was disturbing the voltage on GPIO3, which is listening for a change from the power button.

I don’t fully understand the script below but this is what I was using.

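#!/usr/bin/env python

import RPi.GPIO as GPIO
import subprocess

# Use Broadcom (BCM) pin numbering, so "3" means GPIO3.
GPIO.setmode(GPIO.BCM)
# Configure GPIO3 as an input with the pull-up enabled (idle state is high).
GPIO.setup(3, GPIO.IN, pull_up_down=GPIO.PUD_UP)
# Block until the pin goes from high to low. There is no debounce here,
# so any falling edge, however brief, ends the wait.
GPIO.wait_for_edge(3, GPIO.FALLING)

# Halt the system as soon as the edge is seen.
subprocess.call(['shutdown', '-h', 'now'], shell=False)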

Instead, I added the following to /boot/config.txt, and I get the same function but no EMI issues:

dtoverlay=gpio-shutdown,gpio_pin=3

To be honest based on what I have seen in the last two years (or so) playing with 3D printing I am shocked that there aren’t more EMI issues plaguing everyone. Glad you figured this out. I had a similar issue with random behaviour from my BL Touch that I finally solved with wire routing, negative (0V) power referencing & config fixes.

Aha, so it was a reboot indeed. Always good when you have issues like this to strip down to the simplest use case.

The gpio-shutdown overlay has a default debounce value of 100 ms (which means that it won’t act unless the switch has been pressed for at least 100 ms). This also means it doesn’t care about spurious and brief EMI noise.
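
If someone does want to keep a Python script rather than the overlay, the same debounce idea can be sketched roughly as below: poll the pin and only act once it has read low continuously for the hold time. This is my own illustration, not what the overlay does internally; `wait_for_held_low` and its parameters are hypothetical names, and the pin read is passed in as a function so the logic can be tried without a Pi.

```python
import time


def wait_for_held_low(read_pin, hold_s=0.1, poll_s=0.01,
                      clock=time.monotonic, sleep=time.sleep):
    """Block until read_pin() has returned 0 continuously for hold_s seconds.

    A short glitch (a low reading followed by a high one) resets the timer,
    so brief noise on the line never triggers a return.
    """
    low_since = None  # time at which the current run of low readings began
    while True:
        if read_pin() == 0:
            now = clock()
            if low_since is None:
                low_since = now
            elif now - low_since >= hold_s:
                return
        else:
            low_since = None  # pin went high again: treat it as noise
        sleep(poll_s)


# On the Pi (untested sketch), the wait_for_edge call in the script above
# could be replaced with:
#   wait_for_held_low(lambda: GPIO.input(3))
# before the existing shutdown call.
```

The default of 0.1 seconds simply mirrors the overlay's 100 ms; a longer hold time would make the button even less sensitive to noise at the cost of a slower response.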