EMI and connection lost error

Basic Information:

Printer Model: CR-10S
MCU / Printerboard: BTT SKR E3 Mini V3.0
klippy.log
moonraker.log (196.0 KB)

Describe your issue:

I’ve been experiencing random lost connections for a month or so. Mid-print, the SBC stops and errors out with "Lost MCU connection ‘mcu’". What I’ve done to diagnose it:

Changed the SBC to a different one (Odroid XU4 → OrangePi 3LTS)
Changed the Linux distro (Debian Bullseye, DietPi 8.2.x)
Changed the USB cable and taped over the 5V pin
Reinstalled/reflashed everything
Lowered the baud rate to 115200
Tested the connection via the Command Dispatch Benchmark
Moved the SBC and rerouted the cables to reduce possible interference

The klippy.log is quite unhelpful. It shows a "Got EOF when reading from device" (line 11954.0) after a small spike in bytes_retransmit (from 9 to 44). While this usually indicates a poor USB cable, I don’t think that’s the true cause here. I’ve swapped in a couple of different cables, with no observable difference.
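For anyone who wants to look for the same pattern, here is a rough sketch of how jumps in bytes_retransmit could be pulled out of a klippy.log. It assumes the usual "Stats <time>: ... bytes_retransmit=<n> ..." line layout and only reads the first MCU block on each line; the function name and the threshold parameter are made up for illustration.

```python
import re

# Assumed layout of a klippy.log statistics line:
#   Stats <time>: gcodein=0 mcu: mcu_awake=... bytes_retransmit=<n> bytes_invalid=0 ...
# The non-greedy match picks up the first bytes_retransmit on the line,
# i.e. the first MCU listed when several are configured.
STATS_RE = re.compile(
    r"^Stats (?P<time>\d+(?:\.\d+)?):.*?\bbytes_retransmit=(?P<retx>\d+)"
)


def retransmit_jumps(lines, threshold=1):
    """Yield (time, previous, current) when bytes_retransmit grows by >= threshold."""
    previous = None
    for line in lines:
        match = STATS_RE.match(line)
        if match is None:
            continue  # not a Stats line, or no retransmit counter on it
        current = int(match.group("retx"))
        if previous is not None and current - previous >= threshold:
            yield float(match.group("time")), previous, current
        # A smaller value than before (counter reset after a restart)
        # simply becomes the new baseline.
        previous = current
```

It could be run over a log with something like `for t, before, after in retransmit_jumps(open("klippy.log", errors="replace")): print(t, before, after)`. A counter that creeps up steadily points more towards cabling or noise, while a single jump right before the EOF looks more like the device dropping off the bus.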

Moonraker and dmesg show nothing relevant (other than the usual USB reconnect line). Interestingly, after an error-induced shutdown, Moonraker has trouble reconnecting to Klipper and hangs on the connecting screen until the SBC is rebooted. I’m running two instances, but I couldn’t assess whether the second printer (undergoing “maintenance”) is affected too. I’m starting to run out of options here.
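To make the USB reconnect lines easier to line up with the time of the failure, the dmesg output can be filtered for the kernel's USB disconnect/enumeration messages. This is only a sketch: the pattern covers a few common kernel message shapes and is not exhaustive, and the helper names are hypothetical.

```python
import re
import subprocess

# A few kernel messages that commonly accompany a USB device dropping
# off the bus and re-enumerating (assumed shapes, not an exhaustive list):
#   usb 1-1.2: USB disconnect, device number 5
#   usb 1-1.2: new full-speed USB device number 6 using xhci_hcd
#   usb 1-1.2: reset full-speed USB device number 6 using xhci_hcd
#   usb 1-1.2: device descriptor read/64, error -71
USB_EVENT_RE = re.compile(
    r"usb \S+: (?:USB disconnect"
    r"|new \S+ USB device number"
    r"|reset \S+ USB device number"
    r"|device descriptor read)"
)


def usb_events(dmesg_text):
    """Return the dmesg lines that look like USB disconnect/reconnect events."""
    return [line for line in dmesg_text.splitlines() if USB_EVENT_RE.search(line)]


def read_dmesg():
    """Capture the current kernel log; may need root on some distros."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `usb_events(read_dmesg())` right after a failure, before rebooting the SBC, shows whether the board actually vanished from the bus or whether the host side never saw a disconnect at all.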

[Update]

After it occurred again, I tried to reconnect immediately, which resulted in Klipper hanging indefinitely in the “Startup” phase. No number of firmware restarts solved it, only rebooting the SBC. It’s also interesting that Mainsail showed the MCU with its load during startup, meaning Klipper was able to get an initial connection. So is it an MCU firmware issue?

Have you checked your bulk (12V/24V) power supply?

An undervoltage or a voltage spike could cause what you’re seeing.

The interruptions happen well after the print has started, usually hours in. If it were a PSU issue and the voltage was sagging, it would probably happen during the heat-up phase. The board is also (theoretically) voltage-regulated in two separate spots. I will try switching PSUs anyway.

Do you have an old laptop or PC? If so, you could replace the Odroid XU4 and/or OrangePi 3LTS with it and see if that works.

I’ve not tried a third platform yet, and I think that I’m going to do just that.

But first I want to try connecting it via UART instead of USB. Given the symptoms, it looks like something is truncating or obstructing the connection specifically between the port and the MCU. It might be worth checking whether it’s a USB issue.

I’m inclined to think that something on the board has simply stopped working. Judging by the schematics, probably a filter or a varistor.