Printer Model: Zero G Nebula
MCU / Printerboard: SKR EZ V3 with BTT PI V2
Host / SBC
klippy.log klippy (8).zip (1.8 MB)
Describe your issue:
…
Hi all,
At a loss here and looking for assistance.
Between one and two hours into a print I get the MCU communication error, every time.
Can't finish a print.
This usually happens when I am printing ABS in long runs of 4+ hours.
Fail usually happens around 2 hours in.
Ran Graph Klipper Stats and it seems that my MCU is booting in and out for the entire print and at some point decides to shut down.
In the graph the load seems to go above 250%.
Your log only contains the MCU 'mcu' shutdown: Missed scheduling of next digital out event error, but not the one from your screenshot. In any case, both typically have quite similar reasons. Usually and unfortunately, they are tedious to diagnose, often due to subtle hardware instabilities or effects from third-party modifications.
You are correct.
I get both messages regarding the MCU. I must have mixed up the logs, but the behavior is the same.
Upgraded to all the latest versions and disabled KAMP to see if that was the issue, but unfortunately the behavior is the same.
I am starting to think that the 5160T Pros are the issue here, as I also spotted two undervoltage alarms in the logs on X and Y.
Going to double check the wiring.
Rechecked all wiring. There was one suspect connection on the 5160T, so I replaced the ferrule on a motor wire to the X axis.
Did another print which failed again an hour in with an EBB CAN error.
So ABS runs with a hot bed and a hotter hotend. When you run a colder material, does it run ok past the 2 hour mark every time?
I am asking because I once had an issue with layer shifts at 1 to 2 hours in, and it was the heat building up through all the hardware and doing its worst on a stepper driver. Maybe you are facing something similar now, but it is affecting another element? If you can categorically eliminate heat, you are one step closer, I think.
In that case I suspect the EBB, which is probably mounted right against the extruder stepper which also gets hot?
The coil cannot be the issue I think, as you are not using it during printing.
This is not indicative of an issue in the first place.
What is an issue is that, according to the log, you have bytes_retransmit as well as bytes_invalid in your communication.
This points to either hardware issues or potentially kernel issues on the host. More on this here.
I kept an eye out for whether bytes_invalid showed anything other than zero (0).
But for bytes_retransmit: if that shows a value other than zero, is there an issue?
I think the "overload" that you see on the screen is indeed not that indicative, as the CB2 that I am using has 4 cores, so it might show > 100%.
Now running a script to see the memory and CPU values live from the host CB2.
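For anyone who wants to do the same, here is a minimal sketch of such a monitoring script. It is not the script used in this thread; it assumes a Linux host and reads `/proc/loadavg` and `/proc/meminfo`. The function names are made up for illustration. It also shows the arithmetic behind the "> 100%" remark: a load average of 2.5 on a 4-core host is about 62.5% of total capacity. Whether the percentage in the graph maps to load average in exactly this way is an assumption.

```python
import os
import time


def parse_loadavg(text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def parse_mem_available_kb(text):
    """Return MemAvailable in kB from the contents of /proc/meminfo."""
    for line in text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def load_percent_of_capacity(load, cores):
    """Scale a load average to a percentage of the total capacity of all cores."""
    return 100.0 * load / cores


def monitor(interval_s=5.0, samples=None):
    """Print load and available memory every interval_s seconds (Linux only).

    samples=None runs until interrupted with Ctrl+C.
    """
    cores = os.cpu_count() or 1
    count = 0
    while samples is None or count < samples:
        with open("/proc/loadavg") as f:
            load = parse_loadavg(f.read())
        with open("/proc/meminfo") as f:
            mem_kb = parse_mem_available_kb(f.read())
        pct = load_percent_of_capacity(load, cores)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} load={load:.2f} ({pct:.0f}% of {cores} cores) "
              f"mem_available={mem_kb // 1024} MiB")
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_s)
```

Calling `monitor()` over SSH while a print is running prints one line every five seconds; `monitor(samples=3)` stops after three lines.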
Without wanting to sound rude, it would really help if you read the provided information and follow it. There is unfortunately no “press button A solution”.
I understand. I was thinking out loud in my last comment, not asking for a "press button A" solution, but I can understand how that can come across. Anyway, I am reading up on your link on the Klipper site.
Do appreciate the assistance!
I think my problem has been solved.
It was indeed the bytes_retransmit issue: the count kept increasing until the buffer was full and the MCU eventually shut down.
Ran the same 4-hour print and it finished without issues.
Checked the log: no increasing bytes_retransmit.
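Checking the log by hand gets tedious, so here is a rough sketch of how the counters could be pulled out of klippy.log automatically. It assumes the usual layout of the `Stats` lines, where each MCU section starts with `<name>: mcu_awake=` and contains `bytes_retransmit=` and `bytes_invalid=` fields; if your log looks different, the regular expressions need adjusting. The function names are made up for illustration.

```python
import re
from collections import defaultdict

# Start of a per-MCU section inside a "Stats" line, e.g. "EBBCan: mcu_awake=".
SECTION_RE = re.compile(r"(\w+): mcu_awake=")
# The two counters of interest.
FIELD_RE = re.compile(r"\b(bytes_retransmit|bytes_invalid)=(\d+)")


def retransmit_history(lines):
    """Return {mcu_name: [(bytes_retransmit, bytes_invalid), ...]}, one tuple per Stats line."""
    history = defaultdict(list)
    for line in lines:
        if not line.startswith("Stats "):
            continue
        marks = list(SECTION_RE.finditer(line))
        for i, mark in enumerate(marks):
            # A section runs until the next MCU section or the end of the line.
            end = marks[i + 1].start() if i + 1 < len(marks) else len(line)
            fields = dict(FIELD_RE.findall(line[mark.start():end]))
            if "bytes_retransmit" in fields:
                history[mark.group(1)].append(
                    (int(fields["bytes_retransmit"]),
                     int(fields.get("bytes_invalid", 0)))
                )
    return dict(history)


def growth(history):
    """Return {mcu_name: (retransmit_increase, invalid_increase)}.

    Only positive steps between consecutive samples are summed, so a counter
    that resets after a Klipper restart does not produce a negative total.
    """
    result = {}
    for name, samples in history.items():
        pairs = list(zip(samples, samples[1:]))
        retransmit = sum(max(b[0] - a[0], 0) for a, b in pairs)
        invalid = sum(max(b[1] - a[1], 0) for a, b in pairs)
        result[name] = (retransmit, invalid)
    return result


def report(path):
    """Print the counter growth per MCU for the klippy.log at the given path."""
    with open(path, errors="replace") as f:
        for name, (retransmit, invalid) in growth(retransmit_history(f)).items():
            print(f"{name}: bytes_retransmit +{retransmit}, bytes_invalid +{invalid}")
```

For example, `report("klippy.log")` prints one line per MCU. A bytes_retransmit value that keeps climbing during a print is the pattern described above.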
What did I do? Solution:
The CAN0 file was set to the following:
allow-hotplug can0
iface can0 can static
bitrate 1000000
up ifconfig $IFACE txqueuelen 1024
According to the Klipper documentation, this should not be set to 1024 but to a maximum of 128.
I SSH'd into the host, looked up the file, and changed the text to the following, as described in the Klipper troubleshooting documentation:
allow-hotplug can0
iface can0 can static
bitrate 1000000
up ip link set $IFACE txqueuelen 128
The above change seems to have solved my issue.
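To confirm that the new value is actually active after a reboot, something like the sketch below could be used. It assumes a Linux host that exposes the queue length at `/sys/class/net/<interface>/tx_queue_len`; the function names are made up for illustration, and the expected value of 128 is simply the one from the config above.

```python
from pathlib import Path


def read_txqueuelen(iface="can0", sysfs_root="/sys/class/net"):
    """Return the transmit queue length the kernel currently uses for iface."""
    return int((Path(sysfs_root) / iface / "tx_queue_len").read_text().strip())


def check_txqueuelen(iface="can0", expected=128, sysfs_root="/sys/class/net"):
    """Return (matches_expected, actual_value) for the interface's queue length."""
    actual = read_txqueuelen(iface, sysfs_root)
    return actual == expected, actual
```

Running `check_txqueuelen()` on the host returns `(True, 128)` when the setting is in effect, and `(False, 1024)` if the old value is still being applied.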
Lucky me, because I was about to pull the CAN bus and run the hotend wired to the mainboard. Already had the wiring ready for it, haha! Guess I get to use that for a new build.
Thanks @Sineos for pointing me in the right direction!!!
I hope everyone that sees this thread and has the same issue can fix it now.
It was a very frustrating issue to find.