MCU 'mcu' shutdown:

It’s really the same one. It’s just that Klipper shuts down on a missed heater event rather than leaving an event queued up that it can’t respond to:

Sorry, the only suggestion I have for you is to start turning off accessories (webcams would be next on my list, followed by other apps you have running) until you get a good print. Then start putting them back, in the reverse order that you took them out, until things stop working again. At that point you have the cause of your problem.

I don’t have webcams and stuff, so I’m not sure what I can disable…

Don’t know…
…regarding your klippy.log “klippy0312_A.log”…



Don’t know about the bandwidth highs!

Good luck, hcet14


Thank you hcet14!

So there is something that consumes bandwidth, right? Hmm…

What are the possible consumers in this case?

I am still working on the motion report, and I got this motion graph. I am still confused about why the dataset is all zeros. Am I supposed to define the time length when I run the command, or does motion_report.py have to be running while the printer is printing? Thanks!

I have no clue and think that is the wrong direction I pointed to!

The picture above was made with the old version of Sineos’ excellent “Graphstats webpage” tool. You may try the new version, which has many more features: https://klipper.discourse.group/t/rework-of-the-graphstats-webpage.

I think you can see the error very well here (pink)


I have now spent an hour trying to find anything curious in your latest klippy.log. Not much success :unamused:

Line 3237: "Timeout with MCU 'MKS_THR' (eventtime=2342.829690)"
is the error just before it. That “could” point to the CAN bus?

You might follow Sineos’ “Knowledge Base” article: https://klipper.discourse.group/t/missed-scheduling-of-next-digital-out-event. The first thing I would do is work through “Disk errors / dying SD card” and try a new card, since it is the easiest and cheapest thing to do.

Again, good luck!

Thank you very much for spending your time on this!

I am using an EMMC. Do you still think that it could be the issue and that I should try an SD card instead?

It’s an interesting problem and a good training session for me to get more familiar interpreting a frickeling klippy.log :wink:

Both are silicon-based memories. They wear out and then cause problems. I don’t know how many write cycles an EMMC is specified for. I would get a new EMMC, or switch to an SD card if that is cheaper.

For some reason this doesn’t work for me - I don’t feel I am being trained and I don’t feel I do things out of understanding them. Anything electronics or mechanics, I can get, but all the Linux/communication/MCUs I can’t get my head around.

Anyway, I copied the OS image to an SD card, and it’s already running. Will report shortly.

Years ago I promised myself “Windows 10” will be my last version. Right now, I’m writing on a “Windows 11” system. Redmond sucks, after the death of “Windows 11” I’ll leave Redmond for good :rofl:

:laughing:

Thank you very much for this! I did discover the old version, but was not aware of this one. It looks neater, showing all the MCUs separately and lots of other things that I don’t know what they mean LOL (srtt???)

Good question, I have no clue. Maybe someone can enlighten us?

What is srtt, rttvar, and rto?
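For anyone reading along: these three names match the usual TCP-style round-trip timers. srtt is the smoothed round-trip time, rttvar is the smoothed round-trip variation, and rto is the retransmit timeout derived from the two. I am assuming here that the values in the log follow the classic RFC 6298 style estimator; the exact constants Klipper uses may differ. A minimal sketch of that estimator, where the class name and the `min_rto` / `max_rto` defaults are illustrative and not taken from Klipper:

```python
class RttEstimator:
    """RFC 6298 style estimator (illustrative sketch, not Klipper's code)."""

    def __init__(self, min_rto=0.025, max_rto=5.0):
        # min_rto / max_rto are assumed clamp values, in seconds.
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None    # smoothed round-trip time
        self.rttvar = None  # smoothed round-trip variation
        self.rto = min_rto  # retransmit timeout

    def update(self, sample):
        """Feed one measured round-trip time (seconds); return the new rto."""
        if self.srtt is None:
            # First measurement initialises both averages.
            self.srtt = sample
            self.rttvar = sample / 2.0
        else:
            # The variation is updated first, using the previous srtt.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample)
            self.srtt = 0.875 * self.srtt + 0.125 * sample
        rto = self.srtt + 4.0 * self.rttvar
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto


if __name__ == "__main__":
    est = RttEstimator()
    for rtt in (0.001, 0.001, 0.020, 0.001):
        print(round(est.update(rtt), 6), est.srtt, est.rttvar)
```

Read this way, a steady link keeps srtt and rttvar small and flat, while a link with delayed or lost acknowledgements shows rttvar and rto climbing.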

Is there a setting for automatic log rollover?

After replacing the SD card I started a print, and it was the best print so far, as far as I watched it. It printed for almost 2 hours (I think) without crashing (all the previous attempts crashed before reaching the 1 hour mark, at around 40 minutes to be more specific). Then I left it printing, but in the morning I discovered that it had crashed.

The thing is, Sineos’ graphing script reports that a log rollover occurred exactly at midnight, and it displays no useful info before or at the crash. However, the log file is very large, meaning it did record a lot of data. A bug in the script, maybe?

klippy0512_A.zip (2.2 MB)
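On the rollover question: a rollover exactly at midnight is what Python's standard `TimedRotatingFileHandler` does when it is created with `when="midnight"`. I am assuming Klipper's log writer is built on that handler; I have not verified the arguments it actually uses, and the function name and `backup_count=5` below are illustrative. A minimal sketch of the mechanism:

```python
import logging
import logging.handlers


def make_midnight_handler(path, backup_count=5):
    """Create a log handler that starts a new file at local midnight.

    backup_count is an assumed value: how many dated files
    (for example klippy.log.2024-12-05) are kept before the oldest is deleted.
    """
    handler = logging.handlers.TimedRotatingFileHandler(
        path, when="midnight", backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(message)s"))
    return handler


if __name__ == "__main__":
    log = logging.getLogger("demo")
    log.setLevel(logging.INFO)
    log.addHandler(make_midnight_handler("demo.log"))
    log.info("this line goes to demo.log until the next midnight rollover")
```

If that assumption holds, the data from before midnight would sit in a dated sibling file next to klippy.log rather than in klippy.log itself, which would explain a graph that starts at the rollover.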

So I ran the print again, and around the 50 minute mark it crashed.
klippy0512_B.log (3.2 MB)

Graphing the log, I don’t see a correlation between the loads/bandwidth and the crash event. What I do see is that the crash happened right after an MCU ready_bytes spike:

What would that mean…?

Increasing ready_bytes count usually indicates an issue with the communication.

This is also confirmed by the following MCU 'mcu' shutdown: Timer too close error.
This error is a system (often hardware) instability and the known reasons / possible solutions are listed in the topic that @mykepredko already linked above. Unfortunately, there is not much to add, otherwise.
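If you want to look for ready_bytes spikes without the graphing page, a short script can pull them out of the "Stats" lines in klippy.log. This is only a sketch: it assumes each Stats line looks like `Stats <time>: ... <mcu name>: key=value key=value ...`, and the function names and the `threshold` default are made up for illustration.

```python
import re
import sys

STATS_RE = re.compile(r"^Stats ([\d.]+): (.*)$")


def parse_stats_line(line):
    """Return (eventtime, {section: ready_bytes}) or None for non-Stats lines."""
    match = STATS_RE.match(line.strip())
    if match is None:
        return None
    eventtime = float(match.group(1))
    section = None
    ready = {}
    for token in match.group(2).split():
        if token.endswith(":"):
            # A token like "mcu:" starts the block of values for that section.
            section = token[:-1]
        elif token.startswith("ready_bytes=") and section is not None:
            ready[section] = int(token.split("=", 1)[1])
    return eventtime, ready


def find_spikes(lines, threshold=1000):
    """List (eventtime, section, ready_bytes) entries at or above threshold."""
    spikes = []
    for line in lines:
        parsed = parse_stats_line(line)
        if parsed is None:
            continue
        eventtime, ready = parsed
        for name, value in ready.items():
            if value >= threshold:
                spikes.append((eventtime, name, value))
    return spikes


if __name__ == "__main__":
    # Usage: python find_spikes.py klippy.log
    with open(sys.argv[1], errors="replace") as log_file:
        for eventtime, name, value in find_spikes(log_file):
            print(f"{eventtime:.1f} {name} ready_bytes={value}")
```

Comparing the reported times with the eventtime in the shutdown message shows whether the spike belongs to the main 'mcu' or to the toolhead board.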

Well, I went ahead and added an additional 24V power supply, just in case. I cut all the excess length off the limit switch wires and made them exactly the right length. I checked the connections again. I used a thicker wire to connect the CAN cable shield to ground. I again made sure that high power wires do not run parallel to signal wires. I started the print again and this time it crashed with a different error:
Lost communication with MCU 'mcu'

I assume that this error is triggered by the same problem as the previous errors.

I followed pretty much everything listed in this topic.

I do not have additional hardware apart from KlipperScreen. I am not aware of any processes running in parallel. No webcams. The SD card has already been replaced. As far as I can tell, I don’t have any macros that overload the system.

Only thing I am not sure about is whether I used the correct Clock Reference. How do I confirm I used the correct one?

Did you always use the same Orca-sliced data in this topic?

You mean for testing purposes along this thread? Yes, mostly the same one, but today I just sliced a primitive (cylinder) in Orca and sent it to print - still crashed.

Now I had something new happen: I changed my can0 from:

allow-hotplug can0
iface can0 can static
bitrate 500000
up ifconfig $IFACE txqueuelen 1024

To:

allow-hotplug can0
iface can0 can static
bitrate 500000
up ifconfig $IFACE txqueuelen 2048

And the print crashed with a “timer too close” error. Then, after a firmware restart, about 5 minutes later and while the printer was idle, it failed with a Lost communication with MCU 'mcu' error.

EDIT:
Here is the latest log:
klippy0512_D.zip (1001.3 KB)
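To confirm that a txqueuelen change actually took effect and to see whether the kernel is counting errors on the CAN interface, the standard Linux sysfs counters can be read directly. A small sketch, assuming the interface is named can0 as in the config above; the function name and the choice of counters are mine:

```python
from pathlib import Path


def read_iface_counters(iface="can0", sysfs_root="/sys/class/net"):
    """Read the queue length and error counters for one network interface.

    Returns a dict; a value is None if the file is missing or unreadable.
    """
    base = Path(sysfs_root) / iface
    files = {
        "tx_queue_len": base / "tx_queue_len",
        "rx_errors": base / "statistics" / "rx_errors",
        "tx_errors": base / "statistics" / "tx_errors",
        "rx_dropped": base / "statistics" / "rx_dropped",
        "tx_dropped": base / "statistics" / "tx_dropped",
    }
    counters = {}
    for name, path in files.items():
        try:
            counters[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            counters[name] = None
    return counters


if __name__ == "__main__":
    for name, value in read_iface_counters().items():
        print(f"{name}: {value}")
```

Running it before a print and again after a crash shows whether the error or drop counters moved in between. `ip -details -statistics link show can0` reports similar information from the command line.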

For debugging purposes I commented out the include for the toolhead board in printer.cfg, so that Klipper doesn’t attempt to connect to it. I made the corresponding config changes to fool Klipper and force it to load (defined other pins for mandatory things like thermistors etc.).
Klipper connected and I started the same 5 hour print that was failing before reaching the 1 hour mark. So far it’s been “printing” for 2 hours (no filament, no hotend heating and no extruder rotating lol) and it looks like it is not going to crash.

This experiment rules out any other possible cause for the crashing aside from the CAN communication between toolhead and the main board.

I am using a 4-core shielded cable for the CAN, but the pairs are not twisted. Maybe this should be the next thing to be replaced…?