Fill out above information and in all cases attach your klippy.log file. Pasting your printer.cfg is not needed
Describe your issue:
Occasionally I have issues with my CAN bus setup and I’m not sure if it’s a software or hardware problem. Sometimes I get an error during printing and sometimes after some hours of idling. I attached the log from that day, with the stats at the beginning and end cut away. The error should be around here:
b'Got error -1 in can read: (19)No such device'
Timeout with MCU 'can0' (eventtime=190242.397807)
Transition to shutdown state: Lost communication with MCU 'can0'
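For anyone who wants to pull lines like these out of a long klippy.log, here is a minimal Python sketch. The function name `find_can_events` and the pattern labels are made up for illustration, and the patterns cover only the three messages quoted above, not every message Klipper can emit. The only extra step is translating the number in parentheses into its symbolic errno name using Python's standard `errno` table.

```python
import errno
import re

# Patterns copied from the three log lines quoted above; other Klipper
# messages are deliberately not covered. The labels are hypothetical.
PATTERNS = {
    "can_read_error": re.compile(r"Got error -?\d+ in can read: \((\d+)\)"),
    "mcu_timeout": re.compile(r"Timeout with MCU '([^']+)'"),
    "lost_communication": re.compile(r"Lost communication with MCU '([^']+)'"),
}


def find_can_events(log_text):
    """Return (line_number, kind, detail) for each matching log line."""
    events = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        # Forum software may turn straight quotes into curly ones.
        line = line.replace("\u2018", "'").replace("\u2019", "'")
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            detail = match.group(1)
            if kind == "can_read_error":
                # Translate the numeric errno into its symbolic name.
                detail = errno.errorcode.get(int(detail), detail)
            events.append((number, kind, detail))
    return events
```

Run against the excerpt above, the first event carries the errno name for code 19 (ENODEV, "No such device"), and the other two carry the MCU name `can0`.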
I’m not sure if I am clouding the OP’s issue, but I have very similar issues. I have spent a LOT of time getting CAN stable, and everything is mostly fine, but I have lost faith in long prints. In the attached log, the print errored out after a number of hours (5-10, not sure; I was not monitoring). At idle things seem fine, but during my last print, which had a cylindrical shape (so many steps), I got timeouts and the print stopped. klippy.log (354.2 KB)
Looking at the logs, I’m not even sure CAN is at fault. I might get flamed for not having the correct details.
I’m doubtful this is a hardware error. The fact that the print can run for an extended time (hours) without any issue suggests that something goes wrong somewhere in the software stack. That might very well be a naive statement, but in my experience hardware issues typically manifest much faster.
For those of you that are more fluent in interpreting Klippy logs:
Is it apparent that the loss of communication with the MCU is in fact caused by a CAN error?
I find diagnosing low-level CAN issues somewhat frustrating, as the log does not say much. Even the actual CAN stats rarely show errors or retransmits, yet the CAN interface is no longer operative. It can generally be remedied by a firmware restart; no power cycle is required.
I’m not sure exactly what this indicates, other than that it could be a recoverable state if managed by Klipper. Of course, this would be a deal killer for the ‘real-time’ timing requirements.
Is there somewhere I can look to see more detail on the exact nature of the errors? I have read that using an Emergency stop within some timeframe after an error will produce additional log data, but in my situation that is rather difficult to do, since the print fails at random, and sitting in front of the interface for 8 hours is not very practical.
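One way to avoid sitting in front of the interface is to sample the host-side interface counters unattended. Below is a rough Python sketch, assuming a Linux host where the interface is named `can0` and exposes the usual per-interface counters under `/sys/class/net/<iface>/statistics`. The function names are hypothetical, and this only sees what the host kernel reports, so in bridge mode it may not reflect what happens on the bus itself.

```python
import os


def read_iface_counters(iface="can0", root="/sys/class/net"):
    """Return {counter_name: value}, or None if the interface is absent."""
    stats_dir = os.path.join(root, iface, "statistics")
    try:
        names = sorted(os.listdir(stats_dir))
    except FileNotFoundError:
        # The interface does not exist on the host right now.
        return None
    counters = {}
    for name in names:
        with open(os.path.join(stats_dir, name)) as handle:
            counters[name] = int(handle.read().strip())
    return counters


def changed_counters(before, after):
    """Return only the counters whose value differs between two samples."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }
```

Calling `read_iface_counters()` every few seconds from a loop and appending any non-empty `changed_counters(...)` result to a file with a timestamp would leave a trail to compare against the klippy.log timestamps. A `None` result is itself worth logging, since it means the interface was not present on the host at that moment.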
I can leave the printer at idle for days. The comms between the toolhead (EBB42) and the controller (Octopus) have no issues, and the temperature of the hotend updates a couple of times per second. CAN stats look healthy (but I’m not sure they report the truth, at least in bridge mode):
This is at 1 Mbps.
@koconnor is there a way to get more detailed can debug info from the bridge MCU?