I appreciate the information; I had already read many of those links. However, I had not found the Timeout with MCU link in my searches, and I am not sure how that slipped through my search ‘net’. Thanks for that.
I wonder whether one of the TMC2209 drivers going bad could cause this issue. The first errors were tmcuart_response timeouts, which made me think the driver itself might have failed. I have had bad drivers before, but usually they show up as missed steps or just “off” in terms of accuracy.
I had meant to follow up last night with a few additional points.
- As part of my first attempts to resolve the TTC, I added active cooling to the RPI and reimaged onto a brand new SanDisk Ultra card (it was already running on a SanDisk Ultra, but an older one).
- This was built “Doom” style, so there are no high voltage wires in the same compartment as the electronics. All mains power is underneath, all LV is on top. So EMI is less of a suspect, but I still try to run the CAN cables on their own, away from the 24 VDC/5 VDC power lines.
- I have read about HH causing issues with Klipper. Unfortunately I found that out after I had ripped out half my wiring and purchased the extra electronics.
- I understand that 3rd party ‘add ons’ are not supported or guaranteed. I am not looking for support for them; I was just asking a specific question about bytes_retransmit.
This link that you provided does seem to support my suspicion that the MCU (Spider 1.0 board) is faulty/dying.
Timeout with MCU / Lost communication with MCU occurs when the host does no longer receive data from the MCU. It can be the wiring or something made the MCU crash (e.g. due to some faulty hardware), so it just no longer reacts.
The bit that stands out is “or something made the MCU crash (e.g. due to some faulty hardware)”. I started to suspect that when I saw errors first on UART and then on CAN. I swapped to CAN because I had modified my UART cable to take out the 5V power when I employed the RS25 to power the RPI, and I thought I had possibly borked the connector, cable or connection somehow, especially given the bytes_invalid on the UART connection. I wish there was an MCU-specific log file to scrape/review, like putting a specific MCU into “debug logging” mode. I am learning more about the various deeper-dive Klipper troubleshooting tools; does something like that exist?
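One way to get part of the way there is to scrape the periodic Stats lines in klippy.log, which carry the bytes_retransmit and bytes_invalid counters for each MCU. The following is only a minimal sketch, not an official Klipper tool. It assumes each Stats line looks roughly like `Stats <time>: ... <mcu name>: key=value key=value ...`, and the helper names (`parse_stats_line`, `counter_increases`) and the `ebb` MCU name used in testing are hypothetical. It reports every moment a counter grows, so the jumps can be lined up against what the printer was doing at the time.

```python
import sys

# Counters of interest; both appear in the per-MCU part of a Stats line.
COUNTERS = ("bytes_retransmit", "bytes_invalid")


def parse_stats_line(line):
    """Return (timestamp, {section: {key: value}}) or None for non-Stats lines.

    Assumed layout (not confirmed here):
      Stats 1234.5: gcodein=0 mcu: bytes_retransmit=9 bytes_invalid=0 ...
    """
    if not line.startswith("Stats "):
        return None
    head, _, rest = line.partition(": ")
    try:
        timestamp = float(head.split()[1])
    except (IndexError, ValueError):
        return None
    sections = {}
    current = None
    for token in rest.split():
        if token.endswith(":") and "=" not in token:
            # A bare "name:" token starts a new section (e.g. one MCU).
            current = token[:-1]
            sections.setdefault(current, {})
        elif "=" in token and current is not None:
            key, _, value = token.partition("=")
            sections[current][key] = value
    return timestamp, sections


def counter_increases(lines):
    """Yield (timestamp, section, counter, delta) each time a counter grows."""
    last = {}
    for line in lines:
        parsed = parse_stats_line(line)
        if parsed is None:
            continue
        timestamp, sections = parsed
        for name, fields in sections.items():
            for counter in COUNTERS:
                if counter not in fields:
                    continue
                try:
                    value = int(fields[counter])
                except ValueError:
                    continue
                previous = last.get((name, counter))
                # Always store the new value so a restart (counter reset)
                # simply becomes the new baseline.
                last[(name, counter)] = value
                if previous is not None and value > previous:
                    yield timestamp, name, counter, value - previous


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for ts, name, counter, delta in counter_increases(log):
            print(f"{ts:10.1f}  {name:12s} {counter} +{delta}")
```

Run it as `python3 scrape_stats.py ~/printer_data/logs/klippy.log` (the script name and log path are assumptions; adjust to the actual install). If the increases cluster on one MCU regardless of whether it is connected over UART, CAN or USB, that would point at the board rather than the cabling.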
I am going to reflash the main board for USB communication and see if there are still errors (I am anticipating that there will be). But because I may have screwed up the UART cable (not sure how I could have just by removing the 5V pins, but who knows) and the CAN transceiver pins on the 1.0 Spider boards were not really recommended (I forget where I read that), it is good to rule out all possibilities.
I have also been thinking about changing from an SD card to an NVMe drive on the RPI. That should help with the HH-related issues, and now that I no longer use the UART pins on the RPI I can use an NVMe HAT. Oh, and the RPI is a 4B 2GB. I never see the memory completely flatline, but I was thinking of upgrading to a 4GB model, though that is just a last resort.
I am very much open to suggestion here and appreciate the feedback. Thanks!