PSA: CanBus on 32 Bit vs 64 Bit Pi's - "Communication timeout" errors

Okay, I’m just testing the settings from that repo (1024). After this I will lower the txqueuelen down to 128 and see how that goes.

Just doing a torture test of three Benchys spaced wide apart with the machine set to ridiculous speeds. If anything is going to go wrong, it will do so under these conditions.

So far, switching to the older 32-bit OS seems to be far better for me; not sure if others can report the same. I have another machine to sort out with an RPi 5. I will duplicate these settings on that one and report here if all goes well; then at least others will know that 32-bit is more stable for Pi 3/4/5s.

My results:

Setting txqueuelen to 128 throws errors and I can’t start a print at all. The following are the resulting errors:

Move queue overflow
tmcuart_timeout

I had zero transmit errors as well.

With txqueuelen set to 2048, the machine is printing perfectly, with no issues so far.

Txqueuelen is a setting for the buffer size (filled before transmit). If an error occurs, the buffer has to be filled before retransmitting.
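For anyone who wants to check which value is actually in effect, here is a minimal Python sketch. It assumes the CAN interface is named can0 (the interface name is not stated in this thread) and reads the standard Linux sysfs file tx_queue_len. The helper names are made up for illustration.

```python
from pathlib import Path


def read_txqueuelen(iface="can0", sysfs_root="/sys/class/net"):
    """Return the transmit queue length currently set on a network interface.

    The interface name "can0" is an assumption; adjust it to your setup.
    """
    path = Path(sysfs_root) / iface / "tx_queue_len"
    return int(path.read_text().strip())


def ip_link_command(iface, length):
    """Build the `ip link` argument list that would change the queue length.

    This only builds the command; running it needs root privileges.
    """
    return ["ip", "link", "set", iface, "txqueuelen", str(length)]
```

Calling read_txqueuelen() on the Pi should report the value in effect, and the list returned by ip_link_command("can0", 1024) can be passed to subprocess.run to try a different value without rebooting.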

If you are having issues with 128, then the 3B+ cannot keep up, so it makes sense to use the higher number.

Keep in mind that too large a number could end up causing ‘Timer too close’ errors, so it is best to find the lowest number that works without errors tying up the bus resending commands.
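To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It assumes a 1 Mbit/s bus (the bitrate is not stated in this thread) and that every queued frame is a classic CAN frame with an 11-bit ID and 8 data bytes, ignoring stuff bits. It is an estimate of the worst-case time a completely full queue needs to drain, not a measurement.

```python
# Approximate size of one classic CAN data frame with an 11-bit ID and
# 8 data bytes: 108 frame bits plus 3 interframe bits, stuff bits ignored.
BITS_PER_FULL_FRAME = 111


def queue_drain_seconds(txqueuelen, bitrate_bps=1_000_000):
    """Estimate how long a completely full transmit queue takes to empty.

    The 1 Mbit/s default is an assumption, not a value from this thread.
    """
    return txqueuelen * BITS_PER_FULL_FRAME / bitrate_bps


for length in (128, 256, 1024, 2048):
    ms = queue_drain_seconds(length) * 1000
    print(f"txqueuelen {length}: about {ms:.1f} ms to drain when full")
```

The point is only that the delay a full queue can add grows linearly with txqueuelen, which fits with picking the lowest value that runs without errors.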

I’ve found 1024 to be good enough on my Pi 3B+, so I’m surprised you need such a high value for a Pi 4.

That said, I’m going to double check what I’m using ATM, as I’ve forgotten already.

IIRC, he mentioned he’s testing this on a Pi 4.

Yes, you’re correct, it’s a Pi 4.

It appears that nothing truly resolves my issues. I tried 128, 256, 1024 and 2048, and all of them result in a random error. Some happen before the print starts, like “No trigger on z after full movement”; some happen during the print, like Tmcuart No Response, timeout issues, or heater errors that are not really heater errors, as the PWM just goes to zero for no apparent reason. I ran the heaters up and did an hour-long speed test to see if it was vibration that caused it, not just on one machine but on many, all running similar hardware. It doesn’t matter if I’m using BLTouches, Microprobes or eddy current type sensors.

All of these issues have come about since I started using multi MCU, in both USB and CAN modes. Each thing I do seems to fix it, until it doesn’t. I have also tried a massive array of different cables for the tool boards. I think I will resign myself to the fact that all of the Makerbase gear I have is nothing but junk. The only machine in my arsenal of printers that is 100% reliable is my jailbroken K1C, so my fix for all these issues is to carry all of the machines I have outside to the verge for roadside collection and purchase a couple more K1s, as they just work.

I have had all of the multi MCU issues found in this forum and the hardware I have been using is the following:

MKS PI
Pi3 A and B
PI4
PI5
THR36
MKS-SKIPR
BLTouch
BTT Microprobe
Bed Distance Sensor

Issues persist with both 64-bit and 32-bit operating systems, on both MainsailOS and Raspbian Lite.

Issues persist with multi MCU in both USB mode and CAN mode.

No transmission errors have ever been seen or logged when these shutdown events happen.

I assume there is a good hardware combination for multi MCU systems, just not mine. The K1C demonstrates this with its very resource-limited boards, so it is not a CPU performance issue or anything of that nature.

I have exhausted all options now, and my time is worth too much to me to continue messing about with this junk hardware. I will be very selective with the hardware for my latest prototype printer design, as I will not be using MKS gear again; I will look at using BTT or Mellow instead.

So just as I was typing this, I had another crash:

Heater extruder not heating at expected rate
See the ‘verify_heater’ section in docs/Config_Reference.md
for the parameters that control this check.

The stupid thing is, the extruder heater did not falter; it stayed perfect right up to the shutdown. It was the bed heater: the PWM went to zero, the bed dropped from 100C to 90C, and the machine went into shutdown. Just mental. There is nothing wrong with the bed or the heater; I can set it to 100C and leave it for hours, and there are no dry joints or bad connectors. This failure only happens while printing. I have another machine doing this exact same thing, and I actually thought it was the extruder heater. I had replaced it twice, then soldered the wires directly to the THR36 in case it was the JST and screw connectors, but it turns out there is no fault with the extruder heaters.

I wouldn’t go that far. I use an MKS DLC32 in my CNC machine and it runs grblHAL very well. That said, perhaps they just aren’t suited to running a Klipper CanBus based system.

Most folk I know running CanBus are using BTT CanBus boards without any issues, so perhaps they are the most reliable option.

I was a reseller for Makerbase. They do make some nice gear for the price; documentation is extremely lacking, but I could say the same for other brands.

I have done a few builds using just the SKIPR board with its built-in SoC on its own, and they are fine. It’s everywhere I have gone with a multi MCU setup that I’ve had issues; the common denominator is the THR36 with the RP2040. For my next build I have designed the toolhead to use the Mellow board, which comes with a really high quality CAN cable. I just don’t think I will be using a SKIPR board though; I will go with the best I can get from BTT or Mellow and take the punt. I prefer to spend my time creating in CAD and printing my designs rather than chasing down weird software/hardware bugs. So much of my time has been wasted over the past 12 months that I have not been able to finish off some of my major projects, which I really want to publish for others to download and build: mainly a large format high performance printer, a multi material system and an auto-retracting sealed filament box. I have little time available.

Just a question: Has anyone tried the USB to CAN adaptors on USB 2 (black) or USB 3 (blue) of the RPi 4 to see if the issue is the port selection because the 3B+ only has USB 2 ports?

Even though the BTT U2C uses a USB-A to USB-C cable between it and the RPi, that doesn’t mean it can use USB-C speeds (it is just a connector); it could be using USB 2.x or 1.x speeds. The RPi 4 and 3 models all have USB-A connectors, not USB-C, for external devices.

The RPi 4s are notorious for having very noisy USB 3 ports. You can easily kill your WiFi by placing an RPi 4 beside the router and having a USB 3 SSD attached.

This is known for USB 3 ports in general, but it seems especially bad for the RPi 4s. I experienced this firsthand when I tried to use a Pi 4 as the basis for a Zigbee controller.

Don’t know if this might have an impact on a CAN line as well.

Although I don’t use the U2C boards, as this feature is integrated into the SKIPR boards, I have tried both USB 3.0 and 2.0 ports. Both fail equally with my hardware; there is no difference. This goes for all my Pis, including the MKS PI.

Things that cause these multi MCU errors to show up more often are: having more than one web interface connected to the same machine at the same time (LAN or WiFi), having an HDMI/USB touch screen attached, and having a USB camera attached.

Things I have tried are industrial powered USB 3.0 and 2.0 hubs, many different low and high quality USB 2.0 and 3.0 cables, short and long.

The best and most reliable results I had were with a Dell Optiplex USFF Intel PC with an i5 CPU, running Ubuntu Server. This was very stable, with all MCUs in USB mode, not CAN. However, it had other issues: the MCUs were not detected on bootup, which required a funky combination of CH340 USB relays with scripts to cut and reconnect the USB power on boot. It was problematic and not reliable on its own, but once running, it was solid.

I wish I had kept copies of my logs as I’m sure someone could have made some sense of it, but I have wiped and reinstalled multiple times on multiple machines now, all I can offer is second hand information and my experiences as I have a crap ton of hardware, now collecting dust.

I would like to chime in here. I recently added an EBB36 board to my printer, connected via CAN to the BTT Kraken’s built-in CAN ports. I have had about a 70% failure rate on prints since, and found this thread while trying to find an answer. As mentioned here, I am running the 64-bit Lite OS, all installed with KIAUH.

I am going to switch to the 32-bit version of the OS and report back. I found these threads before landing on this one.

How did you go?

Reporting back: I reinstalled Raspberry Pi OS using the 32-bit Lite image and installed Klipper using KIAUH as before. Same setup hardware-wise; I did not change or touch any wiring in any way.

I can report that the issue for me is 100% solved so far: several short prints, about 3 hours total, without a single shutdown or communication issue.

Also, checking the klippy logs, I am now seeing ZERO invalid bytes and retransmit bytes. Before, even at the very start of my prints, I was seeing 30-40 invalid bytes, which would climb into the thousands before the machine eventually shut down anywhere from 30 minutes to an hour into a print.
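For anyone wanting to run the same check without scrolling through the log by hand, here is a minimal Python sketch. It assumes the statistics lines in klippy.log carry the counters as bytes_invalid=N and bytes_retransmit=N key/value pairs; the function name is made up for illustration.

```python
import re

# Matches the two counters wherever they appear on a log line.
_COUNTER = re.compile(r"\b(bytes_invalid|bytes_retransmit)=(\d+)")


def summarize_counters(lines):
    """Return the highest value seen for each counter across all lines.

    With several MCUs on one line, the largest value of each counter wins.
    """
    peaks = {"bytes_invalid": 0, "bytes_retransmit": 0}
    for line in lines:
        for name, value in _COUNTER.findall(line):
            peaks[name] = max(peaks[name], int(value))
    return peaks
```

Passing an open klippy.log file object to summarize_counters() gives the peak of both counters for that log. A healthy bus should leave both at or near zero for the whole print.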

It appears that the 64-bit OS is not fully compatible with CAN, at least not with my setup.

Setup:
Rpi 3B+, BTT Kraken, BTT EBB36. CAN connected using the Kraken’s built in CAN port.

Every now and then, I get a timeout during homing, it’s very random and I can go for many prints without seeing it. So to test your theory, I’ve ordered a v2.1 U2C.

I’ll let you know if that eradicates the issue entirely.

Here is a link to a known stable firmware for the STM32G0B1.

https://www.dropbox.com/scl/fi/z3qsrfktste1emz4v6v4n/G0B1_U2C_V2.zip?rlkey=3lnqgjj7do5yqv996ccp537vl&st=gnmkp3u0&dl=0

Cheers, I’ll try that for sure.

I got the v2.1 up and running without any issues - that FW flashed perfectly too. I’ll let you know if the infrequent timeout issue comes back.

You were right, the STM32G0B1-equipped U2C is more stable than the v1. I haven’t had a single issue since upgrading. I’ve been printing multiple items per day, and it’s been flawless.
