'Timer too close' after CANbus conversion

Hello, I recently introduced CANbus, in the form of a BTT EBB36 and a BTT EDDY DUO, to my Voron Trident 250 running on a BTT Octopus v1.1 and an RPi3. I’m using the Octopus as a CAN2USB bridge for the CAN setup: EDDY_DUO-(CAN)->EBB36-(CAN)->OCTOPUS-(CAN2USB)->RPI3

Now, whenever I do a simple print (gcode/stl attached), I’m getting a consistent and rising bytes_retransmit on the Octopus MCU from about 1 minute into the print (not on the Eddy nor the EBB36), leading to a shutdown at around 2k retransmits with the error message “Timer too close”. That occurs roughly 7 minutes into the print, but not always on the same layer, so I don’t think it’s because of the gcode.

What I tried:

  • Redid the CANbus harness 3 times, each time using shielded cable grounded on one side to the PSU.
  • Tried powering the EBB36 from either the Octopus or the PSU.
  • Tidied up the stepper motor cables, routed them as far from other data-carrying cables as possible, and shortened them.
  • Reflashed the whole printer multiple times, trying 250k, 500k and 1M bitrates for the CANbus.
  • Tried slicing the model in Orca, Super and Prusa slicer.
  • Added a 24V 330uF capacitor to the EBB36’s power line.
  • Tried a second EBB36.
  • Tried different txqueuelens (10; 128; 1024).
  • Tried 4 different USB cables to connect the Octopus to the RPi.
  • Added additional cooling to the electronics bay.
  • Verified there is no undervoltage in dmesg on the RPi.

None of the above worked and I’m out of ideas, mostly…
I also noticed the MCU frequencies fluctuating quite a bit; not sure if that is normal, the cause, or a symptom…

Another issue is that the Eddy had consistently rising TX_RETRIES in the log, but when I monitored the stats live during the print, no packets showed as dropped and everything looked healthy…

I was also monitoring the RPi’s RAM and CPU usage alongside its temperature, and none of those values indicated that the RPi3 was throttling under load.
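
For reference, on a Raspberry Pi the throttling state can also be read directly from the firmware with vcgencmd get_throttled, which returns a bitmask. A quick way to decode one (the 0x50005 value here is a made-up example, not from my printer):

```shell
# Decode a hypothetical `vcgencmd get_throttled` bitmask (0x50005 is an
# invented example): bit 0 = undervoltage now, bit 2 = throttled now,
# bit 16 = undervoltage has occurred since boot
val=0x50005
(( val & 0x1 ))     && echo "undervoltage now"
(( val & 0x4 ))     && echo "throttled now"
(( val & 0x10000 )) && echo "undervoltage occurred since boot"
```

A reading of 0x0 means none of those conditions has occurred since boot.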

Next I’ll try changing the extruder heater block; I’ve ordered shielded cables for the stepper motors; I’ll ground all of the cable shields to an earth bus; and I’ll make a CANbus harness using a sturdier connector than the Micro-Fit Molex on the EBB36, redoing the CANbus harness for the 4th time… I also have an OrangePi R1 LTS lying around, so I’ll try using that instead of the RPi3.

At this point I think it’s either the Octopus CAN2USB bridge not keeping up, something in my config acting up (maybe sensorless homing), or a hardware issue on the Octopus board.

I would be very appreciative of any helpful input, troubleshooting tips, etc. relevant to solving this issue. I’ll happily provide any other logs, configs, pictures, info, whatever should be pertinent.

(I can’t upload multiple log graph images as a new user, so just use sineos(dot)github(dot)io)
printer.cfg (18.1 KB)
klippy (4).log (6.6 MB)


I’m sorry, I was limited to 2 attachments in the OP, here is the gcode.
Endstop-Y-no-logo_PLA_12m14s.gcode (1.3 MB)

Weird. Honestly, I’m not sure what is happening.
It is not common to see such high latency (RTO values).
It is a little weird that the USB CAN bridge has issues.

I would suggest keeping qlen at the default of 128.

As far as my understanding goes, a USB CANBUS bridge should simply forward or discard messages. I would expect that boards on the CAN bus cannot affect the bridge itself.
But I can be wrong.

BTW, what about termination resistors?
IIRC, the Octopus has a good one already present.
So the second one should probably be enabled on the EDDY or the EBB.

From the log, it seems like a CAN issue.
I would guess cable shields should be connected to GND (DC GND), not to earth ground.
Earth ground is nice to have, but here it is a PSU thing.

It seems to me like congestion on the CAN network (high RTO, clock sync resetting often). The reason is unclear.

This can add more information.

3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc fq_codel state UP mode DEFAULT group default qlen 128
    link/can 
    RX:  bytes packets errors dropped  missed   mcast           
        654756   86880      0     215       0       0 
    TX:  bytes packets errors dropped carrier collsns           
        263675   35197      0       0       0       0 
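
(For reference, the counters above come from iproute2; something like this prints them:)

```shell
# Print interface-level statistics for the CAN interface, including
# the RX "dropped" counter shown above
ip -statistics link show dev can0
```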

Something like this can help: sudo tc qdisc replace dev can0 root fq (or fq_codel)
It can force the kernel to share bandwidth between boards.

My canbus config example
#/etc/systemd/network/can0.network 
[Match]
Name=can0

[CAN]
BitRate=1M

[Link]
RequiredForOnline=no

[FairQueueingControlledDelay]
Parent=root
Handle=1
#/etc/systemd/network/can0.link 
[Match]
OriginalName=can0

[Link]
TransmitQueueLength=128

Alas, this is not a solution, just duct tape.

To monitor the status:

$ watch -n 1 -d tc -s qdisc show dev can0
...
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 550016 bytes 34376 pkt (dropped 0, overlimits 0 requeues 1801) 
 backlog 0b 0p requeues 1801
  maxpacket 16 drop_overlimit 0 new_flow_count 1801 ecn_mark 0
  new_flows_len 0 old_flows_len 0

Hope that helps.

Thanks for the help.

From what I understand, the Octopus already has termination present, and the EBB36 has a jumper pin to enable its resistor. If I measure between CANH/L I get 60 ohms on both ends, and between CANH/GND and CANL/GND I get 120 ohms, so I think that is in order.
Octopus:


EBB36:

From what I read on EMI mitigation (mostly from a Lincoln Electric doc on CNC plasma cutters), shields should be drained on one side only, to earth, with as low resistance as possible, so that is what I did. Since then I also tried draining it to the Octopus 24V GND and to the PSU 24V GND, neither of which behaved differently.

I’m currently running the recommended txqlen of 128.
sudo tc qdisc replace dev can0 root fq didn’t help; in fact the print failed at 43% instead of the usual 60-70%.

I also added

[FairQueueingControlledDelay]
Parent=root
Handle=1

to my network conf, but that didn’t help either.

Here is the can monitoring from the last print:


(The 491 dropped RX frames were present before I started the print and didn’t increase during the print)

From other issue tickets and threads I noticed a lot of instances of this error originating from faulty SBC storage (SD/eMMC).
I’ll try running the system with the OrangePi R1 Plus LTS, a new SD card and a clean install to rule out the RPi3 being faulty, and keep the thread posted.


I was hoping this would help. Maybe you happen to have the log saved?
I’m curious how the network stats are changing (if they do).

I started the qdisc log a little later than the klippy log, but they are from the same print; don’t mind the 1-hour discrepancy.
tc_log.txt (91.8 KB)
klippy (5).log (6.9 MB)


Hmmmmm, I reread all the logs.
It seems that everything just works okay and then suddenly there is a rise in RTO/RTTVAR (latency becomes large/unstable), sometimes with retries.
It is still confusing that the error counters are at zero (maybe they do not work as expected? IDK).
There is a +1 on the Eddy CAN TX, but this can be a simple coincidence.

If I were to assume that the software is okay and should detect CAN errors, then there should be USB host issues instead (anything in dmesg?).

Some packet rate shaping was happening (dropped count increasing); alas, that only means it does detect congestion on the bus (which we already know about):

========== Sat  8 Nov 21:31:40 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3152048 bytes 197003 pkt (dropped 0, overlimits 0 requeues 1084) 
 backlog 560b 35p requeues 1084
  maxpacket 16 drop_overlimit 0 new_flow_count 1323 ecn_mark 0 memory_used 33600
  new_flows_len 0 old_flows_len 1
========== Sat  8 Nov 21:31:41 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3162304 bytes 197644 pkt (dropped 10, overlimits 0 requeues 1089) 
 backlog 480b 30p requeues 1089
  maxpacket 16 drop_overlimit 0 new_flow_count 1326 ecn_mark 0 memory_used 28800
  new_flows_len 0 old_flows_len 1
========== Sat  8 Nov 21:31:42 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3173904 bytes 198369 pkt (dropped 35, overlimits 0 requeues 1092) 
 backlog 176b 11p requeues 1092
  maxpacket 16 drop_overlimit 0 new_flow_count 1329 ecn_mark 0 memory_used 10560
  new_flows_len 0 old_flows_len 1
========== Sat  8 Nov 21:31:43 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3186304 bytes 199144 pkt (dropped 55, overlimits 0 requeues 1095) 
 backlog 192b 12p requeues 1095
  maxpacket 16 drop_overlimit 0 new_flow_count 1332 ecn_mark 0 memory_used 11520
  new_flows_len 0 old_flows_len 1
========== Sat  8 Nov 21:31:44 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3198608 bytes 199913 pkt (dropped 75, overlimits 0 requeues 1097) 
 backlog 224b 14p requeues 1097
  maxpacket 16 drop_overlimit 0 new_flow_count 1334 ecn_mark 0 memory_used 13440
  new_flows_len 0 old_flows_len 1
========== Sat  8 Nov 21:31:45 GMT 2025 ==========
qdisc fq_codel 1: root refcnt 2 limit 10240p flows 1024 quantum 16 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 
 Sent 3200976 bytes 200061 pkt (dropped 78, overlimits 0 requeues 1102) 
 backlog 0b 0p requeues 1102
  maxpacket 16 drop_overlimit 0 new_flow_count 1335 ecn_mark 0
  new_flows_len 0 old_flows_len 0
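
As a rough sanity check on raw capacity (assuming worst-case classic CAN frames of ~130 bits including stuffing, at a 1M bitrate), the observed packet rate is nowhere near the bus ceiling:

```shell
# Back-of-the-envelope CAN bus load (assumption: ~130 bits per
# worst-case classic CAN frame, 1 Mbit/s bitrate)
bitrate=1000000
bits_per_frame=130
max_fps=$(( bitrate / bits_per_frame ))    # ~7692 frames/s ceiling
# From the 21:31:40 -> 21:31:45 window above: 200061 - 197003 packets in 5 s
observed_fps=$(( (200061 - 197003) / 5 ))  # ~611 frames/s observed
echo "ceiling=${max_fps} observed=${observed_fps}"
```

So if this is congestion, it looks like bursts/latency rather than sustained bandwidth.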

Something does not add up =\

If I were to assume that STM32F446, for some reason, does not report CAN errors, I would suggest trying this:

diff --git a/src/stm32/can.c b/src/stm32/can.c
index 223260b36..56d59b09c 100644
--- a/src/stm32/can.c
+++ b/src/stm32/can.c
@@ -193,6 +193,8 @@ canhw_get_status(struct canbus_status *status)
         status->bus_state = 0;
 }
 
+static uint8_t last_rec, last_tec;
+
 // This function handles CAN global interrupts
 void
 CAN_IRQHandler(void)
@@ -229,6 +231,14 @@ CAN_IRQHandler(void)
     uint32_t msr = SOC_CAN->MSR;
     if (msr & CAN_MSR_ERRI) {
         uint32_t esr = SOC_CAN->ESR;
+        uint8_t rec = (esr >> CAN_ESR_REC_Pos) & 0xFF;
+        uint8_t tec = (esr >> CAN_ESR_TEC_Pos) & 0xFF;
+        if (rec > last_rec)
+            CAN_Errors.rx_error += 1;
+        if (tec > last_tec)
+            CAN_Errors.tx_error += 1;
+        last_rec = rec;
+        last_tec = tec;
         uint32_t lec = (esr & CAN_ESR_LEC_Msk) >> CAN_ESR_LEC_Pos;
         if (lec && lec != 7) {
             SOC_CAN->ESR = 7 << CAN_ESR_LEC_Pos;

Just in case something weird happens with the LEC register state.
This is a crude solution, but if it still reports zero in this case, that would mean the issue is somewhere above the CAN HW stack.


I want to add: I only suspect the CAN HW specific code because, if there were an issue in the generic USB_CANBUS bridge code, everyone would have the same issues (even me). So I only suspect the chip-specific CAN implementation here. This is specific to the STM32F0/F1/F4, which can get less attention in the wild.

Or it is something above the USB CAN bridge, i.e. the host or the connection to it.
USB shares ground, so it should just work. If the SBC is powered from the same PSU as the motherboard (by a buck converter, for example), there should be virtually zero electrical issues.

(This is approximately how the stack looks in my head)
klippy -> serialqueue -> kernel(can0 -> usb) -> MCU(USB_CANBUS -> can.c)

After applying the patch to can.c and reflashing the Octopus, the logs remain similar (at least to my amateur eyes). There are also no dmesg errors, and the stats I could think of checking for USB and SD card health/errors don’t show anything alarming either.

printer@trident:~ $ sudo cat /sys/block/mmcblk0/stat
    5862     3056   401841    23997     1382     1558    29345    12160        0    13596    36158        0        0        0        0        0        0
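
(For anyone reading that stat line: the fields follow the kernel block-layer stat format; a quick way to label the interesting ones, using the line above:)

```shell
# Label selected fields of /sys/block/mmcblk0/stat (kernel format:
# read I/Os, read merges, sectors read, read ticks ms, write I/Os,
# write merges, sectors written, write ticks ms, in_flight, ...)
echo "5862 3056 401841 23997 1382 1558 29345 12160 0 13596 36158 0 0 0 0 0 0" |
awk '{ printf "read_ios=%s read_ms=%s write_ios=%s write_ms=%s in_flight=%s\n",
       $1, $4, $5, $8, $9 }'
# -> read_ios=5862 read_ms=23997 write_ios=1382 write_ms=12160 in_flight=0
```

Nothing there screams a dying card to me.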

printer@trident:~ $ lsusb -t
/:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=dwc_otg/1p, 480M
    |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/5p, 480M
        |__ Port 001: Dev 003, If 0, Class=Vendor Specific Class, Driver=smsc95xx, 480M
        |__ Port 003: Dev 007, If 0, Class=Vendor Specific Class, Driver=gs_usb, 12M

printer@trident:~ $ sudo cat /sys/kernel/debug/usb/devices | grep -A10 "Vendor=1d50"
P:  Vendor=1d50 ProdID=606f Rev= 0.00
S:  Manufacturer=Klipper
S:  Product=stm32f446xx
S:  SerialNumber=3C0053000651313133353932
C:* #Ifs= 1 Cfg#= 1 Atr=c0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=gs_usb
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=81(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
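
(Side note on the gs_usb device enumerating at 12M with MxPS=64 above: a rough full-speed USB bulk throughput ceiling, assuming the USB 2.0 theoretical maximum of 19 bulk transactions per 1 ms frame, is still well above the 1M CAN bitrate, so raw USB bandwidth shouldn’t be the bottleneck:)

```shell
# Rough ceiling for USB full-speed (12 Mbit/s) bulk transfers
# (assumption: at most 19 x 64-byte bulk packets per 1 ms frame)
bytes_per_frame=$(( 19 * 64 ))                 # 1216 bytes per ms
ceiling_bps=$(( bytes_per_frame * 1000 * 8 ))  # 9728000 bit/s theoretical
can_bps=1000000
echo "usb_bulk_ceiling=${ceiling_bps} can_bitrate=${can_bps}"
```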

I attempted to run Klipper from the OrangePi, but I remembered why I put it away (a faulty Ethernet port triggering errors, rendering it unusable without further repair), so I was unsuccessful. I did however try a different, new V90 SD card and a clean new RPi OS Lite install as well as Armbian (both minimal, headless). I changed the extruder heater block and added a bigger + smaller capacitor for filtering to the EBB36’s power line (similar to this thread: New Voron instance lost communication with EBB (can bus) - #20 by regressor). Nothing has worked so far.

At this point I’m at my wits’ end and I need this printer running again soon-ish, so I am considering buying the BTT Manta M8P V2 with the RPi CM5 (4GB RAM, 32GB eMMC storage), replacing this setup and using it on another printer not running CAN (since that ran fine before).

You mentioned you’re also running CANbus on your printer. I don’t want to derail the thread, but what setup would you recommend from your own experience?

klippy (6).log (2.4 MB)
dmesg.log (35.1 KB)
tc.log (259.7 KB)


Yes. Alas, the issue looks the same.
My only guess is that the issue is somewhere above the main MCU CAN control.

In this case, you could try:
sudo rpi-update
if you didn’t update the firmware before.
(If there is an issue around USB in the firmware, it is probably fixed there.)

Or maybe you could get lucky with SLCAN: add SLCAN by findlayfeng · Pull Request #6912 · Klipper3d/klipper · GitHub
It seems to be developed and tested with STM32F1, so it has a high probability of working on STM32F4.

git fetch origin pull/6912/head
git cherry-pick d86f28d2ed

(If it does not work with USB, we can probably hack around and bridge CANBUS over UART).

I prefer USB, because it simply works for me.
I currently use RPi5 + Octopus Pro (H723) (USB2Serial) + EBB42 (USB2CAN) + carto board (CAN).
I use the EBB42 as the CAN bridge for the carto board to simplify wiring and free one USB port.
I haven’t even enabled the second termination resistor on the carto board.
So, to sum up, I generally use USB by default, and would resort to other solutions only in rare cases.

I’m running the latest firmware everywhere.
I will try CAN over serial, and if that doesn’t help I’ll try using the EBB36 over USB as per this gist (MISC/PCBs/EBB_VIA_USB at main · hartk1213/MISC · GitHub). If all else fails, I will buy a new MCU and RPi.

If I find a solution, I will update the thread. Thank you for your valuable help and time.

Didn’t you say the termination resistor is on the EBB36? Is the above how things are connected? Termination should be on the first and last (physical) devices.

I also looked through the logs. The logs seem to indicate that normally the connection between host and Octopus board has a round-trip-time of a few milliseconds, but then some kind of event occurs and the round-trip-time increases to around 35+ms. As a result, communications between the two devices becomes severely constrained and within a few seconds the system correctly transitions to a shutdown state.

I don’t see anything in the logs that would indicate a hardware error on the canbus links. The error seems to be something on the host, Octopus board, or the USB link between them.

I’ve not seen any similar error reports like the above. I don’t have any good ideas what could cause problems like you are seeing.

I noticed you were running Klipper v0.13.0-375-gba79d72f (with some modifications). Did this issue occur in older versions of the software? That commit (ba79d72f) did contain software changes to how the host communicates with the mcu. It seems unlikely it is related though.

-Kevin

Small update: tried using the EBB36 over USB, still got the same errors. Also tried to get CAN over serial running, but I couldn’t get it to work properly. The new Manta board and the CM5 arrived in the mail; I’ll try the new setup as soon as I have time.

I connected everything up as per the official documentation I could find, mainly the BTT wiki and the BTT GitHub: https://bttwiki.com/Eddy.html#eddy-duo-ebb36. There is no mention of a termination resistor on the Eddy, so I assumed it uses the one on the EBB, as it connects to the same CANH/L line the jumper is on.

If you’re curious and want to debug this further, I’m happy to help/test in whatever way I can when I get the time to. But I don’t have the capacity to debug further than this by myself, so I’m going to be moving to a new MCU/SBC.

I don’t know, as I only used the v0.13.0-375-gba79d72f version at the time of setting up the CANbus. But prior to moving to CANbus, no version of Klipper (since 2022 when this printer was built) threw a TTC error my way.