Help fixing "Timer too close" shutdown

Basic Information:

Printer Model: Voron 2.4
MCU / Printerboard: Duet 3 Mini 5+
Host / SBC Pi 4
klippy.log

For months, I’ve been trying to fix an “MCU shutdown: Timer too close” error.

In the klippy.log I’ve noticed this at the end:

Receive: 84 12288.336200 12288.333920 14: seq: 19, analog_in_state oid=21 next_clock=2119417472 value=114
Receive: 85 12288.346191 12288.342983 15: seq: 1a, analog_in_state oid=22 next_clock=2120617472 value=23624
Receive: 86 12288.366193 12288.356345 15: seq: 1c, analog_in_state oid=24 next_clock=2123017472 value=30442
Receive: 87 12288.426180 12288.424125 15: seq: 12, analog_in_state oid=30 next_clock=2130217472 value=15218
Receive: 88 12288.626221 0.000000 15: seq: 12, analog_in_state oid=20 next_clock=2154217472 value=32342
Receive: 89 12288.636216 0.000000 14: seq: 12, analog_in_state oid=21 next_clock=2155417472 value=121
Receive: 90 12288.646216 0.000000 15: seq: 12, analog_in_state oid=22 next_clock=2156617472 value=23627
Receive: 91 12288.666221 0.000000 15: seq: 12, analog_in_state oid=24 next_clock=2159017472 value=30443
Receive: 92 12288.689510 0.000000 21: seq: 12, counter_state oid=4 next_clock=2126917472 count=2489295 count_clock=2126737472
Receive: 93 12288.726223 0.000000 15: seq: 12, analog_in_state oid=30 next_clock=2166217472 value=15223
Receive: 94 12288.926242 0.000000 15: seq: 12, analog_in_state oid=20 next_clock=2190217472 value=32344
Receive: 95 12288.936226 0.000000 14: seq: 12, analog_in_state oid=21 next_clock=2191417472 value=116
Receive: 96 12288.946260 0.000000 15: seq: 12, analog_in_state oid=22 next_clock=2192617472 value=23624
Receive: 97 12288.966237 0.000000 15: seq: 12, analog_in_state oid=24 next_clock=2195017472 value=30443
Receive: 98 12289.026315 0.000000 15: seq: 12, analog_in_state oid=30 next_clock=2202217472 value=15225
Receive: 99 12289.214211 0.000000 12: seq: 13, shutdown clock=2189677966 static_string_id=Timer too close

I’m guessing this means that from block 88 onwards, the data is being sent but not received, since the receive time is 0.000000 from then on. Is this correct or am I on the wrong track completely?

I didn’t have this problem for the first few months of the printer being configured. I’ve tried rolling back to older Klipper versions and thought they had worked, only for the problem to eventually return.

I’m using 64bit Klipper and I’ve read that some people have had success after switching over to 32bit. I’ve not been able to find any info on how I would do that. So I haven’t tried that yet.

Things I’ve tried so far:

Different shielded USB cables to connect the Pi 4 to the Duet 3 Mini

Using a USB power blocker to only send/receive data.

Re-crimping a potentially bad thermistor.

Checking the temperature of the Pi CPU.

Rolling back to older versions of Klipper.

Switching to different SD cards.

But none of these things have fixed the problem. I still get shutdowns after printing for a couple of hours.

klippy(7).zip (755.3 KB)

1 Like

Hmmm…

[mcu]
serial = /dev/serial/by-id/usb-Klipper_same54p20a_491E16B5484633532020204E393C18FF-if00
...
Stats 12288.1: gcodein=0  mcu: mcu_awake=0.023 mcu_task_avg=0.000010 mcu_task_stddev=0.000012 bytes_write=57665277 bytes_read=11779372 bytes_retransmit=763808 bytes_invalid=0 send_seq=1098495 receive_seq=1098495 retransmit_seq=1098443 srtt=0.000 rttvar=0.000 rto=0.025 ready_bytes=15 upcoming_bytes=4256 freq=119998931 sd_pos=19562992 heater_bed: target=105 temp=105.0 pwm=0.287 Chamber: temp=51.4 sysload=0.13 cputime=1336.079 memavail=548900 print_time=12295.799 buffer_time=2.184 print_stall=0 extruder: target=250 temp=250.3 pwm=0.000
Stats 12289.1: gcodein=0  mcu: mcu_awake=0.023 mcu_task_avg=0.000010 mcu_task_stddev=0.000012 bytes_write=57667933 bytes_read=11779973 bytes_retransmit=766259 bytes_invalid=0 send_seq=1098542 receive_seq=1098530 retransmit_seq=1098542 srtt=0.000 rttvar=0.000 rto=0.400 ready_bytes=9 upcoming_bytes=9130 freq=119998930 sd_pos=19566232 heater_bed: target=105 temp=105.0 pwm=0.287 Chamber: temp=51.3 sysload=0.13 cputime=1336.200 memavail=553656 print_time=12296.986 buffer_time=2.370 print_stall=0 extruder: target=250 temp=250.2 pwm=0.175
...
MCU 'mcu' shutdown: Timer too close
...
Dumping serial stats: bytes_write=57668297 bytes_read=11780241 bytes_retransmit=766962 bytes_invalid=0 send_seq=1098548 receive_seq=1098536 retransmit_seq=1098542 srtt=0.000 rttvar=0.000 rto=0.800 ready_bytes=2957 upcoming_bytes=5849
....

Seems like you suddenly have high bytes_retransmit, which leads to high rto (retry time) and as a consequence late delivery of something (PWM, Steps & etc).

:plus:
More or less true, it is possible under normal circumstances, but in your specific case, it seems like that.

This has nothing to do with Klipper. This is a sort of workaround for CAN network stuck issues.
For some time in the past. Now, for CAN, just use kernel 6.6+.

??? :smiley:
Ah, well, there is a jumper, and it seems like to use 5V from USB you should specifically select that.
So, normally it would have zero effect and we want a good GND connection, which generally could be provided by USB itself (between SBC and MCU).


As you can see from the logs it is a communication problem, so I hope it is obvious, that SD card, temperature & etc has nothing to do with that.

I’m not familiar with Duet/atsam/same70 MCUs. Seems like there are not a lot of specific changes since your last MCU flash. Generally, TTC has nothing to do with MCU firmware, and there is a problem on the host side.

So, I can only suggest looking at dmesg, maybe there are some messages.
Or, maybe there is a USB webcam, which can probably affect communication.

Just my guesses.

Thank you for your reply.

I’ve checked dmesg and found this at the end:

Control rx/tx
[ 20.004447] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 21.767142] ICMPv6: process `dhcpcd’ is using deprecated sysctl (syscall) net .ipv6.neigh.eth0.retrans_time - use net.ipv6.neigh.eth0.retrans_time_ms instead

Does this seem like it might be relevant? Or is this only relevant to ethernet?

Only to the Ethernet, and it seems harmless anyway.
On RPI4, unlike it predecessor, there are dedicated hardware for the Ethernet.
So, I doubt it would affect USB.

1 Like

I see what you mean with bytes_retransmit. Through the whole print there seems to be a few bytes dropped from 0-200ish. The increase in the last 3 retransmits are 749, 181 and then the final one is 2451 bytes.

Are the small amounts of retransmits throughout the print normal? As you can see over about 2 hours 763808 bytes were retransmitted.

It seems like that final loss of 2451 bytes is what triggered the shutdown.

Can this only be a problem with the USB connection? I don’t think I’m using any CAN connections. The only extra part I have installed is a Big Tree Tech filament movement sensor and switch. But that is currently disabled in software.

I installed the USB power decoupler in my desperation to fix this problem since I read somewhere that there can be issues with a ground loop. The Pi 4 is run from a 5v PSU and the Duet from a 24v PSU. I might try removing it to see if it changes the retransmit rate.

FWIW,
I can suggest you blindly update the kernel, assuming there is a host software issue.

About bytes_retransmit, it is pretty intuitive; a constant or zero value is good.
Occasional increase in several bytes is okay.
Continuous/every-second increase is not okay.

From your logs, it seems like there is a spike every 3-5 seconds, which is odd.

Ground loop means there is a loop.
USB power decoupled should not and does not decouple the common ground.

Like if you connect RPI by 2 USB cables to the motherboard, then there are 2 ground paths (GND), and it is a ground loop, which can (not will, can) cause issues.

Like for example, I power everything from 24v PSU, with help of 24v->5v step down converter, I power the RPI.
So, my RPI, has at least 2 paths to the MCU: USB and through the PSU GND, this is a loop.

1 Like

What do you mean by blindly update the kernel? Do you just mean using the update tool built into the web interface? Or do I need to use PuTTY or something else?

I suspect the retransmit spikes are EM interference from the nearby wires. I did change the USB cable for one which was sold as sheilded. It’s not very easy finding a good sheilded micro USB cable, since all the high speed stuff (that I can find) is USB C. I might try using a high speed USB C cable with an adapter, but that feels like it’s adding another point of failure.

Either that or it’s due to movement when printing. It does shake around quite a lot on the table it’s set up on.

Is there any way to run a comms test so I can check the retrasmit rate or do I have to just print something? It would be useful to be able to test without having the motors powered and the printer still to get an idea of what’s causing the drops.

~/klippy-env/bin/python3 ./klippy/console.py /dev/serial/by-id/usb-Klipper_...

You could try, but i don’t know if it helps with understanding here.

1 Like

I have rerouted the USB cable to try to keep it further away from some of the other cables like the mains voltage cable. I’ve been running a 0% extrusion 0°C test print of the same file. Previously it would always get to a point after 20% or so where it would start to drop packets and it would continuously drop packets until the machine was restarted. Since I’ve moved the USB cable. I’ve made it from beginning to the end of the almost 3 hour print with only 9 bytes lost.

Stats 10161.5: gcodein=0 mcu: mcu_awake=0.016 mcu_task_avg=0.000010 mcu_task_stddev=0.000012 bytes_write=42021593 bytes_read=9008160 bytes_retransmit=9 bytes_invalid=0 send_seq=805257 receive_seq=805257 retransmit_seq=2 srtt=0.000 rttvar=0.000 rto=0.025 ready_bytes=7 upcoming_bytes=1068 freq=119998854 sd_pos=18943293 heater_bed: target=105 temp=105.0 pwm=0.255 Chamber: temp=49.8 sysload=0.02 cputime=1063.362 memavail=546236 print_time=10169.360 buffer_time=2.343 print_stall=0 extruder: target=0 temp=43.1 pwm=0.000

To improve things in the long run I’m going to experiment with some ferrite sleeves.

Should I install ferrites onto the USB cable or the power and other nearby cables that might be causing the interference? Or both?

Edit: And as soon as I tried to run a real print I get the error again and the retransmit rate is as bad as ever. :sob:

I know you said you used SHIELDED wire, but did you connect the shielding (3rd wire) to a ground source? lots of folks miss the shielding wire and just go for the two - positive and negative wire connections. Reference: Grounding, Part 11 - Grounding of Shielded Wire & Cable - RSP Supply

1 Like

Hi Joe,

Thanks for your reply! It is a shielded USB cable so I would assume the shielding will be connected to the ground on the USB plug which would in turn be connected the the ground on the Duet 5 Mini.

Next time I open it up, I’ll check for continuity on the ground. Thanks for the reminder!

In the mean time I have a new USB cable and some ferrite sleeves to install. Althought I’m still not sure if I should install them on the USB cable or the power cables or both?

Seeing an incrementing bytes_retransmit when using USB is unusual. (Typically, with USB any line errors are retransmitted at the hardware level and the Klippher host software is not aware of it.)

For what it is worth, your symptoms seem more like voltage instability or overheating. You might want to quickly check some of the items described at: Timeout with MCU / Lost communication with MCU

Some other suggestions:

  • Try updating to the latest code and reflash the micro-controller with the latest code.
  • See if you can add a [temperature_generic host] config section with a temperature_host sensor ( Configuration reference - Klipper documentation ).
  • Add a [temperature_generic my_mcu] config section with a temperature_mcu sensor ( Configuration reference - Klipper documentation ).
  • Monitor the above temperatures during prints to see if there is a correlation between errors and temperatures.

-Kevin

1 Like

After changing the USB cable again and adding ferrite sleeves to the USB cable and several other nearby cables I’ve managed to complete one cold, no extrusion print then managed to complete a real 3 hour print without any retransmit issues. It might be coincidence, but I’m keeping my fingers crossed that it’s not!

I’ve kept at eye on the temperature shown under the “Machine” tab in Mainsail. It never goes above 50°C. I’m assuming this is the RPi CPU. I’ve been using the Duet 3 Mini for years and it’s never come close to overheating. I do have 2 80mm fans pointing right over both boards.

I have wondered about the 5v and 24v voltages though. On the Duet firmware I was able to monitor the voltages, but I’ve not found a way to do that on Mainsail/Klipper. I did once try turning off all the LEDs, which are also run from the same 5v PSU but that still would get the same error.

If this problem reappears, I will keep your ideas in mind.

Thanks to all that have helped!