Timer too close, multiple fans

Basic Information:

Printer Model: Climber 8A
MCU / Printerboard: STM32F407
Host / SBC: BTT Pi v1.2, BTT Pi2

klippy4.log (755.3 KB)

Describe your issue:

Long story short, I get this error, “timer too close”. I’ve already read this thread Timer too close
I have already excluded almost everything listed there. Points about

  • USB, UART or CANBUS wiring faults leading to extremely delayed messages
  • Electro-Magnetic Interference (EMI) affecting proper signal transmission or leading to high resend rates

are also unlikely, bytes_retransmit is low in logs. I tried both via USB and UART - exactly the same behavior. Graphs show no evidence for system being overloaded, no other USB devices are connected, etc. Even the host itself is excluded, I’ve tried two different SBCs, one of which I’ve had for a long time working with another printer without the slightest problem - here the problem shows up on both. It shows up on prints with low layer heights and lots of movement. The problem is reproduced perfectly on 0.08 thickness with fuzzy skin. It looks exactly the same: host buffer spikes to 20+%, followed by a bandwidth spike to 250% or more, after which the print stops with a timer too close error. But it doesn’t make any sense. The graphs show that this behavior occurred several times before the error, but did not necessarily lead to it. EM interference is also unlikely, the cable is outside, away from high-current wires, and the error both via USB and via UART occurs at the same approximate time and with exactly the same pattern on the graphs. The problem is clearly in the board itself or in the firmware, but I can not understand what could be wrong there. The board itself is proprietary from XTLW, based on STM32F407. Any idea what could i check and what should i look for?
The only very weird thing is: Like, i’m seeing mcu bandwidth way over 100% through it is connected through UART and baudrate set to default 250000
Then how bandwidth can be above 100% over uart?
Graph scales 250000 to 100% soo…

At the moment I’m working through the theory that it’s a clocking issue. The quartz used is 8 MHz (at least the label on its package says so).
Looking at the code:

I see
RCC->PLLCFGR = 0x24003010;
Meaning PLLM is 16, so it’s output is 0.5 MHz, but according to reference manual, that’s an illegal value for 8 MHz HSE input:

The software has to set these bits correctly to ensure that the VCO input frequency ranges from 1 to 2 MHz.

I don’t know how exactly make menuconfig generates code, but I don’t see any places where the PLLCFGR register would change. Could this be the cause of the problem?

Try rolling the firmware back to commit 81de9a861 and see if that makes a difference.

No luck, still same
klippy05102024_01.log (1009.1 KB)
It still looks totally weird that the https://sineos.github.io/ shows the MCU bandwidth higher than 100%

You using USB so, actually, there is “no limit” on bandwidth.
Here are the topic with some tests and explanations:

Not really, i’m just using USB-UART bridge instead of on chip UART for testing, but the MCU itself uses UART
изображение

Here’s also klippy.log for this problem when connection is over USB.
klippy.log (939.6 KB)
It’s exactly the same pattern

Also, could stepper motor drivers be causing this? Forgot to mention, that board uses gc6609 stepper drivers. They are not supported by klipper, but appear to be tmc2209 clone, so they are working with such config

It is really looking like Klipper just try to send to much small commands to MCU.
It just feels wrong:

Maybe the resolution in the slicer is too high.

It is hard to say what exactly arrives to late to MCU, what trigger this error. Maybe the PWM update just arrived too late.

Yes, g-code resolution was set to 0.008 in slicer. Reduced it to default 0.0125, however issue is still present, roughly at the same time with mostly same looking graphs.
klippy05102024_03.log (472.5 KB)


Looks exactly the same.

You can also just slow down your print, that should also help.

That’s a very normal situation with fuzzy skin.I have absolutely printed with similar settings on another printer (Btt pi v1.2 + btt skr mini e3 v3) and with speeds and accelerations much higher than now. And never once encountered this error. Why is it suddenly occurring here?

Issue seems to be resolved by
sudo renice -7 -g (klippy PID)
and

[mcu]
 baud: 750000

I don’t know which one of these worked specifically. Don’t close the thread yet, I’ll gather stats to see if the problem returns on longer prints

klippy06102024_01.zip (2.2 MB)
Can we at least figure out if our host is slow for some reason with sending, or is it the MCU clock suddenly flying forward? The host can’t be the cause, I’ve had weaker H616s process far more detailed gcode than this without the slightest crashes, including with crowsnest and webcam. Maybe build the firmware in debug somehow? Monitor something in it?

Seems to be related to recent changes in fan.py. Reverted to version before theese changes, now problem is gone.
Recent commits say this problem has been fixed, but apparently not. Maybe it’s because I have 5 fans instead of the defaultish 3 fans, and it wasn’t tested with such conifguration

Please be more specific: From which commit to which commit did you revert?

If possible please provide a log that:

  • Uses latest current git
  • Contains no other modifications, i.e. not marked as dirty
  • Issues a M112 immediately after the event
  1. v0.12.0-316-g96cceed2 - current master. TTC occurs after ~10 minutes in print
  2. v0.12.0-291-g293858c5 - suggested in thread mentioned above. TTC still occurs, but now after 3 minutes, same gcode.
  3. v0.12.0-0-g0d67d9c4 - latest release, ttc after ~3 minutes
    ttc testing.zip (4.6 MB)

This is getting downright weirder. Timer Too Close now appears in all cases, but at slightly different times. However it didn’t happend on g293858c5 yesterday, i tried several times.
In none of them is there any suspicion of too much load on the system, or on the disk. I ran sar -u during print, it did not show a large load anywhere either, only %iowait goes to 0.52% from 0% at rougly 1 second before i get TTC. Looking at logs themselves, the cause is one single queue_digital_out command sent 100 ms or less too late. Other commands are sent roughly 100 ms ahead of their planned clock. Looking at the prints, it happens not exactly, but at layer change or very shortly after it.
The Gcode causing this problem is attached along with the logs, print.

Yes, this gcode is a bit extreme and more like a benchmark, but I have quite detailed prints where I catch this error in real world conditions after more than 6 hours of printing. Also, XYZ calibration cube sliced with these same settings does not result in a TTC error, which is confusing

I looked through the logs you posted. All of them seem to indicate classic host overloading problems. Specifically, the four shutdowns I traced all indicate that the host tried to schedule an update to a heater (bed or extruder), but failed to do so in time. The host has 300ms to respond to a temperature update from the mcu, and the shutdown resulted after the host failed to respond within that time limit.

I don’t see anything in the logs indicating a problem with fans. I don’t see anything in the logs indicating a problem with communications. I don’t see anything in the logs indicating a problem with the mcu.

The “timer too close” knowledge base thread (that you cited above) has common reasons for this problem. But, you might want to double check that your host cpu is powerful enough, that there aren’t background threads (such as webcam processing) stealing cpu cycles, that the sdcard (or equivalent) isn’t having issues, and similar.

Cheers,
-Kevin

The load graphs definitely show an issue - ./scripts/graphstats.py klippy_g96c.log -s -o sys.png :

The red spikes indicate the cpu usage of the klipper process itself, and it is not normal for Klipper to use ~ %75 of a core. (Typically klipper stays well under %20 of a core.)

Possible reasons are crazy slicer output (that is swamping Klipper), and/or it is possible the host cpu is underpowered.

-Kevin

Thanks for looking into my logs and clarifying what’s happening.
CPU is powerfull enough and it is not overheating. The problem turned out to be macro M106. With cooling turned off no errors appeared during test prints. I copied someone else’s code for IDEX without much attention to its code. After replacing it with a simpler macro from here, the problem disappeared.

I’m a little surprised I’m the first one to encounter such issues. There is no official example of M106 configuration for multi-extruder printers, so everyone writes their own macros and, as it turns out, some of them are too stressful, especially in combination with adaptive cooling, when the slicer issues a lot of these commands in a short period of time.
I suggest adding computationally expensive macros with lots of conditions and variable accesses to the list of potential causes for this “Timer too close” error