/extras/fan.py adds 100ms between fan commands if they’re too close together, which in extreme cases can lead to fan commands being several hours out of sync with move commands, and the fan cycling up and down at 100ms intervals well after the print has ended. This impacts print quality because cooling doesn’t happen where expected.
Timer too close errors
The much more concerning issue is that this leads to some sort of clock error (rollover/race condition?) eventually causing a “Timer too close” error on the EBB, killing the print.
There seems to be some randomness to it, with failures taking 12-18 hours in real-world prints, though always in a region of dense moves with fan changes.
I’ve attached a synthetic gcode file that always fails instantly for me, I’m wondering if anyone else can reproduce the issue.
Workarounds
Obviously, getting the slicer to calm down with the fan commands will mask the problem.
Setting FAN_MIN_TIME = 0.0 doesn’t seem to fix the “Timer too close” error, but it does resolve the fan scheduling weirdness.
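To illustrate the drift mechanism, here’s a toy model of how a per-command minimum spacing accumulates lag. The name FAN_MIN_TIME mirrors the constant in klippy/extras/fan.py, but the simulation itself is illustrative and not Klipper’s actual scheduling code:

```python
# Toy model: each fan command is pushed to at least FAN_MIN_TIME after
# the previous one, so a burst of commands arriving faster than that
# spacing accumulates unbounded lag behind the moves they belong to.
FAN_MIN_TIME = 0.100  # seconds; mirrors the constant in fan.py

def schedule_fan_commands(request_times, min_time=FAN_MIN_TIME):
    """Return the times the fan updates would actually be applied."""
    applied = []
    last = -min_time
    for t in request_times:
        apply_at = max(t, last + min_time)
        applied.append(apply_at)
        last = apply_at
    return applied

# 1000 fan commands requested 10ms apart (slicer churn):
requests = [i * 0.010 for i in range(1000)]
applied = schedule_fan_commands(requests)
lag = applied[-1] - requests[-1]  # ~90 seconds of drift after ~10s of gcode
```

With 10ms request spacing, the last update lands roughly 90 seconds after the move it belongs to; over many hours of dense fan churn this is how the hours-long desync above builds up.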
That’s a bit of an oversimplification.
Let’s first discuss whether it’s a good idea for any slicer to be changing the fan speed this often (it’s not hard to imagine that this could improve print quality).
Even if it’s decided that this slicer behaviour is unneeded or incorrect, this test case is still a good opportunity to improve the current code.
@mykepredko Dang I was shooting for a perfect score! Right now I can reproduce this in PrusaSlicer and PantheonSlicer, haven’t tried with OrcaSlicer. Profiles are here.
@Sineos Thanks for pointing me to that issue. Seems like it could be the same thing.
@Mard0 I agree this fan cycling has no benefit to the parts, it’s probably making support interfaces worse.
I totally agree that the slicer is the immediate fix.
My bigger concern is that this timer issue kills prints randomly, and isn’t one of the usual host/hardware-related causes. It could be behind a lot of the mystery “Timer too close” failures that pop up.
If there is an underlying issue in serialqueue or similar I’d definitely think we should address it.
There are surely a lot of areas where code can be improved. Ultimately, it comes down to the question of whether to spend developer resources to mitigate questionable gcode.
Looking at an offending gcode file, you will find a lot of areas that look like:
I seriously doubt that this has any positive effect on print quality. On the contrary, it seems rather pointless, and IMO the slicer is simply not doing its job: simplify an STL into gcode commands that have a practical meaning. Instead, it just pumps out commands that stress the entire system, including the fan hardware.
Edit:
Just to translate this snippet: for a printing move of 0.017 mm, the fan speed is changed six times.
The gcode file is from the above-mentioned GH issue and sliced with latest Orca Slicer.
To me, it looks like there is a conflict between the minimum (or maximum) fan_speed and the “overhang” fan_speed, which I’m guessing is the fan speed when you need support (ie for the “overhang”).
I don’t know PantheonSlicer, but it seems to me that this is where you should be looking: I think you have parameters that are in conflict, along with a software bug that adds the gcode for the default fan speed before changing to the required fan speed for the overhang/support.
IMHO the nozzle moves faster than the fan can accelerate to the higher RPM…
This is a nice feature but it is quite useless for such tiny segments.
It does not make sense to work on that just to say it is resolved.
This is exactly the point:
Typically, the slicer’s job is to “optimize” / simplify the gcode in a way that the hardware can follow the changes. It is comparable to gcode at a resolution of 0.0001 mm: just a lot of traffic that will make the command queue collapse, without any benefit, as no hardware can follow such segments.
I just saw your shout-out to Kevin on this issue - I’m not sure it’s appropriate at this time, as there’s still information required about the printer and the setup that @Alex-Pantheon is running.
I think we’re in agreement that PantheonSlicer is providing gcode which is sub-optimal, and it seems like that is the case in a number of areas. However, I don’t think @Alex-Pantheon is doing anything unusual in terms of using the printer, and when I look at the Pantheon printer web page, a couple of things activate my Spidey-Sense:
What I found interesting on the page is:
We use a fork of SuperSlicer/PrusaSlic3r that include tuned profiles for our materials in a range of print settings.
The printer is also advertised to use Klipper, but the company doesn’t say whether or not it is also forked. My questions back to @Alex-Pantheon: what modifications have you made to the Klipper that was installed in the printer; are you using the base Klipper that came with the printer? Have you made any other changes to the software that you are using?
I suspect this is a case where the slicer and the firmware have been tweaked to work together and accommodate each other’s quirks but @Alex-Pantheon has changed something which doesn’t include the accommodations. To be fair, maybe @Alex-Pantheon hasn’t changed anything but has updated something in the software chain that wasn’t properly checked out by Pantheon.
There needs to be a bit more understanding of the system before bringing in development.
EDIT:
Sorry, one more thing: @Alex-Pantheon have you contacted Pantheon about this issue?
@mykepredko lol I am Pantheon, I’m asking for input on an issue a bunch of our customers have run into.
Good catch on the profile stuff, I’m working on finding a permutation of settings that reduces the fan weirdness.
We’re running stock Klipper, I’ll let marketing know they should be more clear on that. At this point the slicer behavior is unchanged aside from where profile updates get pulled from, and some UI stuff.
The fan weirdness definitely exists upstream in PrusaSlicer, and apparently in OrcaSlicer as noted by @Sineos in the linked GitHub issue.
@mykepredko Definitely will work on the slicer side.
I’ve brought this issue up because I think this slicer behavior has exposed an underlying issue in multi-MCU timing/scheduling in Klippy, which I think affects all users’ reliability.
As far as developer resources go, I’m happy to contribute to fixing the issue with our team. I’m here to reach out to the community and see if anyone else has experienced this and start talking about how to fix it.
This is intentional. It is due to the way Klipper schedules events on the micro-controller. Fan updates (and pwm pins in general) are not currently queued on the micro-controller. So, in order to ensure each pin update has time to be sent from the host to the micro-controller, the code needs to ensure each pin update has a minimum application time.
It’s possible to rework the code to “collate the fan updates”, but it would be a lot of work and there would be no real-world gain from doing that work. Real-world fans don’t change speed in less than 100ms, so sending many fast fan updates doesn’t provide any value.
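Conceptually, the “collate the fan updates” idea could mean keeping only the most recent requested speed per minimum-time window. This is a sketch of the concept only, not a proposal against Klipper’s actual scheduling code; the function name and window value are illustrative:

```python
def collate_updates(updates, window=0.100):
    """Collapse (time, speed) requests so at most one update is emitted
    per `window` seconds, keeping the most recent requested speed.
    `updates` must be sorted by request time."""
    collated = []
    for t, speed in updates:
        if collated and t - collated[-1][0] < window:
            # Overwrite the pending update instead of queueing a new one
            collated[-1] = (collated[-1][0], speed)
        else:
            collated.append((t, speed))
    return collated

burst = [(0.00, 0.8), (0.02, 0.5), (0.05, 1.0), (0.20, 0.3)]
result = collate_updates(burst)  # -> [(0.0, 1.0), (0.2, 0.3)]
```

A burst of requests inside one window collapses to a single update carrying the last requested speed, which is all a physical fan could follow anyway.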
Another possibility would be to change the code so that it raises an error if fan updates become wildly out of sync with the toolhead movement (eg, 30+ seconds out of sync). That’s probably a good “sanity check” and would provide better debugging information to users, but I fear users may not like getting the resulting errors. I’ll have to think about that.
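The sanity check described above could be as simple as comparing the scheduled pin time against the toolhead’s current print time. Everything here is illustrative (the names, the 30-second threshold, and the warn-vs-raise split are assumptions, not Klipper code):

```python
MAX_FAN_DRIFT = 30.0  # seconds; illustrative threshold from the discussion

class FanDriftError(Exception):
    """Raised when a fan update drifts too far from toolhead movement."""

def check_fan_drift(scheduled_time, toolhead_print_time,
                    max_drift=MAX_FAN_DRIFT, strict=False):
    """Return a warning message (or raise, if strict) when a fan update
    is scheduled more than max_drift seconds ahead of the toolhead."""
    drift = scheduled_time - toolhead_print_time
    if drift > max_drift:
        msg = "Fan update scheduled %.1fs ahead of toolhead" % drift
        if strict:
            raise FanDriftError(msg)
        return msg  # caller could log this as a warning instead of aborting
    return None
```

The strict flag captures the tradeoff raised here: a warning gives users debugging information without killing the print, while an error makes the failure mode explicit.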
Your log indicates that the host CPU got overloaded:
Your log is missing the version and cpu information (I’m not sure why), so I don’t have any guesses as to why the host got overloaded. It could be the result of many small commands that don’t consume much “toolhead movement time”. The log doesn’t seem to indicate anything wrong with the fan logic per se (other than the possibility that processing all those fan commands overloaded the host).
Separate from this log, I have seen cases where scheduling of fan commands way in the future (due to mandatory fan delays) could potentially cause timing issues because the host/mcu time synchronization code is not intended to schedule events a long time into the future. However, the log does not indicate that occurred here (the log indicates the host cpu got overburdened).
Finally, I’d recommend you update the slicer to avoid sending the command sequences causing this issue. Ultimately, if the slicer is sending crazy commands to Klipper then Klipper is going to struggle to get good results - the best fix is likely to get a sane command stream into Klipper.
I would be hesitant to agree with a hypothesis that states there could be any issue with multi-MCU operations based on what’s happening here.
Personally, I have never experienced a “Timer too close” error, but based on what I have seen here on this list server, it indicates a problem in the system that doesn’t involve Klipper. It’s usually an overloaded host system (often one with less computational power than a Raspberry Pi 3+), serial communications (USB, NRZ serial, CAN, etc.) that don’t have the necessary bandwidth, or accessories added to the host (a webcam being a big culprit).
This pattern of sending many fan and other gcode commands seems to be a new error source and needs to be ruled out before looking to see if there are any issues in Klipper itself.
I would second this as I had similar behaviors where I was wondering why the fan was so late on changing.
But I had the other way round as well that the fan already changed RPM before it should.
Yeah, it was pretty clear to me that was the intention. I totally agree with that choice.
I’d want a warning maybe, but no shutdown. Killing prints sucks, especially when it’s caused by a recoverable problem that isn’t a safety issue.
The version info is my error; I trimmed the log down to start at machine reboot but evidently overshot. This test was run on master at commit d725dfd on the host and both MCUs; I’ll grab the complete log when I have a moment.
Thanks for the design insight @koconnor.
Seems like this file doesn’t actually test the issue. Organic, free-range gcode is the way to go! I’ll keep an eye on that going forward.
Yep, at this point the problem feels like “hand hurts when grabbing thorns, devs pls patch”.
I’ll put my energy into fixing the slicer, thanks for clearing that up!