Describe your issue:
I’m having recurring situations where Klipper goes into shutdown for issues that feel relatively minor or at least recoverable.
Typical examples:
– A damaged temperature sensor immediately triggers shutdown.
– A small extruder fault also results in a shutdown.
– A brief sensor glitch stops the system entirely.
The result is that the print is lost and the machine becomes unusable for any maintenance movement until everything is restarted.
Coming from Duet, I found its behaviour more forgiving — it would pause, let you intervene, and continue safely.
My questions are:
1. Is there any supported way to configure Klipper so it enters a recoverable state (pause, or similar) instead of a full shutdown?
2. Does Kalico implement a different error-handling strategy that might help in these cases?
And additionally:
3. If this behaviour is hardcoded in Klipper, is there any guidance on which parts of the codebase would need to be modified to change this logic for a custom machine?
I don’t mind adjusting the firmware for my setup if that’s the only path — I just want to understand where the shutdown decisions are made and whether they’re configurable at a low level.
Any pointers or documentation would be greatly appreciated.
Thanks
This is a serious fire hazard. Klipper shuts down to turn off the heater in an attempt to prevent a fire. You need to check all your wiring connections and replace your thermistor if necessary.
What do you mean “small extruder fault”? A TMC error? A properly configured and wired setup should not shutdown spontaneously.
Which sensor? If this is a temperature sensor, it’s a fire hazard (see above). If it’s a filament sensor, something isn’t configured right (at worst it should just false-alarm pause the print).
If Klipper is shutting down that much, you should really reassess and fix your hardware, and verify your configuration is ok. If you could provide a klippy.log when these shutdowns happen, that would also help a lot.
I’m not having many errors, since the machine is well built, but it’s true that I’ve sometimes found the printer stopped, and it’s a shame that I couldn’t move it or do anything because it always goes into shutdown mode. I’d really like to be able to decide whether the printer should shutdown or follow a procedure that I consider appropriate for each case. When I worked with Duet, for example, I could control all of this. If a PT1000 failed, that heater was simply disabled, but the printer did not stop entirely.
What I really want to say is that I think shutting down for every single error —whether printing or just operating the machine— is not the best way to handle things. I understand this is an easy way to make Klipper compatible with all printers, since every error would require custom handling, but still… it would be great if experienced users could decide how the printer should react in certain situations, and if not, then let it shutdown.
With that in mind, I’m interested in knowing where I could modify at least some parts of the code to customize this behaviour for my machine, as I don’t mind maintaining these changes myself.
Regarding the risk of burning my machine and similar concerns, I’m honestly not too worried. My machine includes multiple layers of physical safety protections, such as thermal cut-offs, external drivers with their own fault detection, and other independent safety systems.
In other words, I have software-level error handling (whether it’s Klipper, Kalico, Duet, or whatever firmware I end up using on the machine) and a full set of external, hardware-based protection mechanisms. Because of this, I don’t really mind if the software does not enforce such strict error handling as Klipper currently does — I’m already protected by redundancy at the hardware level.
This is an industrial machine. I’ve intentionally put much more emphasis on external and auxiliary safety systems than on software-only protections, which is quite different from most hobby or consumer printers where everything depends entirely on the host or firmware.
That’s why I strongly believe that, under no circumstances, the machine should power down or trigger a Klipper shutdown. Instead, it should always try to pause, recover, or otherwise save the print whenever possible.
You’re not the first person to question Klipper’s approach to error handling.
I do have a question back to you: if your printer is an “industrial machine”, then shouldn’t its reliability be at a level where basic failures (like the ones listed above) are essentially non-existent, which would make how Klipper handles errors immaterial?
In my own case, I have a Voron 2.4, three custom printers and an old Sapphire Pro that I use for experimenting with. While the mechanical hardware is basically what I could buy as cheaply as possible, I do focus on the electronics to ensure I have good quality parts and I make sure my cabling is of the best quality, properly terminated with strain relief.
None of the machines are less than three years old and I haven’t had a hardware (electrical, mechanical or wiring) problem causing a print to fail this year (I honestly don’t think I had any problems last year either other than replacing the Micro Swiss extruders/hotends on my custom machines to ones of my own design). The only maintenance I’ve done this year is to oil linear rails, replace nozzles on a couple of the printers as well as replacing a PEI spring print surface that has worn out. In any case, I have about 4k operating hours on the various printers this year without any of the errors that you list in your OP. Based on experience I’ve had with previous printers, I don’t expect problems until I reach 10k+ hours at which point it’s time for a new printer or rebuild of mechanical parts and maybe upgrade some electronics.
So, with my modest farm, I don’t think about how Klipper handles errors as I haven’t experienced any and don’t expect to except if something has worn out, in which case, I want immediate notification even if it results in a lost print.
Again, my question is: for this machine you’ve put a lot of time and effort into, with the result being a high quality printer, shouldn’t the way Klipper handles errors be immaterial?
It’s not so much about the errors I’ve experienced so far. In fact, most of the issues I’ve seen have been on my demo machines, where it’s true that I’m constantly modifying things, testing hardware, reconnecting components, and so on, so cabling and connections are naturally more exposed and fragile than in a closed, production-ready system.
My point is more about thinking ahead. Even if these kinds of failures should be essentially non-existent in normal operation, and ideally never appear, I still think there should be a more nuanced way to manage them when they do occur. To use an analogy: a car also relies on many subsystems, often communicating over CAN bus, and while under normal operation nothing should ever fail, at some point something eventually will. If a car behaved the same way Klipper does today, a minor fault could leave you stranded in the middle of a highway. Obviously, if the main ECU fails, a full shutdown makes sense. But for secondary or non-critical subsystems, there should be a way to decide whether the system can continue operating in a degraded or safe mode, and how that situation is handled.
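The graded response described in the analogy can be sketched as a small dispatch table: each fault class maps to a severity, and only critical faults force a full shutdown. This is purely illustrative pseudologic, not anything Klipper implements today; every name in it is made up for the sketch.

```python
# Illustrative only: a severity-based fault dispatcher, the "degraded
# mode" idea from the car analogy. None of these names exist in Klipper.
from enum import Enum

class Severity(Enum):
    CRITICAL = "shutdown"   # e.g. main MCU lost: kill heaters and motors
    MAJOR = "pause"         # e.g. heater sensor fault: disable heater, pause
    MINOR = "warn"          # e.g. auxiliary sensor glitch: log and continue

# Hypothetical mapping; a real machine would define its own policy.
FAULT_SEVERITY = {
    "mcu_lost": Severity.CRITICAL,
    "extruder_thermistor_open": Severity.MAJOR,
    "aux_sensor_glitch": Severity.MINOR,
}

def dispatch(fault):
    # Unknown faults fall back to the safest action: full shutdown.
    sev = FAULT_SEVERITY.get(fault, Severity.CRITICAL)
    return sev.value

print(dispatch("aux_sensor_glitch"))        # warn
print(dispatch("extruder_thermistor_open")) # pause
print(dispatch("totally_unknown_fault"))    # shutdown
```

The key design point is the fallback: anything not explicitly classified still gets the strictest treatment, so the dispatcher can only relax behaviour where the machine builder has deliberately opted in.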
In my case Klipper is the base firmware I use for my machines, and I already modify it and extend it for my specific needs, adding extra modules and machine-specific systems. What I’m really interested in is understanding how and where to modify this behavior so that, at least on my own machines, I can implement a more robust and deliberate error-handling strategy.
Naturally, I fully agree that the first priority is always hardware quality, wiring, and proper integration, long before touching error handling in software. My point is simply that, even if hardware reliability is the foundation, adapting Klipper’s error management is still a necessary step at some point in order to finish the machine in line with how I believe an industrial system should behave.
This is not a valid analogy. Neither a car nor any other kind of vehicle is a 3D printer, nor is it an “Industrial Machine”, which is what you identify your 3D printer as.
A better analogy would be a “pick and place” machine used to put components on a PCB before soldering:
Failures of the type you describe result in the machine shutting down and the work in progress in the machine being scrapped or salvaged. When the failure is resolved, things start again from the beginning.
Going back to your original post, you cited:
I don’t think you have thought through what happens in these cases, or the magnitude of work required to enable the printer to return to the point of the error and continue on without issue.
For the first two examples, the repair action will require that hardware be removed from the printer after shutting off power to the printer. A defective thermistor/thermocouple will require shutting down the printer so the hot end or the build plate can be removed. After that’s done, any change in the position of the repaired subsystem will have to be measured and integrated back into Klipper. If the hot end is pulled, then the new Z-Axis offset of the nozzle will have to be measured and, if it’s the build plate, then it will have to be leveled again and the Z-Axis offset of the build plate will have to be measured. Once this is done, you’ll have to restart the printer and then re-home the printer to the exact same conditions it was in before the error occurred - this needs to be done to within a few microns with a partially printed model in place.
I’m not sure what you mean by your third example since, when Klipper (or any other 3D printer software) is running, it’s only monitoring the temperature sensors.
So, before you can consider modifying Klipper, you will need to redesign your printer to allow very precise homing operations with a partial model in place as well as the ability to take out and put back a partial model in precisely the same place as it was during the initial print.
There is no global “ignore errors” setting in Klipper. For shutdowns, it is possible to review the software for each event to determine if some other action is appropriate - you can find all the mcu invoked shutdown events by running `git grep -w shutdown src/`. You should be able to find all shutdown occurrences in the host code with `git grep -Ew 'invoke_shutdown|invoke_async_shutdown' klippy/`.
In addition to the above, sometimes “gcode errors” are confused with “shutdowns” (the former stops a print while the latter also turns off heaters and motors). Reviewing all errors is harder, but you should be able to get an overall list with something like `git grep -w raise klippy/`.
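For orientation, the host-side call sites those greps turn up follow a common shape: a module detects a fault and calls `invoke_shutdown` on the printer object with a message. The sketch below mimics that shape with a stand-in class; only the `invoke_shutdown` name comes from the Klipper source, everything else here is a mock for illustration.

```python
# Mock of the host-side shutdown pattern the greps above locate.
# Not Klipper code: only invoke_shutdown's name is taken from klippy/.
class MockPrinter:
    def __init__(self):
        self.shutdown_msg = None

    def invoke_shutdown(self, msg):
        # In real Klipper this turns off heaters/steppers and latches
        # the shutdown state; here we just record the message.
        self.shutdown_msg = msg

def verify_reading(printer, temp, min_temp, max_temp):
    # Typical guard: an out-of-range reading usually means a broken
    # sensor, so the module requests a shutdown rather than continuing.
    if not (min_temp <= temp <= max_temp):
        printer.invoke_shutdown("Temperature reading out of range")

p = MockPrinter()
verify_reading(p, 412.0, 0.0, 300.0)
print(p.shutdown_msg)  # Temperature reading out of range
```

Each call site found by the greps is one of these decision points, which is why reviewing them individually is the way to judge whether a softer action would be safe for a given machine.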
As mentioned above, we’ve had reports in the past from people that “really didn’t want to get an error”. We definitely also don’t like getting errors on our printers either. There’s some additional information on this at: Frequently Asked Questions - Klipper documentation
One issue I have run into during testing is that shutdowns make it much harder to put the machine in the optimal state for maintenance, and make testing harder than necessary. For example, let’s say that a thermistor broke and I want to move the toolhead to the position that will make replacing it easiest. Now I have to rehome the machine even if the position wasn’t affected by the failure. If we only disabled the heater, I could simply move the toolhead to the optimal position and then shut the printer down to perform the maintenance. The same goes for issues caused by auxiliary sensors failing.
Even if we’re discussing testing here, the same situation happens in normal operation too: a simple maintenance task like replacing a thermistor becomes much harder because once the printer is in shutdown you can’t move the machine to a convenient position.
For now I’m trying to implement this in my own code so the machine behaves the way I want (under my responsibility), but I’d really like to see something like this supported upstream in Klipper. For example, per-heater options such as:
```
[heater ...]
sensor_pin: ...
shutdown_on_error: true/false
on_error_gcode: ...   # gcode to run if shutdown_on_error is false
```
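To make the proposal concrete, here is a rough sketch of the dispatch such options could drive. None of this is real Klipper API; `shutdown_on_error` and `on_error_gcode` are the hypothetical options from the snippet above.

```python
# Hypothetical per-heater error policy matching the proposed config
# options above. Purely a sketch; not Klipper code.
class HeaterErrorPolicy:
    def __init__(self, shutdown_on_error=True, on_error_gcode=None):
        self.shutdown_on_error = shutdown_on_error
        self.on_error_gcode = on_error_gcode

    def handle(self, run_gcode, do_shutdown, msg):
        if self.shutdown_on_error:
            do_shutdown(msg)                # today's behaviour
        elif self.on_error_gcode:
            run_gcode(self.on_error_gcode)  # degrade: e.g. heater off + pause

log = []
policy = HeaterErrorPolicy(
    shutdown_on_error=False,
    on_error_gcode="SET_HEATER_TEMPERATURE HEATER=extruder TARGET=0")
policy.handle(log.append, lambda m: log.append("SHUTDOWN: " + m),
              "thermistor open circuit")
print(log)
```

With the default `shutdown_on_error=True` this degenerates to the current behaviour, so existing configurations would be unaffected and relaxing the policy would be an explicit per-heater opt-in.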
@JuanR3D Regarding your attached klippy.log: are you aware that your EBBCan_1 mcu has the dirty flag? `Loaded MCU 'EBBCan_1' 125 commands (v0.12.0-452-g75a10bfca-dirty-20250306_195337-IDENTITY-003 / gcc: (15:8-2019-q3-1+b1) 8.3.1 20190703 (release) [gcc-8-branch revision 273027] binutils: (2.35.2-2+14+b2) 2.35.2)`
Also this Git version: `v0.13.0-155-ga9290a49-dirty`
There seems to be an inconsistency between the different mcus. Might this be the root of your problem?
Yes, I’m aware of the “dirty” flag. In my setup I have mcu, mcu EBBCan_0 and mcu EBBCan_1.
If I’m not mistaken, EBBCan_1 is a toolboard I moved from another machine some time ago (my original one failed, so I temporarily swapped it in). That other machine was compiled/flashed back then with a different Klipper version, so it makes sense that it’s reporting a different git revision.
Regarding the “dirty” state: that doesn’t surprise me either. At that time my repo organization wasn’t clean; I had some uncommitted changes to Klipper on that machine that were never pushed to my own repo. Right now I’m actively working on moving everything to my own referenced repo so the builds are consistent, and the “dirty” flag should disappear once everything is properly tracked.
That said, I agree that having inconsistent Klipper versions across MCUs can definitely cause weird behavior, so I’ll re-flash EBBCan_1 using the exact same Klipper version/build as the rest of this machine to keep everything aligned.
In any case, in this specific situation I believe the shutdown I’m seeing was caused by a communication failure due to a bad contact/connection. This is a demo machine that I’m constantly working on and reconfiguring every day, so it wouldn’t be surprising that something was slightly loose.
I’ll fix the connection, re-flash the board, and re-test to confirm.
Anyway, thanks a lot for spotting this. I honestly hadn’t noticed these details before, so I really appreciate you pointing them out.