A few days ago I had a USB communication glitch: it seems the Klipper host could not read from the MCU but could still send some commands to it. Klipper detected the communication failure, but it did not trigger the emergency shutdown. On the MCU side, if communication is broken for more than 3 seconds, the MCU is supposed to shut down the printer. However, it seems the Klipper host did not receive the current temperature, concluded that the nozzle was not reaching the target temperature, and kept sending PWM=1.0 to heat the nozzle. Then I smelled smoke and ran into the room, which was full of it. I have attached the log below. My question: if the USB communication is broken, shouldn't the Klipper host shut the printer down? Even if the commands might fail to send, it should at least keep trying and hope for the best, right?
Sorry, the log has rolled over and I cannot find the earlier logs. I later heated the hotend several times to test the thermistor, and even tried pulling it out, and it seemed to be working. I cannot reproduce the issue, but my best guess is either the USB cable or a Linux driver glitch. The logs show that communication was down for hundreds of seconds, and it seems the host could send commands to the MCU but not receive anything back. I checked D+/D-, and the USB cable seems OK. I have no idea what happened during that period, but if the communication heartbeat is one-way or has failed many times, I assume Klipper is supposed to make a best-effort attempt to send shutdown commands, not a PWM 1.0 full-power command, right? I want to understand the expected Klipper behavior first, then think about a solution.
Heater extruder no longer approaching target 250.000
Timeout with MCU 'mcu' (eventtime=230259.453739)
Transition to shutdown state: Lost communication with MCU 'mcu'
Dumping gcode input 0 blocks
Dumping 20 requests for client 4125215568
...
Since the temperatures (bed and extruder) do not change at all, there might be something wrong with the printer board and/or the connection from the host to the MCU.
And if there is something in the way, it smokes.
Or the hotend gets way too hot due to a wrong thermistor_type setting.
Thanks for the help. The thermistor_type is correct, and the machine has been running for a year without a problem. The MKS Monster 8 board was not damaged: I tested the MOSFET, and I can still print. The hotend got really hot, the silicone sock was completely burned, and the bottom of the fan next to the hotend was deformed too.
According to the log, the MCU timed out and lost connection, so Klippy never had a chance to send a shutdown signal. Once this happens, the MCU will typically shut down the heater within 5 seconds. I see in your truncated log that you adjusted the verify_heater settings because your fan was causing false triggers. I would reassess whether you can reduce your check_gain_time closer to the default of 20s. Leaving a heater on uncontrolled for 60s (as you have in your config) is more than enough time to cause smoke in most hotends.
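To illustrate why check_gain_time matters here, below is a rough Python sketch of a heating-gain check. It is a simplified model, not Klipper's actual verify_heater code: the function name and the once-per-second sampling are my own assumptions, and the real module also applies hysteresis and max_error before it shuts anything down. The point is only that, if the reported temperature stops rising, the stall cannot be noticed any sooner than check_gain_time.

```python
def seconds_until_stall_flagged(temps, heating_gain=2.0,
                                check_gain_time=20.0, period=1.0):
    """Return the elapsed time (s) at which a stalled temperature rise is
    first flagged, or None if the readings keep gaining fast enough.

    Hypothetical helper for illustration only; `temps` is a list of
    readings taken every `period` seconds while the heater is on.
    """
    goal_temp = temps[0] + heating_gain   # next temperature we expect to reach
    deadline = check_gain_time            # time by which we expect to reach it
    for i, temp in enumerate(temps):
        now = i * period
        if temp >= goal_temp:
            # Enough gain observed: move the goal up and restart the window.
            goal_temp = temp + heating_gain
            deadline = now + check_gain_time
        elif now >= deadline:
            # No sufficient gain within the window: flag the stall.
            return now
    return None


# Readings frozen at 25 C, e.g. because no new temperature data arrives.
frozen = [25.0] * 100
print(seconds_until_stall_flagged(frozen, check_gain_time=20.0))
print(seconds_until_stall_flagged(frozen, check_gain_time=60.0))
```

With frozen readings, the window length directly sets how long the heater can stay at full power before anything is even flagged, which is why I would keep it as short as your fan allows.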
Thanks for the help. Yes, I will limit the max power to 0.5 and shorten check_gain_time to the default value. I will also make it safer by using a PSU relay to cut power to everything if the USB communication breaks, which should solve this problem completely. I asked this question because I am curious about the Klipper design in this special case: what happens if the communication is broken in one direction only, so the MCU cannot report the latest hotend temperature to the Klipper host, but the Klipper host can still send a PWM 1.0 command to the MCU?
Host executes the callback for the ADC data (for example, update PWM).
New PWM goes to the MCU.
That is it. The only exception is temperature_combined, which works without all of this.
Otherwise, the MCU has a max_duration for heaters equal to 3s.
So if, for whatever reason, the host does not send a new PWM value within the next 3s, the MCU should go into the shutdown state.
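As a rough illustration of that max_duration rule, here is a toy Python model of a heater output with a watchdog. It is a sketch under my own assumptions, not Klipper's MCU firmware: the class name, method names, and explicit `now` timestamps are all hypothetical, and only the 3s figure comes from the description above.

```python
class HeaterPinWatchdog:
    """Toy model of an MCU heater output with a maximum on-duration.

    Hypothetical illustration: a non-zero PWM value must be refreshed
    by the host before max_duration elapses, otherwise the model enters
    a shutdown state and turns the output off.
    """

    def __init__(self, max_duration=3.0):
        self.max_duration = max_duration
        self.value = 0.0        # current PWM duty cycle
        self.deadline = None    # time by which a refresh must arrive
        self.shutdown = False

    def set_pwm(self, now, value):
        """Host command: set the duty cycle and restart the timer."""
        self.tick(now)
        if self.shutdown:
            return  # commands are ignored once shut down
        self.value = value
        self.deadline = now + self.max_duration if value > 0 else None

    def tick(self, now):
        """Periodic MCU-side check of the deadline."""
        if (not self.shutdown and self.deadline is not None
                and now >= self.deadline):
            self.shutdown = True
            self.value = 0.0
            self.deadline = None


pin = HeaterPinWatchdog(max_duration=3.0)
pin.set_pwm(0.0, 1.0)   # host turns the heater fully on
pin.set_pwm(2.0, 1.0)   # refresh arrives in time, deadline moves to t=5.0
pin.tick(4.9)           # still within the window
pin.tick(5.0)           # no refresh since t=2.0: model shuts down
print(pin.shutdown, pin.value)
```

In this model the heater only stays on while fresh PWM commands keep arriving, so a host that stops updating the PWM stops the heater within max_duration.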