Advanced Trouble-Shooting / Graphing Klipper

Advanced Trouble-Shooting / Graphing Klipper

The klippy.log is key for trouble-shooting and diagnosing errors, containing an abundance of information and also low-level details. In Klipper’s tool-box are additional tools to facilitate such a diagnosis.

They are often targeted at developers with an in-depth knowledge about Klipper’s internal working, but the graphstats.py tool is also very valuable to diagnose common errors.

General Idea

  • Sharp spikes in such graphs in one or the other direction could point to an issue or “event”, especially when at the same time-stamp an error is recorded in the log
  • Exceeding given limits shortly or even over a longer period is a more strong indication that something is going on
  • To facilitate the graphing we have created a little webpage that allows uploading a klippy.log and creating the graphs: https://sineos.github.io/
    • Currently quite basic, e.g. no error handling etc.
    • If you are a HTML guru, any improvement is of course highly welcome

Graphing the Host

This section will demonstrate how to graphically evaluate the information about the Host that Klipper collected in its log.

Command

~/klipper/scripts/graphstats.py /tmp/klippy.log -o loadgraph_host.png
  • The klippy.log file either is in ~/printer_data/logs/klippy.log or /tmp/klippy.log
  • The loadgraph_host.png is generated in the same folder from which you are calling the script, when no dedicated path is provided, e.g. -o /tmp/loadgraph_host.png

Result

Interpretation

  • System Load or CPU load is shown on the left y-axis of the graph. It is relative to one CPU core, this means a 4 core CPU like the Raspberry Pi 3 is overloaded when reaching / exceeding 400%
  • System Memory is the free RAM of the host and shown on the right y-axis
  • Process Time is the CPU usage of the Klipper host process alone and is also relative to the left y-axis. For example 5% means that the Klipper process is using 5% of one CPU core


Graphing the MCU

This section will demonstrate how to graphically evaluate the information about the MCU that Klipper collected in its log.

Command

 ~/klipper/scripts/graphstats.py /tmp/klippy.log -o loadgraph_mcu.png -s
  • The klippy.log file either is in ~/printer_data/logs/klippy.log or /tmp/klippy.log
  • The loadgraph_mcu.png is generated in the same folder from which you are calling the script, when no dedicated path is provided, e.g. -o /tmp/loadgraph_mcu.png

Result

Interpretation

  • Bandwidth denotes the saturation of the communication line between Host and MCU. Scaling is based on a 250000 baud serial line. This means 100% here, when using a much faster USB connection, is not necessarily an indication for problems
  • MCU load is the processor load of the MCU. Values around 100% or even over do indicate an issue, for example a too high step rate
  • Host Buffer is an indication of the amount of data buffered in the serial buffer or the MCU buffer. Klipper tries to buffer movement data 2 seconds ahead. So, if during a print the buffer time dropped to only 1 second of data, it would show up as a spike to 50%. Most often this occurs with OctoPrint when it is not able to send commands fast enough.
  • Awake Time is the time the MCU is spending in “task handlers”. This is a very low level information targeted at developers


Graphing a Heater

This section will demonstrate how to graphically evaluate the information about the different heaters that Klipper collected in its log.

Command

~/klipper/scripts/graphstats.py /tmp/klippy.log -o loadgraph_extruder.png -t extruder
  • The klippy.log file either is in ~/printer_data/logs/klippy.log or /tmp/klippy.log
  • The loadgraph_extruder.png is generated in the same folder from which you are calling the script, when no dedicated path is provided, e.g. -o /tmp/loadgraph_extruder.png
  • The -t argument takes the name of the heater to graph, e.g. extruder or heater_bed

Result

Interpretation

  • XX Temp is the actually measured temperature of the heater. Printed on the left y-axis of the graph. Sudden drops, wild swings or gradual drops might indicate an issue with the heating system. If the temperature change follows a target change, then it is no issue but wanted control behavior.
    • Sudden drops often related to wiring issues or dead heating cartridges
    • Wild swings might point to improper PID tuning
    • Gradual drops could indicate insufficient heating power or high heat loss, e.g. through fans
  • XX Target is the commanded target temperature at the given time. Also printed on the left y-axis of the graph.
  • XX PWM is the PWM value with which Klipper is controlling the heating power to achieve the target temperature. Printed on the right y-axis of the graph.


Graphing MCU Frequency

This section will demonstrate how to graphically evaluate the information about the MCU frequency that Klipper collected in its log.
This already is a very low level information and usually not needed for generic trouble-shooting. For multi-MCU systems it provides an indication, if there are synchronization issues between the different MCUs.

Command

~/klipper/scripts/graphstats.py /tmp/klippy.log -o loadgraph_freq.png -f
  • The klippy.log file either is in ~/printer_data/logs/klippy.log or /tmp/klippy.log
  • The loadgraph_freq.png is generated in the same folder from which you are calling the script, when no dedicated path is provided, e.g. -o /tmp/loadgraph_freq.png

Result

Interpretation

  • freq is the nominal clock frequency of the MCU, as defined in the [mcu] setting (not to be confused with the CPU MHz rating / clock speed) and the graph shows how much the MCU is deviating from this nominal clock speed (from the perspective of the Klipper host and his clock). Spikes of a few hundred microseconds in less than a minute may indicate a problem.
  • z freq is the same as above but for a second MCU that is defined in the config as [mcu z]. It clearly shows a much higher jitter than the main MCU but if there are no spikes of a few hundred microseconds in less than a minute no issue is to be expected
  • z adj is the Klipper Host’s adjustment for this MCU to make it aligned / synchronized with the main MCU
9 Likes