klippy.log - Statistics and Interpretation
Table of Contents
- Introduction
- 1. Host System Statistics
- 2. Print Progress Statistics
- 3. MCU Specific Statistics
- 4. CAN-bus Interface Statistics (Optional)
- 5. Temperature Sensor Statistics
Introduction
Klipper’s log file (klippy.log
) periodically outputs a “Stats” line containing a wealth of information about the printer’s performance and state. This information is key to any troubleshooting or diagnostic initiatives. In particular, the lines directly preceding an unexpected and unwanted event can contain valuable information to understand the root cause and derive appropriate corrective actions.
-
Each “Stats” line begins with a timestamp indicating the uptime of the Klipper host computer in seconds (e.g.,
Stats 1670.9:
). -
These statistics are typically generated and logged every second during printing (and at longer intervals during idle phases).
Critical Indicator: If there are gaps larger than one second between consecutive “Stats” lines during printing, it may indicate serious performance issues with the Klipper host software or an overloaded host system, warranting further investigation.
-
The “Stats” lines provide a snapshot of various subsystems:
- MCU (Microcontroller Unit) Data
- Host Data
- Print Process Data
- Temperature Sensor Data
- CAN-bus Data
Statistics Explained
The available statistics are explained below. In the log, the statistics need to be reviewed carefully to properly determine which “main” subsystem they belong to. The following article attempts to group them logically but not sequentially (as appearing in the log).
The effective content of the “Stats” line depends on the reporting devices. Some elements are optional and are only displayed when the respective hardware is present and configured.
Classification
In the following text, and wherever possible, certain ranges or limits are given. These are not necessarily the ultimate truth and should be understood considering the big picture:
- Are there constant changes over time or sharp (de-)increases?
- Is an out-of-bound value a one-off event?
- Are certain critical values present before an unexpected or unwanted event occurs?
- Is the suspicious value connected to the event or rather coincidental?
1. Host System Statistics
These values reflect the status of the computer running the Klipper host software (e.g., Raspberry Pi).
-
sysload
: The 1-minute system load average on the host computer. For example, a value of1.0
on a single-core system means the CPU was fully utilized on average over the past minute. On a multi-core system, you can assess the relative load by considering the number of CPU cores; e.g., asysload
of2.0
on a 4-core CPU indicates an average utilization of 50% (2.0/4 cores=0.5).Note: Consistently high values (e.g., approaching or exceeding the number of CPU cores) might indicate the host is overloaded, potentially impacting print performance.
-
cputime
: Cumulative CPU time consumed by the Klipper host process (klippy.py) in seconds. It shows how much processing time Klipper itself is using. -
memavail
: Available memory on the host system in kilobytes (KB), which includes buffers and cache that can be freed by the kernel if needed. This is a more practical measure of available memory than just “free” memory.Note: Persistently low available memory (e.g., consistently <25 MB or 25000 KB) might risk Out-of-Memory (OOM) errors, leading to Klipper instability or crashes.
2. Print Progress Statistics
These statistics relate to the execution of a print job.
-
gcodein
: Reports the number of bytes read from the/tmp/printer
pseudo-tty input. It will grow when using OctoPrint, but for most other uses, it will be zero. -
print_time
: This time represents the last scheduled action that was processed using the “lookahead queue”. It is a timestamp relative to the main MCU’s ([mcu]
) last restart time. It is normal for this value to increment during a print. It typically does not increment when not in a print. -
buffer_time
: The duration, in seconds, of future moves or commands that are fully planned out, and ahead of the current action being processed.Healthy values: Generally, intervals between 2 to 5 seconds during a print are considered normal. If this interval were to drop to zero during normal printing, it would indicate that the microcontrollers are processing actions faster than the host software can schedule them.
-
print_stall
: A counter indicating the number of times the printer had to pause because Klipper’s buffer (seebuffer_time
) ran empty.Healthy values: Should ideally be
0
throughout a print. Any stalls can cause blobs or other print quality artifacts. Note that stalls can also occur due to excessive host system latency (e.g., highsysload
preventing timely command processing).
3. MCU Specific Statistics
For each microcontroller connected to Klipper (e.g., [mcu]
, [mcu toolboard]
, etc.), a set of statistics is reported.
MCU Load & Task Timing
-
mcu_awake
: The time in seconds that the microcontroller’s scheduler was busy executing tasks during the last 5 seconds, rather than idling in its main loop. For example, a value of0.05
means the MCU spent 50 ms of the 5-second sample period executing tasks. -
mcu_task_avg
: The average time, in seconds, that the microcontroller spends executing each internal “task” or command. This value is also relative to a 5-second sample period. -
mcu_task_stddev
: The standard deviation of the time spent per microcontroller task. This indicates the variability in task execution times (relative to a 5-second sample period). A low standard deviation is preferred.
Communication Health & Sequencing (Host to MCU)
-
bytes_write
: The total number of bytes written (sent) by the host to this MCU. -
bytes_read
: The total number of bytes read (received) by the host from this MCU. -
bytes_retransmit
: The number of bytes that had to be resent from the host to the MCU because an acknowledgment was not received in time.Critical Indicator: This value should ideally be 0 or extremely close to it. While some retransmissions are normal during the start-up phase, a consistently climbing
bytes_retransmit
count indicates an unstable communication link. -
bytes_invalid
: The number of invalid or corrupted data bytes received by either the host or the MCU.Critical Indicator: This value should ideally be 0 or extremely close to it. While some invalid bytes are normal during the start-up phase, a substantially increasing value signifies severe data corruption in the communication link and requires investigation. Outdated CAN adapter firmware or bugs in the Linux kernel (below v6.6) are common causes of increasing
bytes_invalid
counters for CAN connections. -
send_seq
: The sequence number of the last message sent from the host to the MCU. -
receive_seq
: The sequence number of the last message sent from the host to the MCU that has been acknowledged by the MCU.Note:
send_seq
andreceive_seq
should track each other closely. Ifreceive_seq
consistently lagssend_seq
, or ifretransmit_seq
increments, it points to communication problems. -
retransmit_seq
: The sequence number related to retransmissions. An increasing value shows that messages are actively being resent. -
srtt
(Smoothed Round-Trip Time): An estimate of the average time it takes for a message to travel from the host to the MCU and for its acknowledgment to return. This is a key indicator of communication link latency and health.Healthy values: Should be low and stable.
- USB and UART are typically the most stable connections with the least jitter.
- Klipper retransmits the packet if the round-trip time exceeds a threshold around 25 ms.
-
rttvar
(Round-Trip Time Variance): A measure of the variation or jitter in the round-trip times. A stable connection has very low variance.Healthy values: Typically around 1 ms (0.001 s) or less. Increased or constantly fluctuating values can indicate an unstable connection.
-
rto
(Retransmission Timeout): The calculated timeout value for retransmissions. If an acknowledgment for a sent message isn’t received within this duration, Klipper attempts to retransmit the message. This value is dynamically adjusted based onsrtt
andrttvar
.
Command Buffering (Host to MCU Transmit Queues)
-
ready_bytes
: These are command bytes that Klipper has fully processed and are ready for immediate transmission to the MCU, currently residing in the host’s send queue for that specific MCU. Small, fluctuating values are normal as data is batched and sent. However, if this value grows large and stays elevated, it suggests the communication link cannot keep up with the data Klipper wants to send, or the MCU is not processing commands quickly enough to clear its own internal buffers. These are commands that, from Klipper’s host software perspective, could be sent immediately. -
upcoming_bytes
: These are command bytes that are scheduled for transmission at a specific future time, primarily due to available space in the MCU buffer or other mechanisms. These commands are just not yet due for transmission. It is normal for this value to grow and it does not represent an issue.
MCU Clock Frequency
freq
: The clock frequency of the microcontroller, in Hertz (Hz), as measured by the Klipper host software relative to the host clock.Note: This should align closely with the nominal clock speed of the given controller, as per its datasheet or product description, and be set internally during the compilation of the firmware. At start-up, Klipper will issue the warning
MCU xxx configured for yyy MHz but running at zzz MHz!
if the deviation exceeds 1%.adj
(Adjusted Frequency): The adjusted clock frequency of the microcontroller based on Klipper’s internal clock calibration and synchronization mechanisms. The “reference clock” is the main MCU[mcu]
.Critical Indicator: This value should be very close to the nominal
freq
. Significant deviations or fluctuations could indicate MCU clock stability or synchronization issues.
4. CAN-bus Interface Statistics (Optional)
If the Klipper setup uses CAN-bus for communication with one or more MCUs, the stats show a canstat_DEVICE_NAME:
section (e.g., canstat_TOOLBOARD:
). This reports statistics for the CANbus interface itself (often on the host or a primary MCU acting as a CAN gateway), not the Klipper application-level communication to the CAN-connected MCU.
Note: These statistics are from the perspective of the relevant CAN-connected microcontroller and may vary from the host-side Linux interfaces (like
can0
).
-
bus_state
: The current operational state of the CAN bus.active
: Normal operation. This is the desired state.warn
: Error counters on the CAN interface have reached a warning level.passive
: The CAN node has detected significant errors and will operate in a restricted mode (e.g., not actively driving error flags).off
: The CAN node has detected severe errors and has taken itself off the bus to prevent disruption. Essentially, only a USB to CAN-bus Bridge can report this, as other nodes simply drop off the bus.
Critical Indicator: Any state other than
active
indicates CAN bus problems (e.g., wiring issues like bad crimps or incorrect termination, faulty device, bus overload, EMI). -
rx_error
: A counter for receive errors detected by this specific CAN interface hardware. Should ideally be0
. -
tx_error
: A counter for transmit errors detected by this specific CAN interface hardware. Should ideally be0
. -
tx_retries
: The number of times CAN frames had to be retransmitted at the low-level CAN bus hardware layer by this interface.Healthy values: While occasional, very infrequent retries might occur on a noisy bus, this value should be
0
or extremely low. Consistently high or rapidly increasingtx_retries
points to significant CAN bus communication issues, potentially caused by hardware, configuration or wiring issues.Note:
- This
tx_retries
for thecanstat
interface is different from an MCU’sbytes_retransmit
statistic.canstat tx_retries
refers to CAN-frame retransmissions on the physical bus layer, while an MCU’sbytes_retransmit
refers to Klipper protocol message retransmissions over its established communication link layer. - Certain microcontrollers report only some of these stats. The STM32 MCUs don’t report
tx_retries
, and the RP2040 MCUs don’t reporttx_error
.
- This
5. Temperature Sensor Statistics
This section shows data from configured heaters and various temperature sensors.
-
For heaters or sensors associated with a control loop (e.g.,
heater_bed:
,extruder:
):target
: The target temperature set for the heater in Celsius (°C).temp
: The current measured temperature in Celsius (°C).pwm
: The current PWM (Pulse Width Modulation) duty cycle being applied to the heater, ranging from 0.0 (0%, off) to 1.0 (100%, full power). This shows how hard the heater is working. Apwm
value of0.0
indicates 0% power is being applied at that specific instant. However, during active PID control, brief 0% PWM cycles are normal as the heater coasts to the target temperature or as the PID algorithm makes fine adjustments (including temporarily cutting power) to maintain the setpoint accurately.
-
For other temperature sensors without control loop (e.g.,
[temperature_sensor]
, host temperature sensor, MCU temperature sensors):temp
: The current measured temperature in Celsius (°C). These can be MCU internal temperatures, ambient temperatures in enclosures, or other monitored points relevant to the printer’s operation.