Max rtt tracking

nefelim4ag · February 15, 2025, 2:49am

There are some communication-related PRs #6823 and #6811, and I just ask myself a question:
“How can I determine if there is a bus issue? Like retransmits, kernel bus waiting, can internal stuff?”

Like, maybe I do not understand something, but srtt/rttvar/rto are interesting, but most of the time are just constants srtt=0.001 rttvar=0.001 rto=0.025, bandwidth is complicated because of different communication possibilities available now and cumulative, so feels unresponsive.

So, I got to the thought, that maybe, in case of bus issues (does not matter where they are between Klippy and MCU), there should be a latency spike and it is enough to say, there is something in between, and distinguish issues like python gc slow, CPU overloaded, memory pressure & etc.

This is the suggested solution, maybe there is a better one:

diff --git a/klippy/chelper/serialqueue.c b/klippy/chelper/serialqueue.c
index c207495c..26a9c128 100644
--- a/klippy/chelper/serialqueue.c
+++ b/klippy/chelper/serialqueue.c
@@ -56,7 +56,7 @@ struct serialqueue {
     uint64_t send_seq, receive_seq;
     uint64_t ignore_nak_seq, last_ack_seq, retransmit_seq, rtt_sample_seq;
     struct list_head sent_queue;
-    double srtt, rttvar, rto;
+    double srtt, rttvar, rto, maxrtt;
     // Pending transmission message queues
     struct list_head pending_queues;
     int ready_bytes, upcoming_bytes, need_ack_bytes, last_ack_bytes;
@@ -204,6 +204,11 @@ update_receive_seq(struct serialqueue *sq, double eventtime, uint64_t rseq)
         else if (sq->rto > MAX_RTO)
             sq->rto = MAX_RTO;
         sq->rtt_sample_seq = 0;
+        // Track spikes
+        double half_rtt = delta / 2.0;
+        sq->maxrtt *= exp(-0.1 * half_rtt);
+        if (half_rtt > sq->maxrtt)
+            sq->maxrtt = half_rtt;
     }
     if (list_empty(&sq->sent_queue)) {
         pollreactor_update_timer(sq->pr, SQPT_RETRANSMIT, PR_NEVER);
@@ -932,13 +937,13 @@ serialqueue_get_stats(struct serialqueue *sq, char *buf, int len)
     snprintf(buf, len, "bytes_write=%u bytes_read=%u"
              " bytes_retransmit=%u bytes_invalid=%u"
              " send_seq=%u receive_seq=%u retransmit_seq=%u"
-             " srtt=%.3f rttvar=%.3f rto=%.3f"
+             " srtt=%.3f rttvar=%.3f rto=%.3f maxrtt=%.3f"
              " ready_bytes=%u upcoming_bytes=%u"
              , stats.bytes_write, stats.bytes_read
              , stats.bytes_retransmit, stats.bytes_invalid
              , (int)stats.send_seq, (int)stats.receive_seq
              , (int)stats.retransmit_seq
-             , stats.srtt, stats.rttvar, stats.rto
+             , stats.srtt, stats.rttvar, stats.rto, stats.maxrtt
              , stats.ready_bytes, stats.upcoming_bytes);
 }

Just track max rtt with continuous decay, I hope it is slow enough to show in the stats when there is a one-time spike once per second.
There will be a spike in case of missed scheduling of the next digital event, timer too close, or anything, CAN-related #6810.
In case of uart connection and too high stepping, which overloads the bus (who knows), there will be a spike.

Like, I think it would behave like that. If there is a way to reproduce some things, like trsync_timeout, it would be interesting to look at this.

From my surface testing, it shows the spikes when there is a lot of data transmission from the data_logger.py, my hacky bus_bandwidth tests, or homing under dump load:

mcu: mcu_awake=2.628 mcu_task_avg=0.000011 mcu_task_stddev=0.000000 bytes_write=2236826 bytes_read=30768335 bytes_retransmit=9 bytes_invalid=0 send_seq=42316 receive_seq=42316 retransmit_seq=2 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.001
mcu: mcu_awake=2.645 mcu_task_avg=0.000011 mcu_task_stddev=0.000000 bytes_write=2061871 bytes_read=17998939 bytes_retransmit=9 bytes_invalid=0 send_seq=38529 receive_seq=38528 retransmit_seq=2 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.002

Thanks.

I use USB serial and can test USB CAN if needed, but I lack a real can/serial setup.

P.S.
Looks like this code only tracks rtt for a valid transmission, so, there should be something a little different to track max rtt with accounting for retransmissions, if it is possible to do so reliably.

koconnor · February 16, 2025, 2:19am

That’s an interesting idea. I don’t know if it would be a useful diagnostic tool or not. Certainly worth testing.

Have you considered tracking and reporting the stddev of the measured round trip time?

-Kevin

nefelim4ag · February 16, 2025, 7:58pm

Have you considered tracking and reporting the stddev of the measured round trip time?

Now, I did spend some time thinking about it.
I think stddev makes sense with mean for guessing normal distribution.
But I don’t know how that information can help to guess if there is an issue in the communication path and if there is a normal distribution at all (now I think it is not normal).

Part of my real job is to monitor and debug things, so I make heat maps and percentiles sometimes to see how the services behaving.
So, I moved from the idea of percentiles to just max RTT, because percentiles show us the distribution of “normal” values, not how far they can be.
If we wish to get the idea of distribution without too much logging, maybe percentiles may help here. But they also somewhat can skew the real picture.

Below are just my small debugging and how the rtt distribution looks on my system.

I did something dumb:

diff --git a/klippy/chelper/serialqueue.c b/klippy/chelper/serialqueue.c
index 26a9c128..3f4caa5b 100644
--- a/klippy/chelper/serialqueue.c
+++ b/klippy/chelper/serialqueue.c
@@ -28,6 +28,7 @@
 #include "pollreactor.h" // pollreactor_alloc
 #include "pyhelper.h" // get_monotonic
 #include "serialqueue.h" // struct queue_message
+#include "stdio.h"
 
 struct command_queue {
     struct list_head upcoming_queue, ready_queue;
@@ -209,6 +210,11 @@ update_receive_seq(struct serialqueue *sq, double eventtime, uint64_t rseq)
         sq->maxrtt *= exp(-0.1 * half_rtt);
         if (half_rtt > sq->maxrtt)
             sq->maxrtt = half_rtt;
+        char path[128];
+        sprintf(path, "/tmp/rtt_%i.log", sq->serial_fd); // tmpfs used for /tmp
+        FILE *fd = fopen(path, "a");
+        fprintf(fd, "%i\n", (int)(half_rtt * 1000000));
+        fclose(fd);
     }
     if (list_empty(&sq->sent_queue)) {
         pollreactor_update_timer(sq->pr, SQPT_RETRANSMIT, PR_NEVER);

I gathered samples by doing homing, bed meshing, and short printing (USB serial mcu stm32h7, USB serial ebb42, Linux MCU).

$ lsof /dev/serial/by-id/usb-Klipper_stm32h723xx_0C0024001951313434373135-if00
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python  2163 user   12uW  CHR  166,1      0t0  797 /dev/serial/by-id/../../ttyACM1
$ lsof /dev/serial/by-id/usb-Klipper_stm32g0b1xx_4D003A000C50425539393020-if00
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python  2163 user   20uW  CHR  166,0      0t0  790 /dev/serial/by-id/../../ttyACM0

$ wc -l rtt*
 19140 rtt_12.log // main MCU
  4780 rtt_20.log // EBB42
   7949 rtt_27.log // Linux MCU

There are graphs:

MCU

EBB42

Linux MCU

graph code

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

file_path = "rtt_20.log"
with open(file_path, "r") as f:
    data = np.array([int(line.strip()) for line in f if line.strip().isdigit()])

percentiles = [50, 90, 95, 99]
percentile_values = np.percentile(data, percentiles)

plt.figure(figsize=(12, 6))
sns.histplot(data, bins=300, kde=True, stat="density", color="blue", alpha=0.6)

for p, value in zip(percentiles, percentile_values):
    plt.axvline(value, color="red", linestyle="dashed", label=f"{p}th percentile: {value:.2f} µs")

plt.xlabel("Round Trip Time (us)")
plt.ylabel("Density")
plt.title("Histogram, Kernel Density Estimation, and Percentiles of RTT Data")
plt.legend()

plt.show()

So, it looks strange, I may be wrong here, but the RTT code probably measures the time between sending and ack receive time.
Ack received only if the message has been processed, so slow i2c io can add peaks here.

int_fast8_t
command_find_and_dispatch(uint8_t *buf, uint_fast8_t buf_len
                          , uint_fast8_t *pop_count)
{
    int_fast8_t ret = command_find_block(buf, buf_len, pop_count);
    if (ret > 0) {
        command_dispatch(buf, *pop_count);
        command_send_ack();
    }
    return ret;
}

Like on the MCU graph, the 99% percentile shows that most rtt samples lie below ~950us.
There is sht31 with 100kHz i2c, 3 bytes write, 7 bytes read.
(3 + 7) * 9 / 100_000 = 0.000_900

On Linux MCU there is BME680 400kHz, so:
(2 + 2) * 9 / 400_000 = 0.000_090000 status
(2 + 3) * 9 / 400_000 = 0.000_112500 Gas
(2 + 9) * 9 / 400_000 = 0.000_247500 Temp/Pressure

But I can’t see corresponding peaks on any of the graphs, so I may be wrong here.

koconnor · February 17, 2025, 5:36pm

Interesting. Just a couple of random thoughts in no particular order:

Your logging code may itself be altering the pattern of latency reports. (The fopen/fclose and similar are fairly expensive.)
It’s not just an ack message that can result in a call to update_receive_seq(). Any response in the message block will cause the sequence number to update (and synchronous responses are processed prior to an ack message). I don’t think this alters your analysis though.
I can see where an explicit minimum and maximum could be useful. I can also see where variance could be useful. In both cases, picking an appropriate decay rate is likely important.
Getting good stats in the face of retransmits could be a challenge. Might be possible to ignore anything that could have been retransmitted, but that can also skew results too. Not sure.

Cheers,
-Kevin

nefelim4ag · February 17, 2025, 9:36pm

Logging code definitely could be better, this is just to get some sort of idea.

I did spend some time researching the topic further, for benchmarking proposes it is better to just do high-resolution histograms like there can be an array for fixed precision latency tracking. Like up to 1 us, with just 16bit counters 2Mb of memory, or 10us precision → 0.2Mb.
(Like in this blog post HDR Histograms)

Another dumb statistics output

diff --git a/klippy/chelper/serialqueue.c b/klippy/chelper/serialqueue.c
index 26a9c128..f869be95 100644
--- a/klippy/chelper/serialqueue.c
+++ b/klippy/chelper/serialqueue.c
@@ -28,6 +28,7 @@
 #include "pollreactor.h" // pollreactor_alloc
 #include "pyhelper.h" // get_monotonic
 #include "serialqueue.h" // struct queue_message
+#include "stdio.h"
 
 struct command_queue {
     struct list_head upcoming_queue, ready_queue;
@@ -72,6 +73,8 @@ struct serialqueue {
     struct list_head old_sent, old_receive;
     // Stats
     uint32_t bytes_write, bytes_read, bytes_retransmit, bytes_invalid;
+    // Histogram 1us precision
+    uint32_t historgram[50000];
 };
 
 #define SQPF_SERIAL 0
@@ -206,6 +209,11 @@ update_receive_seq(struct serialqueue *sq, double eventtime, uint64_t rseq)
         sq->rtt_sample_seq = 0;
         // Track spikes
         double half_rtt = delta / 2.0;
+        uint32_t usec = (int)(half_rtt * 1000000);
+        if (usec < sizeof(sq->historgram)/4)
+            sq->historgram[usec] += 1;
+        else
+            sq->historgram[50000-1] += 1;
         sq->maxrtt *= exp(-0.1 * half_rtt);
         if (half_rtt > sq->maxrtt)
             sq->maxrtt = half_rtt;
@@ -930,9 +938,20 @@ void __visible
 serialqueue_get_stats(struct serialqueue *sq, char *buf, int len)
 {
     struct serialqueue stats;
+    char path[128];
+    sprintf(path, "/tmp/rtt_%i.stats", sq->serial_fd);
+    FILE *fd = fopen(path, "w");
     pthread_mutex_lock(&sq->lock);
     memcpy(&stats, sq, sizeof(stats));
     pthread_mutex_unlock(&sq->lock);
+    for (uint32_t i = 0; i < sizeof(sq->historgram)/4; i++) {
+        if (stats.historgram[i] == 0)
+            continue;
+        uint32_t count = stats.historgram[i];
+        uint32_t usec = i;
+        fprintf(fd, "%u %u\n", usec, count);
+    }
+    fclose(fd);
 
     snprintf(buf, len, "bytes_write=%u bytes_read=%u"
              " bytes_retransmit=%u bytes_invalid=%u"

Updated graph tool with mean and stddev

#!/usr/bin/python3
import sys
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

file_path = sys.argv[1]
data = np.loadtxt(file_path)

rtt_values = data[:, 0]
counts = data[:, 1]

mean_value = np.average(rtt_values, weights=counts)
stddev_value = np.sqrt(np.average((rtt_values - mean_value) ** 2, weights=counts))

percentiles = [50, 90, 95, 99]
percentile_values = np.percentile(np.repeat(rtt_values, counts.astype(int)), percentiles)

plt.figure(figsize=(12, 6))
sns.histplot(x=rtt_values, weights=counts, bins=200, kde=True, stat="density", color="blue", alpha=0.6)

for p, value in zip(percentiles, percentile_values):
    plt.axvline(value, color="red", linestyle="dashed", label=f"{p}th percentile: {value:.2f} µs")

plt.axvline(mean_value, color="green", linestyle="dashed", label=f"Mean: {mean_value:.2f} µs")
plt.axvline(mean_value - stddev_value, color="purple", linestyle="dotted", label=f"Mean - 1σ: {mean_value - stddev_value:.2f} µs")
plt.axvline(mean_value + stddev_value, color="purple", linestyle="dotted", label=f"Mean + 1σ: {mean_value + stddev_value:.2f} µs")

plt.xlabel("Round Trip Time (us)")
plt.ylabel("Density")
plt.title("Histogram, Kernel Density Estimation, and Percentiles of RTT Data")
plt.legend()

# Show plot
plt.show()

I can see where an explicit minimum and maximum could be useful. I can also see where variance could be useful. In both cases, picking an appropriate decay rate is likely important.

I am just afraid of overload logs, so, trying to be conservative.
I can add here a minimum RTT value tracking.

If we know that the stats code is called once and only from stats to print out serial queue statistics, then we can simply reset min/max RTT values and completely avoid decay which can screw the results.

I will try to gather a little more statistics from people, to get more full picture of how it could look in different setups.

nefelim4ag · February 21, 2025, 6:11pm

So, I gathered some results from people.

From the host with CAN,

Manta m8p v2.0 + ebb42 + btt mmb v1
CANBUS_FREQUENCY=1000000

SBC -> USB CAN Bridge MCU
   MCU -> CAN EBB
   MCU -> CAN MMU (MMB CAN V1.0)

So, maxrtt is 12ms according to graphs and according to logs.
(my current decay is conservative and will half the max value only in the ~5s)
Graphs:

MCU

MMU

EBB

And corresponding logs

Git version: 'v0.12.0-404-g80d185c94-dirty'
Untracked files: klippy/extras/mmu_encoder.py, klippy/extras/mmu_led_effect.py, klippy/extras/mmu_leds.py, klippy/extras/mmu_machine.py, klippy/extras/mmu_sensors.py, klippy/extras/mmu_servo.py
Modified files: klippy/chelper/serialqueue.c
Branch: master
Remote: origin
Tracked URL: https://github.com/Klipper3d/klipper
....
Stats 89833.0: gcodein=0  mcu: mcu_awake=0.017 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=327857 bytes_read=207209 bytes_retransmit=0 bytes_invalid=0 send_seq=9607 receive_seq=9607 retransmit_seq=0 srtt=0.003 rttvar=0.002 rto=0.025 maxrtt=0.006 ready_bytes=45 upcoming_bytes=803 freq=399994124 mmu: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000021 bytes_write=133563 bytes_read=172711 bytes_retransmit=0 bytes_invalid=0 send_seq=4889 receive_seq=4889 retransmit_seq=0 srtt=0.003 rttvar=0.002 rto=0.025 maxrtt=0.004 ready_bytes=16 upcoming_bytes=0 freq=63999942 adj=64001007 ebb42: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000023 bytes_write=102370 bytes_read=115590 bytes_retransmit=0 bytes_invalid=0 send_seq=3644 receive_seq=3644 retransmit_seq=0 srtt=0.003 rttvar=0.003 rto=0.025 maxrtt=0.007 ready_bytes=57 upcoming_bytes=0 freq=63999382 adj=64000592 sd_pos=32793 heater_bed: target=100 temp=100.1 pwm=0.391 board: temp=44.2 Box: temp=31.9 sysload=0.88 cputime=100.927 memavail=7443200 print_time=564.427 buffer_time=2.364 print_stall=0 extruder: target=250 temp=251.0 pwm=0.514
Stats 89834.0: gcodein=0  mcu: mcu_awake=0.017 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=331793 bytes_read=207851 bytes_retransmit=0 bytes_invalid=0 send_seq=9674 receive_seq=9674 retransmit_seq=0 srtt=0.002 rttvar=0.001 rto=0.025 maxrtt=0.006 ready_bytes=35 upcoming_bytes=1629 freq=399994124 mmu: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000021 bytes_write=134933 bytes_read=173114 bytes_retransmit=0 bytes_invalid=0 send_seq=4915 receive_seq=4915 retransmit_seq=0 srtt=0.003 rttvar=0.002 rto=0.025 maxrtt=0.004 ready_bytes=15 upcoming_bytes=0 freq=63999942 adj=64000654 ebb42: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000023 bytes_write=104021 bytes_read=115824 bytes_retransmit=0 bytes_invalid=0 send_seq=3673 receive_seq=3673 retransmit_seq=0 srtt=0.003 rttvar=0.003 rto=0.025 maxrtt=0.007 ready_bytes=7 upcoming_bytes=0 freq=63999379 adj=64000246 sd_pos=33196 heater_bed: target=100 temp=100.1 pwm=0.449 board: temp=43.8 Box: temp=31.9 sysload=0.88 cputime=101.321 memavail=7442868 print_time=565.366 buffer_time=2.302 print_stall=0 extruder: target=250 temp=251.2 pwm=0.423
Stats 89835.0: gcodein=0  mcu: mcu_awake=0.017 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=336156 bytes_read=208592 bytes_retransmit=0 bytes_invalid=0 send_seq=9752 receive_seq=9752 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.006 ready_bytes=15 upcoming_bytes=4170 freq=399994125 mmu: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000021 bytes_write=136880 bytes_read=173562 bytes_retransmit=0 bytes_invalid=0 send_seq=4950 receive_seq=4950 retransmit_seq=0 srtt=0.002 rttvar=0.001 rto=0.025 maxrtt=0.004 ready_bytes=17 upcoming_bytes=0 freq=63999940 adj=64000678 ebb42: mcu_awake=0.014 mcu_task_avg=0.000013 mcu_task_stddev=0.000023 bytes_write=106247 bytes_read=116117 bytes_retransmit=0 bytes_invalid=0 send_seq=3711 receive_seq=3711 retransmit_seq=0 srtt=0.003 rttvar=0.003 rto=0.025 maxrtt=0.008 ready_bytes=8 upcoming_bytes=0 freq=63999391 adj=64000213 sd_pos=33831 heater_bed: target=100 temp=100.0 pwm=0.529 board: temp=44.1 Box: temp=31.8 sysload=0.88 cputime=101.740 memavail=7443028 print_time=566.594 buffer_time=2.530 print_stall=0 extruder: target=250 temp=251.0 pwm=0.488
...
Stats 96712.9: gcodein=0  mcu: mcu_awake=0.019 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=622741 bytes_read=400202 bytes_retransmit=0 bytes_invalid=0 send_seq=18865 receive_seq=18865 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.005 ready_bytes=0 upcoming_bytes=0 freq=399993522 mmu: mcu_awake=0.013 mcu_task_avg=0.000013 mcu_task_stddev=0.000020 bytes_write=232438 bytes_read=326565 bytes_retransmit=0 bytes_invalid=0 send_seq=8852 receive_seq=8852 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.005 ready_bytes=0 upcoming_bytes=0 freq=64000021 adj=64000677 ebb42: mcu_awake=0.019 mcu_task_avg=0.000013 mcu_task_stddev=0.000022 bytes_write=186915 bytes_read=167298 bytes_retransmit=0 bytes_invalid=0 send_seq=6212 receive_seq=6212 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=30 freq=63999426 adj=64000465 sd_pos=52188 heater_bed: target=100 temp=99.8 pwm=0.501 board: temp=47.7 Box: temp=45.3 sysload=1.09 cputime=194.376 memavail=7438236 print_time=964.774 buffer_time=2.242 print_stall=0 extruder: target=235 temp=233.0 pwm=0.494
Stats 96713.9: gcodein=0  mcu: mcu_awake=0.019 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=622902 bytes_read=400613 bytes_retransmit=0 bytes_invalid=0 send_seq=18877 receive_seq=18877 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.005 ready_bytes=11 upcoming_bytes=0 freq=399993462 mmu: mcu_awake=0.013 mcu_task_avg=0.000013 mcu_task_stddev=0.000020 bytes_write=232500 bytes_read=326863 bytes_retransmit=0 bytes_invalid=0 send_seq=8857 receive_seq=8857 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.005 ready_bytes=0 upcoming_bytes=0 freq=64000012 adj=64000822 ebb42: mcu_awake=0.019 mcu_task_avg=0.000013 mcu_task_stddev=0.000022 bytes_write=186990 bytes_read=167412 bytes_retransmit=0 bytes_invalid=0 send_seq=6217 receive_seq=6217 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=10 freq=63999410 adj=64000388 sd_pos=52188 heater_bed: target=100 temp=99.8 pwm=0.574 board: temp=47.5 Box: temp=45.2 sysload=1.09 cputime=194.565 memavail=7438284 print_time=964.774 buffer_time=1.241 print_stall=0 extruder: target=235 temp=233.1 pwm=0.551
Stats 96714.9: gcodein=0  mcu: mcu_awake=0.019 mcu_task_avg=0.000002 mcu_task_stddev=0.000001 bytes_write=623111 bytes_read=400995 bytes_retransmit=0 bytes_invalid=0 send_seq=18892 receive_seq=18892 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.005 ready_bytes=0 upcoming_bytes=0 freq=399993412 mmu: mcu_awake=0.013 mcu_task_avg=0.000013 mcu_task_stddev=0.000020 bytes_write=232562 bytes_read=327161 bytes_retransmit=0 bytes_invalid=0 send_seq=8862 receive_seq=8862 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.005 ready_bytes=0 upcoming_bytes=0 freq=64000005 adj=64000833 ebb42: mcu_awake=0.002 mcu_task_avg=0.000010 mcu_task_stddev=0.000010 bytes_write=187066 bytes_read=167554 bytes_retransmit=0 bytes_invalid=0 send_seq=6222 receive_seq=6222 retransmit_seq=0 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=0 freq=63999409 adj=64000299 sd_pos=52188 heater_bed: target=100 temp=99.8 pwm=0.577 board: temp=47.5 Box: temp=45.2 sysload=1.09 cputime=194.736 memavail=7438284 print_time=964.774 buffer_time=0.240 print_stall=0 extruder: target=235 temp=233.6 pwm=0.399

I have a copy of the raw files if they are valuable here.

nefelim4ag · February 21, 2025, 6:33pm

Another one, much shorter, just a resonance test:

All compiled with: v0.12.0-439-g1fc6d214f
CANBUS_FREQUENCY=1000000
USB U2C:
  U2C -> MCU
  U2C -> EBB42
Linux MCU

Graphs:

EBB42

Unsorted

I’m not so successful in guessing which one is which, but interesting, that Ebb also has a high max rtt ~12ms.

Corresponding log

Git version: 'v0.12.0-439-g1fc6d214-dirty'
Untracked files: klippy/extras/autotune_tmc.py, klippy/extras/gcode_shell_command.py, klippy/extras/motor_constants.py
Modified files: klippy/chelper/serialqueue.c
...
Stats 444.9: gcodein=0  canstat_mcu: bus_state=active rx_error=0 tx_error=0 tx_retries=0 mcu: mcu_awake=0.005 mcu_task_avg=0.000001 mcu_task_stddev=0.000003 bytes_write=402070 bytes_read=115435 bytes_retransmit=0 bytes_invalid=0 send_seq=10038 receive_seq=10038 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.002 ready_bytes=15 upcoming_bytes=0 freq=400021828 canstat_EBB42: bus_state=active rx_error=0 tx_error=0 tx_retries=0 EBB42: mcu_awake=1.528 mcu_task_avg=0.000073 mcu_task_stddev=0.000036 bytes_write=20340 bytes_read=2426816 bytes_retransmit=16 bytes_invalid=0 send_seq=2406 receive_seq=2406 retransmit_seq=2256 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=0 freq=63999608 adj=63995682 canstat_U2C: bus_state=active rx_error=0 tx_error=0 tx_retries=0 U2C: mcu_awake=0.357 mcu_task_avg=0.000011 mcu_task_stddev=0.000004 bytes_write=2663 bytes_read=9809 bytes_retransmit=0 bytes_invalid=0 send_seq=417 receive_seq=417 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.001 ready_bytes=0 upcoming_bytes=0 freq=64065961 adj=64048830 RPI4B: mcu_awake=0.000 mcu_task_avg=0.000004 mcu_task_stddev=0.000005 bytes_write=1768 bytes_read=7592 bytes_retransmit=0 bytes_invalid=0 send_seq=267 receive_seq=267 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.002 ready_bytes=0 upcoming_bytes=0 freq=49999365 adj=49996737 BED_PT1000: temp=28.2 Octopus: temp=36.4 heater_bed: target=0 temp=26.7 pwm=0.000 EBB42: temp=33.6 RPI4B: temp=35.0  sysload=0.13 cputime=20.441 memavail=7446976 print_time=156.115 buffer_time=2.167 print_stall=0 extruder: target=0 temp=29.1 pwm=0.000
Testing frequency: 132 Hz
Stats 445.9: gcodein=0  canstat_mcu: bus_state=active rx_error=0 tx_error=0 tx_retries=0 mcu: mcu_awake=0.005 mcu_task_avg=0.000001 mcu_task_stddev=0.000003 bytes_write=406380 bytes_read=116146 bytes_retransmit=0 bytes_invalid=0 send_seq=10119 receive_seq=10119 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.002 ready_bytes=42 upcoming_bytes=0 freq=400021856 canstat_EBB42: bus_state=active rx_error=0 tx_error=0 tx_retries=0 EBB42: mcu_awake=1.528 mcu_task_avg=0.000073 mcu_task_stddev=0.000036 bytes_write=20422 bytes_read=2445776 bytes_retransmit=16 bytes_invalid=0 send_seq=2418 receive_seq=2418 retransmit_seq=2256 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=0 freq=63999606 adj=63995658 canstat_U2C: bus_state=active rx_error=0 tx_error=0 tx_retries=0 U2C: mcu_awake=0.365 mcu_task_avg=0.000011 mcu_task_stddev=0.000004 bytes_write=2675 bytes_read=9857 bytes_retransmit=0 bytes_invalid=0 send_seq=419 receive_seq=419 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.001 ready_bytes=0 upcoming_bytes=0 freq=64065743 adj=64046944 RPI4B: mcu_awake=0.000 mcu_task_avg=0.000004 mcu_task_stddev=0.000005 bytes_write=1774 bytes_read=7608 bytes_retransmit=0 bytes_invalid=0 send_seq=268 receive_seq=268 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.002 ready_bytes=0 upcoming_bytes=0 freq=49999366 adj=49996767 BED_PT1000: temp=28.2 Octopus: temp=36.5 heater_bed: target=0 temp=26.8 pwm=0.000 EBB42: temp=33.6 RPI4B: temp=35.0  sysload=0.20 cputime=20.557 memavail=7446476 print_time=157.047 buffer_time=2.098 print_stall=0 extruder: target=0 temp=29.0 pwm=0.000
Testing frequency: 133 Hz
Stats 446.9: gcodein=0  canstat_mcu: bus_state=active rx_error=0 tx_error=0 tx_retries=0 mcu: mcu_awake=0.005 mcu_task_avg=0.000001 mcu_task_stddev=0.000003 bytes_write=411258 bytes_read=116966 bytes_retransmit=0 bytes_invalid=0 send_seq=10216 receive_seq=10213 retransmit_seq=0 srtt=0.002 rttvar=0.001 rto=0.025 maxrtt=0.002 ready_bytes=9 upcoming_bytes=174 freq=400021857 canstat_EBB42: bus_state=active rx_error=0 tx_error=0 tx_retries=0 EBB42: mcu_awake=1.529 mcu_task_avg=0.000073 mcu_task_stddev=0.000037 bytes_write=20504 bytes_read=2464524 bytes_retransmit=16 bytes_invalid=0 send_seq=2430 receive_seq=2430 retransmit_seq=2256 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=0 freq=63999601 adj=63995645 canstat_U2C: bus_state=active rx_error=0 tx_error=0 tx_retries=0 U2C: mcu_awake=0.365 mcu_task_avg=0.000011 mcu_task_stddev=0.000004 bytes_write=2687 bytes_read=9888 bytes_retransmit=0 bytes_invalid=0 send_seq=421 receive_seq=421 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.001 ready_bytes=0 upcoming_bytes=0 freq=64065453 adj=64046479 RPI4B: mcu_awake=0.000 mcu_task_avg=0.000004 mcu_task_stddev=0.000005 bytes_write=1780 bytes_read=7624 bytes_retransmit=0 bytes_invalid=0 send_seq=269 receive_seq=269 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.002 ready_bytes=0 upcoming_bytes=0 freq=49999367 adj=49996707 BED_PT1000: temp=28.2 Octopus: temp=36.5 heater_bed: target=0 temp=26.7 pwm=0.000 EBB42: temp=33.7 RPI4B: temp=35.0  sysload=0.20 cputime=20.681 memavail=7444992 print_time=158.246 buffer_time=2.298 print_stall=0 extruder: target=0 temp=29.0 pwm=0.000
Testing frequency: 134 Hz
Stats 447.9: gcodein=0  canstat_mcu: bus_state=active rx_error=0 tx_error=0 tx_retries=0 mcu: mcu_awake=0.005 mcu_task_avg=0.000001 mcu_task_stddev=0.000003 bytes_write=414871 bytes_read=117677 bytes_retransmit=0 bytes_invalid=0 send_seq=10291 receive_seq=10291 retransmit_seq=0 srtt=0.001 rttvar=0.001 rto=0.025 maxrtt=0.002 ready_bytes=37 upcoming_bytes=0 freq=400021824 canstat_EBB42: bus_state=active rx_error=0 tx_error=0 tx_retries=0 EBB42: mcu_awake=1.529 mcu_task_avg=0.000073 mcu_task_stddev=0.000037 bytes_write=20586 bytes_read=2483620 bytes_retransmit=16 bytes_invalid=0 send_seq=2442 receive_seq=2442 retransmit_seq=2256 srtt=0.001 rttvar=0.000 rto=0.025 maxrtt=0.012 ready_bytes=0 upcoming_bytes=0 freq=63999591 adj=63995728 canstat_U2C: bus_state=active rx_error=0 tx_error=0 tx_retries=0 U2C: mcu_awake=0.365 mcu_task_avg=0.000011 mcu_task_stddev=0.000004 bytes_write=2699 bytes_read=9919 bytes_retransmit=0 bytes_invalid=0 send_seq=423 receive_seq=423 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.001 ready_bytes=0 upcoming_bytes=0 freq=64065158 adj=64047869 RPI4B: mcu_awake=0.000 mcu_task_avg=0.000004 mcu_task_stddev=0.000005 bytes_write=1786 bytes_read=7640 bytes_retransmit=0 bytes_invalid=0 send_seq=270 receive_seq=270 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 maxrtt=0.002 ready_bytes=0 upcoming_bytes=0 freq=49999367 adj=49996701 BED_PT1000: temp=28.2 Octopus: temp=36.5 heater_bed: target=0 temp=26.7 pwm=0.000 EBB42: temp=33.5 RPI4B: temp=33.6  sysload=0.20 cputime=20.791 memavail=7444488 print_time=159.123 buffer_time=2.173 print_stall=0 extruder: target=0 temp=29.0 pwm=0.000
toolhead: max_velocity: 300.000000

nefelim4ag · April 5, 2025, 3:41pm

I did help with the debugging of Error during homing z: Communication timeout during homing.
With atmega2560 (USB2UART) and BTT Eddy (USB).
*Klipper running on a desktop class host, in the VM.

I skipped debug iterations here.
There is an added max_rtt to get the idea if it is a communication issue at all:

Stats 13814.8:
  gcodein=0 
  mcu:
    mcu_awake=0.053 mcu_task_avg=0.000133 mcu_task_stddev=0.000068
    bytes_write=13578 bytes_read=15107 bytes_retransmit=0 bytes_invalid=0
    send_seq=748 receive_seq=747 retransmit_seq=0
    srtt=0.005 rttvar=0.001 rto=0.025 maxrtt=0.005
    ready_bytes=0 upcoming_bytes=0 freq=16000754
  eddy:
    mcu_awake=0.000 mcu_task_avg=0.000002 mcu_task_stddev=0.000002
    bytes_write=2880 bytes_read=12345 bytes_retransmit=9 bytes_invalid=0
    send_seq=294 receive_seq=294 retransmit_seq=2
    srtt=0.002 rttvar=0.001 rto=0.025 maxrtt=0.005
    ready_bytes=0 upcoming_bytes=0 freq=12000747 adj=12000157

So, in the worst case there is 5ms latency to each MCU.
Probably ~10ms in sum, if it’s correct to sum them here.
Which is still <25ms.

The root cause was:

Stats 13812.8: ... cputime=2.065
Stats 13813.8: .... cputime=2.138
Stats 13814.8: .... cputime=2.459
Error during homing z: Communication timeout during homing

In that case, most probably it was a slow CPU frequency.
Running of stess-ng --cpu 4 fixes the problem.

Sineos · April 6, 2025, 8:52am

Maybe one of the reasons why it is discouraged and unsupported to run Klipper in a VM.

Topic		Replies	Views
Klipper: communication bus tests General Discussion	12	361	November 17, 2024
Why is the srtt I calculated negative General Discussion	15	104	March 3, 2025
Potential issue with Klipper timing Developers	12	192	January 21, 2025
Move queue overflow occurred in Resonance Test General Discussion	5	161	March 3, 2024
Move queue overflow with large gcode files General Discussion	6	67	November 27, 2024

Max rtt tracking

Below are just my small debugging and how the rtt distribution looks on my system.

Related topics