Clarification on bytes_retransmit vs bytes_invalid

Basic Information:

Printer Model: Ender3
MCU / Printerboard: 2xSKR Pico/FLY SHT42/MKS THR42
Host / SBC: PI Zero2W
klippy.log
klippy.log.zip (1.0 MB)

I’m new to 3D printing. I have been having great fun getting into the hobby recently and have ended up with a couple of toolboard mcus on my Ender to play with.

Klipper and the printer have been running 100% solidly since I gave up on finding a reliable USB hub and switched to CAN, but I have noticed that my bytes_retransmit count climbs constantly while printing, whereas bytes_invalid never does. I have googled this to death but never find any similar examples: all posts are either about both values increasing or just refer to bytes_invalid.

As an example here are my stats after a solid 3 days of use

Stats 268505.2: gcodein=0 mcu: mcu_awake=0.017 mcu_task_avg=0.000010 mcu_task_stddev=0.000018 bytes_write=196868380 bytes_read=59638226 bytes_retransmit=248 bytes_invalid=0 send_seq=4048809 receive_seq=4048809 retransmit_seq=3471364 srtt=0.001 rttvar=0.002 rto=0.025 ready_bytes=0 upcoming_bytes=0 freq=12000257 skrcupboard: mcu_awake=0.081 mcu_task_avg=0.000012 mcu_task_stddev=0.000011 bytes_write=1104856 bytes_read=3445949 bytes_retransmit=0 bytes_invalid=0 send_seq=184094 receive_seq=184094 retransmit_seq=0 srtt=0.000 rttvar=0.000 rto=0.025 ready_bytes=0 upcoming_bytes=0 freq=12000288 adj=12000037 mcu-toolhead: mcu_awake=0.016 mcu_task_avg=0.000025 mcu_task_stddev=0.000038 bytes_write=103394501 bytes_read=39744673 bytes_retransmit=1934658 bytes_invalid=0 send_seq=2310103 receive_seq=2310103 retransmit_seq=2310053 srtt=0.002 rttvar=0.002 rto=0.025 ready_bytes=0 upcoming_bytes=0 freq=47999870 adj=47999420 mcu-gantry: mcu_awake=0.008 mcu_task_avg=0.000009 mcu_task_stddev=0.000016 bytes_write=123215237 bytes_read=23158152 bytes_retransmit=25582900 bytes_invalid=0 send_seq=2348655 receive_seq=2348655 retransmit_seq=2348633 srtt=0.004 rttvar=0.006 rto=0.028 ready_bytes=0 upcoming_bytes=0 freq=12032812 adj=12032649 sd_pos=14758852 heater_bed: target=65 temp=65.0 pwm=0.338 coldend: temp=43.8 enclosure: temp=30.7 cupboard: temp=20.8 cupboard_humid: temp=45.4 sysload=0.41 cputime=33650.638 memavail=216284 print_time=181663.204 buffer_time=2.466 print_stall=0 extruder: target=200 temp=200.2 pwm=0.447
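
The per-MCU counters are easier to compare once the Stats line is split into sections. The following is a minimal Python sketch, not part of Klipper: `parse_stats` and `retransmit_ratios` are hypothetical helper names, and the sketch assumes the layout seen in the line above (a section name ending in `:` followed by `key=value` tokens). `SAMPLE` is a shortened copy of that line. The ratio of bytes_retransmit to bytes_write is only a rough indicator of how much traffic is being resent.

```python
import re


def parse_stats(line):
    """Split a klippy.log 'Stats' line into {section: {key: value_string}}."""
    match = re.match(r"Stats\s+([\d.]+):\s*(.*)", line.strip())
    if match is None:
        raise ValueError("not a Stats line")
    sections = {}
    # Keys that appear before the first named section go under "".
    current = sections.setdefault("", {})
    for token in match.group(2).split():
        if token.endswith(":"):
            # A token such as "mcu-gantry:" starts a new section.
            current = sections.setdefault(token[:-1], {})
        elif "=" in token:
            key, value = token.split("=", 1)
            current[key] = value
    return sections


def retransmit_ratios(line):
    """Return {mcu_name: bytes_retransmit / bytes_write} for MCU sections."""
    ratios = {}
    for name, fields in parse_stats(line).items():
        if "bytes_retransmit" in fields and "bytes_write" in fields:
            written = int(fields["bytes_write"])
            resent = int(fields["bytes_retransmit"])
            ratios[name] = resent / written if written else 0.0
    return ratios


# Shortened copy of the Stats line above (most fields omitted).
SAMPLE = (
    "Stats 268505.2: gcodein=0 "
    "mcu: bytes_write=196868380 bytes_retransmit=248 bytes_invalid=0 "
    "mcu-gantry: bytes_write=123215237 bytes_retransmit=25582900 bytes_invalid=0 "
    "heater_bed: target=65 temp=65.0"
)

for name, ratio in retransmit_ratios(SAMPLE).items():
    print(f"{name}: {ratio:.2%} of written bytes were retransmits")
```

Running the same function over successive Stats lines from klippy.log would show which test or print phase makes the ratio climb.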

Where the wiring is PI >USB> skrcupboard(klipper with can bridge) > CAN > mcu + mcu-gantry + mcu-toolhead

So TL;DR no issues with my setup but is bytes_retransmit climbing without any bytes_invalid any cause for concern? Any documentation I missed to help understand the difference/meaning in a CAN setup context?

Why didn’t you upload the klippy.log file, so that we can analyze it and give you our advice?

bytes_retransmit increases when Klipper detects that a packet of data did not get through to the MCU (it was stuck, broken, or failed its CRC check). When Klipper detects that, it transmits the packet again. This symptom usually indicates communication issues with the MCU: typically wiring, an overloaded system, electromagnetic interference, USB issues, etc.

bytes_invalid increases when Klipper detects that the order of packets is wrong. Each packet of data that Klipper sends has a sequence number; if Klipper gets responses from the MCU in the wrong order, this number increases. This symptom usually indicates that something is re-arranging packets, and usually it’s the USB/CAN converter.

Thank you for the info! I did not really want to spam anybody with my full log when I am not actually getting an issue per se.

So I am indeed losing data over my wiring, based on that. I was not entirely clear if that was the case before. I am seeing most of the bytes_retransmit on the two mcus that have the longest cable runs, so I guess as a weekend project I can have a look at shortening the distance and see if it reduces retransmits. Gives me an excuse to potter around in the shed again :smiley:

Thank you again :slight_smile:

Edit: log now included

The CAN bus specification allows 1 Mbit/s over a distance of 40 meters, so I’m not sure your “shortening” will help at all. If your wire is broken somewhere, though, shortening could remove the faulty section.

Also, your log shows that you have 3 CAN bus MCUs. The first 2 don’t have any trouble (their retransmit count is static), but the third one (last in the chain) has a huge number of retransmits.
This tells me that your chain PI >USB> skrcupboard(klipper with can bridge) > CAN > mcu + mcu-gantry is good, and you shouldn’t touch it at all.
You should concentrate on mcu-gantry + mcu-toolhead.

Check your wiring (see the CANBUS Troubleshooting page in the Klipper documentation),
especially the last link in the chain: check connectors, voltages, and resistance. You could also try temporarily replacing the last cable run.

Ran a few tests this morning turning things on and running them to see what causes retransmit

The following runs with no retransmits

#G28
SET_HEATER_TEMPERATURE HEATER=heater_bed TARGET=60
SET_HEATER_TEMPERATURE HEATER=extruder TARGET=200
G90
G0 Z40
SDCARD_LOOP_BEGIN COUNT=99999999
G0 X180 Y40 Z42 F6000
G0 X40 Y180 Z40 F6000
SDCARD_LOOP_END

Whereas the following will cause retransmits to start climbing

#G28
SET_HEATER_TEMPERATURE HEATER=heater_bed TARGET=60
SET_HEATER_TEMPERATURE HEATER=extruder TARGET=200
TEMPERATURE_WAIT SENSOR=extruder MINIMUM=185
G90
G0 Z40
SDCARD_LOOP_BEGIN COUNT=99999999
G92 E0
G0 X180 Y40 Z42 E-5 F6000
G0 X40 Y180 Z40 E4 F6000
SDCARD_LOOP_END

So it seems to be something related to the extruder motor which is plugged into mcu-toolhead. Oddly mcu-gantry is more affected.

I have retried with the extruder motor physically unplugged and the retransmit counter still increments just as much.
Also tried just enabling and leaving still. The extruder stepper can be enabled without causing any issue, it is movement commands that cause hiccups.

Movement = Toolhead wires movement (bending + acceleration + deceleration)

In your first test, with “no retransmits”, only a small amount of data goes to your toolhead: just control of the extruder heater.

In your second test, where retransmits grow, a much greater amount of data flows to the toolhead because commands are being sent for your extruder stepper.

You can try a third test, where the toolhead is stationary (no movement) and only the extruder motor moves filament back and forth.

If retransmits do not grow during the third test, you can re-run it, this time manually moving and bending the toolhead wires.

The bytes_retransmit statistics tracks how many duplicate bytes the Klipper host transmits to an mcu as part of command retransmits, while bytes_invalid tracks how many bytes the host read that were corrupted.

An incrementing bytes_invalid on canbus is severe, as getting corrupt data indicates packet reordering, which causes all sorts of problems.

An incrementing bytes_retransmit isn’t particularly severe. The most likely cause on canbus is an intermittent lack of bandwidth between host and mcu (such that the host doesn’t get timely acknowledgements for commands because the bus is busy).

It seems you’ve set the CANBUS frequency to 500000. Probably best to go back to the recommended default of 1000000. The higher frequency increases the amount of bandwidth available and also reduces the round-trip-time for message acknowledgements. In particular, since you have 3 devices on the canbus, having a higher bandwidth is useful.
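
As a rough illustration of why the frequency matters, the sketch below estimates how long one full CAN frame occupies the bus at each setting. It is a back-of-the-envelope calculation under stated assumptions, not something taken from Klipper: it assumes classic CAN data frames with an 11-bit identifier and 8 data bytes, counts the 3-bit interframe space, and ignores bit stuffing (so real frames can be somewhat longer). `FRAME_BITS`, `frame_time_us`, and `max_frames_per_second` are names made up for this example.

```python
# Nominal bits in a classic CAN data frame (11-bit identifier, 8 data bytes):
# SOF + identifier + RTR + IDE + r0 + DLC + data + CRC + CRC delimiter
# + ACK slot + ACK delimiter + EOF + interframe space.
# Bit stuffing is ignored, so this is a lower bound on the real length.
FRAME_BITS = 1 + 11 + 1 + 1 + 1 + 4 + 64 + 15 + 1 + 1 + 1 + 7 + 3


def frame_time_us(bitrate):
    """Microseconds that one full frame occupies the bus at this bitrate."""
    return FRAME_BITS * 1_000_000 / bitrate


def max_frames_per_second(bitrate):
    """Upper bound on full frames per second with the bus 100% busy."""
    return bitrate // FRAME_BITS


for bitrate in (500_000, 1_000_000):
    print(
        f"{bitrate} bit/s: {frame_time_us(bitrate):.0f} us per frame, "
        f"at most {max_frames_per_second(bitrate)} frames/s"
    )
```

Under these assumptions, halving the frequency doubles the time every frame spends on the wire, and all three devices share that time.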

It’s also possible that the reason there is intermittent low bandwidth on the canbus is because of line noise. That is, at a low-level the canbus may be seeing line errors and retransmitting. (Note that this retransmission is separate from Klipper’s retransmission system, is not tracked by klipper, and is only seen by klipper as reduced bandwidth.) In that case, improving the wiring and connections may reduce line noise. For what it is worth though, I’d try increasing the bandwidth first.

Cheers,
-Kevin

EDIT: Just to be clear, if you’re not experiencing any real-world issues, then there is a good chance things will remain that way. There is a possibility that constant bytes_retransmit events could eventually build up to “timer too close” errors, but it’s not common for that to occur. Your retransmits do seem high, but it also seems that Klipper is handling it so far.
