My Solution to "CANBUS Communication timeout while homing Z"

Basic Information:

Printer Model: Voron Trident 250
MCU / Printerboard: BTT Octopus 1.1
Host / SBC: FriendlyElec NanoPC-T4
klippy.log

Describe your issue:

Summary of CAN Bus Stability Improvements (Before/After) During Long‑Duration Probe Accuracy Testing

Over the past few days I’ve been troubleshooting rare but repeatable CAN-bus communication hiccups during long-duration PROBE_ACCURACY stress tests on my Voron Trident. These tests run continuously for about an hour while cycling bed and nozzle temperatures, which is a good way to expose timing-sensitive issues in the CAN transport layer. For reference, I used the excellent probe stress-test tooling from the KiloQubit/probe_accuracy repository on GitHub.

Before the Change

Initially, my CAN interface was configured with:

  • pfifo_fast as the default qdisc
  • txqueuelen set to 128, as commonly recommended in Klipper documentation

This setup worked “mostly fine,” but during extended probe‑accuracy loops I would eventually hit a CAN timeout or communication error. Reducing txqueuelen from 128 → 64 roughly doubled the runtime before failure, but still didn’t eliminate the issue. The errors were rare, but consistent enough to interfere with long‑term probe comparison testing.
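To confirm which qdisc and queue length the interface is actually using before changing anything, the first line of `ip link show can0` can be parsed. The following is a minimal sketch; the helper name `can_link_summary` is made up for illustration, and it assumes the usual iproute2 output where `qdisc` and `qlen` are each followed by their value.

```shell
# Hypothetical helper: read `ip link show <iface>` output on stdin and
# print the qdisc and queue length found on the line that mentions them.
can_link_summary() {
    awk '{
        qdisc = ""; qlen = "?"
        for (i = 1; i < NF; i++) {
            if ($i == "qdisc") qdisc = $(i + 1)
            if ($i == "qlen")  qlen  = $(i + 1)
        }
        # Only the first line of the ip output carries these fields
        if (qdisc != "") print "qdisc=" qdisc " qlen=" qlen
    }'
}

# Usage on the printer host:
#   ip link show can0 | can_link_summary
```

On the setup described above this should report `pfifo_fast` and the configured queue length; `tc qdisc show dev can0` gives the qdisc alone.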

Changes Applied

I made two adjustments to the CAN interface configuration:

  1. Switched queue discipline from pfifo_fast to fq_codel

    • This was done to reduce latency spikes and avoid occasional packet reordering under load.
  2. Reduced txqueuelen from 64 → 20

    • CAN frames are extremely small, and large queues only increase latency and jitter.
    • A shorter queue forces more deterministic timing behavior.

These changes were applied via /etc/network/interfaces.d/can0 so they persist across reboots.
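For experimenting before committing to a persistent config, the same two settings can also be applied to a running interface. The sketch below only prints the commands so they can be reviewed first; the helper name `can_tuning_cmds` is hypothetical, and the defaults simply mirror the values used in this post. Settings applied this way need root and are lost on reboot.

```shell
# Hypothetical helper: print (not execute) the commands that apply a
# queue length and root qdisc to a CAN interface at runtime.
can_tuning_cmds() {
    iface=${1:-can0}      # interface name
    qlen=${2:-20}         # transmit queue length
    qdisc=${3:-fq_codel}  # root queue discipline
    printf 'ip link set %s txqueuelen %s\n' "$iface" "$qlen"
    printf 'tc qdisc replace dev %s root %s\n' "$iface" "$qdisc"
}

# Review the output first, then pipe it to a root shell if it looks right:
#   can_tuning_cmds can0 20 fq_codel | sudo sh
can_tuning_cmds can0 20 fq_codel
```

Printing instead of executing makes it easy to try other values (for example a different queue length) without editing the interfaces file each time.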

After the Change

With txqueuelen=20 and fq_codel active, I was able to complete a full 1‑hour probe‑accuracy stress test without any CAN communication errors. This is the first time the test has run to completion without a timeout. While I can’t yet say with certainty how much fq_codel contributed on its own, the combination of a shorter queue and a modern qdisc clearly improved stability in my setup.
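Besides watching for Klipper timeouts, the kernel's per-interface counters can be compared before and after a test run. This is a rough sketch, not something taken from the test tooling: the helper name `can_stats` is hypothetical, and it assumes the standard sysfs statistics files under `/sys/class/net/<iface>/statistics/`.

```shell
# Hypothetical helper: print drop and error counters for an interface.
# The second argument overrides the sysfs root (default /sys/class/net).
can_stats() {
    iface=${1:-can0}
    root=${2:-/sys/class/net}
    for name in tx_dropped tx_errors rx_errors; do
        printf '%s=%s\n' "$name" "$(cat "$root/$iface/statistics/$name")"
    done
}

# Usage on the printer host:
#   can_stats can0 > before.txt
#   ... run the probe-accuracy stress test ...
#   can_stats can0 > after.txt
#   diff before.txt after.txt
```

If the counters are unchanged after a clean run, that supports the observation above; `tc -s qdisc show dev can0` additionally shows drops counted by the qdisc itself.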

This gives me a reliable baseline for comparing different Z‑probe hardware under identical conditions.

System Configuration

SBC: FriendlyElec NanoPC-T4 (RK3399)

OS: debian-bookworm-core-arm64 (FriendlyElec Official)

Controller: BTT Octopus 1.1 based on STM32F446ZET6, configured as a CAN bridge

Toolhead Controller: Mellow SHT36v3 based on RP2040

Z-Probe: FYSETC Super Pinda

klippy.zip (1.0 MB)

This file lives in /etc/network/interfaces.d/can0:

allow-hotplug can0
iface can0 can static
    bitrate 1000000

    # Configure CAN interface before bringing it up
    pre-up ip link set can0 type can bitrate 1000000
    pre-up ip link set can0 txqueuelen 20

    # Ensure queue length is correct after interface activation
    up ip link set $IFACE txqueuelen 20

    # Apply fq_codel to reduce latency and avoid packet reordering
    post-up tc qdisc replace dev can0 root fq_codel

    # Optional: clean up qdisc on shutdown
    post-down tc qdisc del dev can0 root || true