Intermittent z-endstop probe error

Basic Information:

Printer Model: Voron 2.4 350mm
MCU / Printerboard: Octopus Pro 1.0, Toolhead SB2209 RP2040 (CAN bus)
Host / SBC: Raspberry Pi 4B, 1 GB, with a SATA SSD
klippy.log

Describe your issue:

My Voron 2.4 is now about six months old. I chose to build with CAN bus without prior awareness of how fragile multi-MCU homing can be. :slight_smile: After repeated frustration with “timer too close” errors, I’ve got it mostly nailed down:

  1. Thoroughly audited wiring to confirm the can0 interface has zero errors and zero dropped packets (at 1,000,000 bitrate)
  2. Moved from SD card to SATA SSD to reduce IO stress
  3. Removed a camera with a bad USB cable (took months to isolate that one)
  4. Reduced framerate in the LED effects plugin (helped) and removed it entirely (better)

My CAN-related failures are now rare enough that I’ve not succumbed to the temptation to get a Raspberry Pi 5, but this week I have a strange new problem that I’m >50% sure is actually caused by CAN:

Z-homing with a Voron Tap probe fails randomly with “Endstop z still triggered after retract”. This started after I added two toolhead filament sensors (for ERCF v2), wired into the SB2209 board (documented here) with these changes:

  1. Move hotend thermistor (PT1000) from TH0 to 31865 port and update Klipper config.
  2. Add an NPN jumper to allow the FAN/IND port to be used as an endstop probe.
  3. Add the two filament sensors to the IND and TH0 ports and update Klipper config.
  4. Probe port (Voron Tap) is not touched. There’s an additional unused GPIO pin in that port but it needs a pull-up resistor and breakout wiring to split between probe and sensor. Using TH0 was easier to wire up so I left this alone.

This worked until homing Z started failing intermittently. The only pattern I’ve been able to find is that it works when the machine is cold but fails when hot (after a print). I’ve also seen misreadings of the thermistor – it’ll occasionally report >1000°C and then quickly drop to normal range, but Klipper won’t shut down for exceeding max_temp (290°C for me).

Assuming these are interlinked, I took apart both plugs and redid the thermistor wiring crimps. The spikes still happen, so I’m assuming this is related to calibration with the 31865 interface. But why does the probe flake out intermittently? The LED on the back correctly switches from blue to red when triggered. QUERY_PROBE and QUERY_ENDSTOPS both report the correct values. My only hint is in Mainsail’s Endstops box. It’s slow enough to update that I’ve managed to capture screenshots showing Endstop_Z and Probe out of sync with each other (one is triggered while the other is open, both ways).

At this point I’m wondering if this is in fact a CAN problem. Is the overhead of two additional sensors and the 31865 port causing MCU overload? How do I debug this? My klippy.log is attached. Here are the latest stats for the can0 interface (error-warn 5 and error-pass 5 were both 0 a few hours ago):

❯ ip -details -statistics link show can0
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1024
    link/can  promiscuity 0 minmtu 0 maxmtu 0
    can state ERROR-ACTIVE restart-ms 0
	  bitrate 1000000 sample-point 0.750
	  tq 62 prop-seg 5 phase-seg1 6 phase-seg2 4 sjw 1
	  gs_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
	  termination 0 [ 0, 120 ]
	  clock 64000000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          5          5          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped missed  mcast
    4159030    635372   0       0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    3673786    548706   0       0       0       0
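
One way to find out when the error-warn and error-pass counters above change is to poll them from the host and compare snapshots before and after a failed homing attempt. The following is a minimal sketch, not part of Klipper: the function names are made up for illustration, and it assumes the `ip` output keeps the header-line/value-line layout shown above (and Python 3.9 or newer on the host).

```python
import subprocess


def parse_can_error_counters(ip_output: str) -> dict[str, int]:
    """Extract the CAN error counters from `ip -details -statistics link show` output."""
    lines = ip_output.splitlines()
    for i, line in enumerate(lines):
        names = line.split()
        # The counter names sit on one line, their values on the next.
        if names[:1] == ["re-started"] and i + 1 < len(lines):
            values = lines[i + 1].split()
            # The value line can continue with unrelated fields
            # (numtxqueues ...); zip() stops after the last counter name.
            return {name: int(value) for name, value in zip(names, values)}
    raise ValueError("no CAN error counter header found")


def read_can_error_counters(interface: str = "can0") -> dict[str, int]:
    """Run `ip` for the given interface and return its error counters."""
    result = subprocess.run(
        ["ip", "-details", "-statistics", "link", "show", interface],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_can_error_counters(result.stdout)
```

Calling `read_can_error_counters()` right before and right after a homing attempt, and comparing the two dictionaries, would show whether a failure lines up with a jump in `error-warn` or `error-pass`.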

klippy.log.zip (3.8 MB)

Welcome jackerhack, interesting problem.

There was a Log rollover in the beginning of your attached klippy.log (line 4179). Could you please upload the klippy.log from before?

I have successfully resolved the problem by frying my toolhead board. Now I no longer have a CAN toolhead and therefore no CAN problems.

(I was attempting to pull out the endstop sensor to see if that fixed whatever was happening to the probe’s readings. I got impatient and didn’t think powering down was necessary for a hot unplug.)

Here are the logs, FWIW. I’m moving to an SB2209 USB.

klippy-logs.zip (7.4 MB)

I’m an idiot. After migrating to an SB2209 USB board and having the same problem again, I went after all suspects and found it in the place I should have looked first, in the pin config. See for yourself:

[probe]
# Printed Voron tap:
# pin: ^toolhead:PROBE_INPUT
# Chaotic Lab CNC Voron Tap requires pull down (~) instead of pull up (^):
#pin: ~!toolhead:PROBE_INPUT

# 2025-02-05: Z endstop kept failing across two boards (SB2209 RP2040 CAN and SB2209 USB),
# so I measured resistance between GPIO and ground when open and found it varying with time
# (185 ohms, 160 ohms, 165 ohms). It's consistently 2.7-3 ohms when triggered.
# I've disabled pull-down on the assumption that it is making the reading unreliable.
# I don't understand what pull-up and pull-down actually do. Do they ask the board to supply
# voltage (hardware), or do they just apply a multiplier to the reading (software)?
pin: !toolhead:PROBE_INPUT

Why did the pull-down config work for several weeks, until it didn’t? What do these pull-up and pull-down flags actually do? Change a multiplier in firmware, or cause an actual change to the current flow in hardware? If that’s not supported in hardware, does Klipper know and compensate, or is that up to the user to discover?

At least I have a working printer again. :slightly_smiling_face:

Klipper has no knowledge of whether a pull-up or pull-down definition is required. This depends on the hardware layout of the board and the type of device attached to a specific pin.

A typical pin on an MCU can have an internal pull-up that needs to be activated via the Klipper configuration (if needed) or it could have external pull-ups that are included on the board and thus do not require Klipper configuration.

For a more in-depth explanation, see, for example, Pull-up Resistors - SparkFun Learn or any other readily available source on the web.
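
To make the hardware side concrete: in a Klipper pin definition, `^` and `~` switch on the MCU's internal pull-up or pull-down resistor, while `!` only inverts how the reading is interpreted. The pull resistor is a real component that sets the pin voltage when nothing else drives it. The sketch below treats a pull-up plus a sensor path to ground as a simple voltage divider. The 3.3 V supply and 50 kΩ pull-up are illustrative assumptions, not values from any datasheet, and `pin_voltage` is a hypothetical helper.

```python
import math


def pin_voltage(v_supply: float, r_pull_up: float, r_to_ground: float) -> float:
    """Voltage at a pin tied to v_supply via r_pull_up and to ground via r_to_ground."""
    if math.isinf(r_to_ground):
        # Nothing connects the pin to ground, so the pull-up holds it at the supply.
        return v_supply
    # Plain voltage divider: the pin sits between the two resistances.
    return v_supply * r_to_ground / (r_pull_up + r_to_ground)


V_SUPPLY = 3.3       # assumed logic voltage
R_PULL_UP = 50_000   # assumed internal pull-up, in ohms (illustrative only)

# Open switch: the pull-up defines the level, so the pin reads high.
print(pin_voltage(V_SUPPLY, R_PULL_UP, math.inf))

# Closed switch with a few ohms to ground: the pin is pulled close to 0 V.
print(pin_voltage(V_SUPPLY, R_PULL_UP, 3.0))
```

Without any pull resistor, the open-switch case has no defined voltage at all, which is why a floating input can read either state. A pull-down works the same way with the roles of supply and ground swapped.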