Issues with stepper drift on latest Klipper

Okay, v0.13.0-69-g5b2f8104 doesn’t work any more. Now it has the phase changes too.
And Klipper cannot be flashed because the build does not create a bin file.

Version: v0.13.0-69-g5b2f8104
  Linking out/klipper.elf
  Creating hex file out/klipper.elf.hex
b'4715'
Connecting to CAN UUID 2e12ab8ec8ce on interface can0
ERROR:root:Flash Tool Error
Traceback (most recent call last):
  File "/home/voron/katapult/scripts/flash_can.py", line 1096, in main
    await sock.run()
  File "/home/voron/katapult/scripts/flash_can.py", line 805, in run
    self._check_firmware()
  File "/home/voron/katapult/scripts/flash_can.py", line 563, in _check_firmware
    raise FlashError("Invalid firmware path '%s'" % (self._fw_path))
FlashError: Invalid firmware path '/home/voron/klipper/out/klipper.bin'

Try

make clean
make distclean
make menuconfig # <----- triple check your required settings
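The traceback above comes from the flash tool rejecting a firmware path that does not exist. A minimal sketch of a pre-flash check follows; the helper name `missing_artifacts` and the default list of file names are assumptions made for illustration, not part of Klipper or Katapult.

```python
from pathlib import Path


def missing_artifacts(out_dir, names=("klipper.elf", "klipper.bin")):
    """Return the expected build outputs that are absent or empty.

    `names` is an illustrative default; adjust it to whatever your
    flash method actually needs.
    """
    out = Path(out_dir)
    missing = []
    for name in names:
        path = out / name
        # A zero-byte file is treated the same as a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_artifacts("/home/voron/klipper/out")` after `make` would have returned `["klipper.bin"]` in the situation shown in the log, which points at the build configuration rather than at the flash tool.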

After uninstalling and reinstalling Klipper via KIAUH, flashing works again (maybe there was some dirt left over from switching commits, possibly my fault).
But v0.13.0-69-g5b2f8104 has the problem too. It takes longer to get a phase change (not just homing, QGL and moving Z up and down), but it does occur.

I will repeat the bisect from good 413ff19ea to bad 1cc639807 again tomorrow, when I have a little bit more time (maybe with a little “printing”).
klippy.log (157.3 KB)


Okay, thanks. Yeah, “git bisect” is an involved process, but it does help a lot. Knowing that some commits show the problem less frequently also helps.

Another test you could run would be to go back to the latest code (eg, git checkout master ; git pull ; make menuconfig ; make flash ; sudo service klipper restart), and set step_pulse_duration: 0.000000501 in all stepper config sections (the 501ns is just enough to turn off “step on both edges” optimization).
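To make the 501ns figure concrete, here is a small sketch of the arithmetic. The 500ns limit is an assumption inferred from the remark that 501ns is “just enough”; the constant and function names are hypothetical, and the real decision in Klipper may depend on more than this one value.

```python
# Assumption: the "step on both edges" optimization is only considered
# when the pulse duration is at or below this limit (inferred from the post).
ASSUMED_BOTH_EDGE_LIMIT_NS = 500


def seconds_to_ns(seconds):
    """Convert a config value in seconds to whole nanoseconds."""
    return round(seconds * 1e9)


def both_edges_possible(step_pulse_duration_s):
    """True if the duration is short enough for the assumed limit."""
    return seconds_to_ns(step_pulse_duration_s) <= ASSUMED_BOTH_EDGE_LIMIT_NS
```

Under that assumption, `both_edges_possible(0.000000501)` is false, so the suggested setting takes the optimization out of the picture for the test.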

If returning to the “git bisect” test, note that if you know 5b2f8104 is bad then you can restart the test from that point (git bisect reset, git bisect start, git bisect good 413ff19ea, git bisect bad 5b2f8104). Be sure the “step pulse duration” is not set when performing the git bisect.
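For readers new to bisecting, the following sketch shows the narrowing idea in plain Python. It is a simplified model, not what git does internally: it assumes a linear list of commits ordered oldest to newest, with the first known good and the last known bad. The function name `first_bad` and the callback `is_bad` are hypothetical.

```python
def first_bad(commits, is_bad):
    """Binary-search a linear history for the first bad commit.

    commits[0] must be known good and commits[-1] known bad.
    Returns (first_bad_commit, commits_tested_in_order).
    """
    lo, hi = 0, len(commits) - 1
    tested = []
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested.append(commits[mid])
        if is_bad(commits[mid]):
            hi = mid   # fault is at mid or earlier
        else:
            lo = mid   # fault was introduced after mid
    return commits[hi], tested
```

Each test halves the remaining range, which is why only a handful of builds are needed. It also shows why an intermittent fault is awkward: a single bad commit that happens to pass its test moves `lo` past the real culprit.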

Thanks again,
-Kevin

EDIT: Updated typo in commands.

Tiny typo :slight_smile:

Updated:

git bisect good 413ff19ea

One possible explanation for what is occurring here could be a difference in rise/fall times on the step pin and the direction pin. (The Manta M8P appears to have a level shifter on these pins that may have a faster fall time than rise time.) In that situation, it is conceivable that the stepper driver could observe a direction change before a step change, even though the code issues the step change first. The recent code optimizations to stm32h7 may have lowered the time between step and dir change to the point that this type of race is now statistically more likely.

This is just a possible theory - it could easily be something else.
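The theory can be written down as a one-line timing comparison. This is only a toy model: the function name and every number used with it are hypothetical, and nothing here was measured on a Manta M8P.

```python
def driver_sees_dir_first(gap_ns, step_delay_ns, dir_delay_ns):
    """Toy model of the step/dir race.

    gap_ns:        time between the MCU writing step and writing dir
    step_delay_ns: propagation delay of the step edge to the driver
    dir_delay_ns:  propagation delay of the dir edge to the driver
    Returns True if the dir change reaches the driver before the step.
    """
    step_arrival = step_delay_ns          # step is written at t = 0
    dir_arrival = gap_ns + dir_delay_ns   # dir is written gap_ns later
    return dir_arrival < step_arrival
```

With made-up delays of 60ns for the step edge and 10ns for the dir edge, a 100ns gap is safe, while shrinking the gap to 30ns lets the dir change arrive first. That is the sense in which faster code could make the race more likely.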

I’ve put together some testing code on the work-stepdir-20250507 branch of KevinOConnor/klipper-dev on GitHub. To test it, set step_pulse_duration: 0.000000200 in all [stepper_?] config sections and load the code with something like:

sudo service klipper stop
cd ~/klipper
git fetch https://github.com/KevinOConnor/klipper-dev work-stepdir-20250507
git checkout FETCH_HEAD
sudo service klipper start

If anyone does run this test, let us know the results (success or failure).
-Kevin

I’ve spent some time with a remote buddy trying to get a good oscilloscope shot from a Manta V2.
Here is a questionable oscilloscope capture (Fnirsi 1013d), but at least it is something.


Hope that helps a little.


FYI, see Klipper3d/klipper Pull Request #6926, “Ensure minimum time between step pin and dir pin change”.

-Kevin

Hi, it’s me again.
Since things were a bit chaotic yesterday (my fault), I approached the whole thing a bit more methodically today and performed a “dry print” (printing a test file without filament) for each klipper commit:

#########################################

git bisect test for klipper v0.13

#########################################
Start with commit: cc919a5 (bisect good)
End with commit: 89ffbbe (bisect bad)

results

#########################################

commit cc919a5

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: no

commit 89ffbbe

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: yes
type: phase change
frequency: 22 times
steppers:

  • x: no
  • y: no
  • z: yes
  • z1: no
  • z2: yes
  • z3: no

occurs during:

  • homing: no
  • QGL: yes
  • bed mesh: yes
  • printing: no

commit 7f4f696

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: yes
type: phase change
frequency: 1 time
steppers:

  • x: no
  • y: no
  • z: no
  • z1: no
  • z2: yes
  • z3: no

occurs during:

  • homing: no
  • QGL: no
  • bed mesh: yes
  • printing: no

marked as bisect bad as a precaution

commit 42faa96

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: no

commit c352617

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: yes
type: phase change
frequency: 1 time
steppers:

  • x: no
  • y: no
  • z: no
  • z1: no
  • z2: yes
  • z3: no

occurs during:

  • homing: no
  • QGL: yes
  • bed mesh: no
  • printing: no

marked as bisect bad as a precaution

commit 5d1f773

#########################################
Tested by printing without filament Test-Körper.step (print time ~1h20m).
bad events in klippy.log: no

#########################################

final bisect result

#########################################
c352617c30ea71198b332233cea0e3de1d85c4a8 is the first bad commit
commit c352617c30ea71198b332233cea0e3de1d85c4a8
Author: Kevin O'Connor <kevin@koconnor.net>
Date: Tue Apr 22 10:52:43 2025 -0400

stm32: Use enable_pclock() in stm32h7 clock_setup()

Use the helper functions to enable the peripheral clock instead of
directly manipulating the clock enable bits.

Signed-off-by: Kevin O'Connor <kevin@koconnor.net>

src/stm32/stm32h7.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

I hope that helps :slight_smile:

klippy(bisect).zip (1.6 MB)

Thanks for running these tests!

That is a surprising result. The changes in the identified commit appear mostly harmless. I’ll take a closer look at the code, but I suspect the error exists in prior revisions, but the probability of it showing is so low that it is hard to test. Some of your reports have frequency: 1 time and I fear it may have just been (bad) luck that it got caught that time.
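The “bad luck” concern can be put in rough numbers. The sketch below assumes phase-change events in a test run follow a Poisson distribution, which is a modelling assumption and not something established in this thread; the function names are hypothetical.

```python
import math


def p_clean_run(expected_events):
    """Probability that one run logs zero events (Poisson assumption)."""
    return math.exp(-expected_events)


def p_all_clean(expected_events, runs):
    """Probability that several independent runs all log zero events."""
    return p_clean_run(expected_events) ** runs
```

If a faulty commit averages one event per dry print, `p_clean_run(1.0)` is roughly 0.37, so a bit more than one run in three would look good by chance. At an average of 22 events per run the chance of a clean log is negligible, which fits the observation that only the rarely failing commits gave ambiguous bisect verdicts.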

For completeness, could you run a test of the code at Klipper3d/klipper Pull Request #6926 (“Ensure minimum time between step pin and dir pin change”)? There’s some info on testing a PR at “Testing Klipper Pull Requests”. (The code on that PR is a later evolution of the code I mentioned in post #26 of this thread.) That PR is on top of the mainline branch, so an error should show up rapidly if it is still faulty. (Be sure to “make menuconfig”, “make flash”, “sudo service klipper restart”, etc.)

Thanks again,
-Kevin

Good news:
Pull Request 6926 produces no “phase change” events (tested 2 times) :confetti_ball:
klippy( v0.13.0-84-g8cd91bbb)_round1.log (715.3 KB)
klippy( v0.13.0-84-g8cd91bbb)_round2.log (1.6 MB)

After a real print I can say that with Klipper v0.13.0-84-g8cd91bbb there is no phase change and the part is dimensionally correct now.


That’s great! Thanks for helping to fix this problem.

It is interesting that you didn’t need a step_pulse_duration. That’s also good news.

I have committed PR #6926 to the mainline code.

-Kevin

Sorry that I couldn’t be involved in testing this week. I will try out v0.13.0-84-g8cd91bbb over the weekend too and report if it addresses the issue for me too.

Upgraded to latest (v0.13.0-89-g6f87a4e68) and I can confirm that the issue is gone without applying any additional tweaks. Thank you!

Can confirm updating from v0.13.0-87-gfd55dd9e to v0.13.0-110-g1af219fad solves the issue! Built a new Voron 2.4 and spent the better part of the weekend trying to figure out what’s wrong :smiley:

Thank you for the prompt fix!! :slight_smile:
