Unable to obtain 'trsync_state' response

Hello. I am using a triple z system. When I use a probe as an endstop, everything is fine. However, if I configure 3 endstops (mechanical switches) and send z_tilt_adjust, it goes to the shutdown state and raises the errors ‘!! Unable to obtain ‘trsync_state’ response’ and ‘Timer too close’. Only change in the config is the endstop configuration.

I also noticed that if I send a G1 Z5 (any number near to 0) before z_tilt_adjust it does not go to the shutdown state and succeeds the tilt. Then, I tried to use a macro which is:

[gcode_macro G30]
gcode:
G1 Z5
Z_TILT_ADJUST

It went to the shutdown state again.

Here is my log file:
klippy.log (128.1 KB)

It appears the log file you’ve attached here has been modified. There’s not much we can do without the full unmodified log file. As mentioned at Unable to obtain 'trsync_state' response · Issue #4436 · KevinOConnor/klipper · GitHub , you’ll need to recreate the issue with the pristine software and attach the full log from the event.

-Kevin

Sorry for the wrong one. It’s here:
klippy.log (89.6 KB)

I get the same issue.
The last lines in the console in fluid are:

16:56:49 // Klipper state: Ready
16:56:54 File opened:Adapter_Hemera_v2_remix_0.2mm_PLA_ENDER3_1h57m.gcode Size:5004545
16:56:54 File selected
16:56:54 // Unknown command:"M486"
16:59:19 // Klipper state: Shutdown
16:59:20 !! Unable to obtain 'trsync_state' response

klippy (5).log (1.2 MB)
This is my log file

I have done some bugfinding using git bisect to see which commit introduces the bug.
I got this response:

983951443cf14fe2985585c2c2eb20efe526411d is the first bad commit
commit 983951443cf14fe2985585c2c2eb20efe526411d
Author: Kevin O'Connor <kevin@koconnor.net>
Date:   Mon Feb 15 16:02:40 2021 -0500

    mcu: Enable trdispatch infrastructure during homing

    Separate out trsync handling from endstop handling in mcu.py code.
    Enable the low-level trdispatch C code.

    This is in preparation for multi-mcu homing.

    Signed-off-by: Kevin O'Connor <kevin@koconnor.net>

:040000 040000 09af92e0a03ad6a6046b59d73392f7ed3b1a3cd6 6168edc549087a953300ac04b5190ce4933ced39 M  klippy

The full bisect log:

# bad: [f2b4d353d859e0fc75a13b53050812516b06302c] rp2040: Add _ramfunc macro to internal.h and use in                                                                                                                                                         bootrom.c and chipid.c
# good: [c0d860487a70a05d44973d53a981e935ce7b3ad0] stepper: Remove set_tag_position() code
git bisect start 'f2b4d35' 'c0d86048'
# bad: [a0c27571dd0ff841a1464a7df05a8cdd7a2d135d] docs: Note previous firmware state issue with TMC22                                                                                                                                                        09/TMC2209 in TMC_Drivers.md
git bisect bad a0c27571dd0ff841a1464a7df05a8cdd7a2d135d
# good: [f00281d1e670e00ebcef06076abf2116c3f2d461] config: Fixes for generic-th3d-ezboard-lite-v1.2.c                                                                                                                                                        fg
git bisect good f00281d1e670e00ebcef06076abf2116c3f2d461
# good: [b6d8cf27d25faecfa47404924a757dc09e7a84a3] serialqueue: Add serialqueue_send_one() helper fun                                                                                                                                                        ction
git bisect good b6d8cf27d25faecfa47404924a757dc09e7a84a3
# bad: [d1665fae4f48519198fcd430d5fc18f17aee4e77] docs: Update Code_Overview.md with the removal of c                                                                                                                                                        alc_tag_position()
git bisect bad d1665fae4f48519198fcd430d5fc18f17aee4e77
# bad: [983951443cf14fe2985585c2c2eb20efe526411d] mcu: Enable trdispatch infrastructure during homing
git bisect bad 983951443cf14fe2985585c2c2eb20efe526411d
# good: [e1f7748e1a9dfda3e78f8a27e9972a9eb954e946] trdispatch: Support fast responses to trsync_state                                                                                                                                                         messages
git bisect good e1f7748e1a9dfda3e78f8a27e9972a9eb954e946
# first bad commit: [983951443cf14fe2985585c2c2eb20efe526411d] mcu: Enable trdispatch infrastructure                                                                                                                                                         during homing

I think I have reproduced @LordTimmeh’s claim above on my printer.

I went back (firmware and host) to 98395144 (my log files labeled “trial 6”) and it failed on pass 2, point 5 (of 7) of the Z_TILT_ADJUST.

klippy.log.reproducible-Z-tilt-trsync-error-6.log (490.7 KB)
moonraker.log.reproducible-Z-tilt-trsync-error-6.log (46.4 KB)
CPU-uptime.log.reproducible-Z-tilt-trsync-error-6.log (574.3 KB)

I then went back (firmware and host) to e1f7748e (my log files labeled “trial 7”) and it worked (full Z_TILT_ADJUST).

klippy.log.reproducible-Z-tilt-trsync-error-7.log (352.9 KB)
moonraker.log.reproducible-Z-tilt-trsync-error-7.log (24.6 KB)
CPU-uptime.log.reproducible-Z-tilt-trsync-error-7.log (1.3 MB)

I then updated the host files only to 98395144 (firmware still e1f7748e) and it failed on pass 2, point 5 (as usual).
klippy.log.reproducible-Z-tilt-trsync-error-8.log (289.9 KB)
moonraker.log.reproducible-Z-tilt-trsync-error-8.log (36.0 KB)
CPU-uptime.log.reproducible-Z-tilt-trsync-error-8.log (823.9 KB)

I performed further tests as follows:

  • Switched Pi only to e1f7748 (trial 9) - WORKED
  • Switched Pi only to 98395 (trial 10) - FAILED
  • Switched Pi only to e1f7748 (trial 11) - WORKED

I had a theory that it might be consuming some data structure, so I added an additional point to the set of Z tilt points (to see if it would fail on the 12th point probed). It did not. It failed on the same physical point (front right) which was now the 14th point probed, not the 12th. (Pass 2, point 6 of 8; it used to be Pass 2, point 5 of 7)

  • Switched Pi only to 98395, added an 8th point at 200,10 (trial 12) - FAILED on the same physical point

  • Kept only the back middle and four corner points (trial 13) - FAILED on pass 2, point 1 (back middle)

  • Kept only the back middle and front corner points (trial 14) - FAILED on pass 2, point 1 (back middle)

Now, thinking it might be a timing issue, I decided to slow the Z drive further down.

  • Set max Z velocity to 5 (from 10) and max Z acceleration to 10 (from 30) - (trial 15) - FAILED on pass 1, point 1 (that was the opposite of what I expected!)
    [I have log files from all those trials stashed away. It seems not very likely that they’d be illuminating, so I didn’t upload them.]

  • Rebooted and tried again so I could get clean log files (trial 16) - FAILED on pass 1, point 1
    klippy.log.reproducible-Z-tilt-trsync-error-16.log (150.2 KB)
    moonraker.log.reproducible-Z-tilt-trsync-error-16.log (30.7 KB)
    CPU-uptime.log.reproducible-Z-tilt-trsync-error-16.log (485.7 KB)

  • Changed microsteps to 16 (from 1) and step distance to 0.0004 (from 0.0064) - (trial 17) - FAILED on pass 1, point 1

  • Set max Z velocity to 20 and max acceleration to 50 (trial 18) - FAILED on pass 2, point 1

  • Set Z to 25 max V, 75 max A (trial 19) - FAILED on pass 2, point 1

  • Set Z to 5 max V, 10 max A (trial 20) - FAILED on pass 1, point 1

  • Set Z to 2 max V, 2 max A (trial 21) - FAILED on pass 1, point 1

  • Now tried PROBE_ACCURACY after G28 (trial 22) - 3x PROBE_ACCURACY worked. First Z_TILT_ADJUST FAILED on pass 1, point 1

  • Set Z to 10 max V, 30 max A; Rebooted (where I started the day; trial 22) - FAILED on pass 2, point 1

Reverted to e1f7748. Rebooted (trial 23) - PASSED (0.003 after two passes)

I’m going to stick on e1f7748 for a while so I can make progress on the rest of the printer, but I’m very happy and motivated to test anything relating to this topic.

Thanks. I will try to analyze the logs in the coming days. If you have a workaround (eg, git checkout e1f7748e) then I recommend sticking with that until this can be tracked down.

-Kevin

Will do. I’m definitely working/not blocked now. (TY to LordTimmey for the bisection research.)

I’m also more than happy to test any ideas, theories, or code such that will help you get to the bottom of this. (I was once competent at C/C++, now a little rusty.)

I doubt any additional log files will be helpful, but just in case, I’ve dumped all of the log files here as well, in case you benefit from compare/contrast from different runs or otherwise want a self-service way to get at them: Box

I think all the truly relevant (and “clean” where I could make them so) log files are linked in the thread above.

(And of course, Much Thanks for Klipper!)

This issue looks like it is due to a 32bit timer rollover problem caused by an omission in the new trsync homing code. Hopefully it is now fixed (commit f2421005).

Thanks.
-Kevin

A post was split to a new topic: TMC: DRV_STATUS: 40110060 s2vsb=1(LowSideShort_B!) ola=1(OpenLoad_A!)

I’ve updated to the latest klipper version (11 july 2021 origin/master).
I have the “Unable to obtain ‘trsync_state’ response” already when executing a G28 command.
Moving the motors with the STEPPER_BUZZ STEPPER command seems to be ok, see klippy.log.

In case it matters, my system is a (relatively old) laptop running debian, not a raspberry pi.
In case any debugging steps can help to identify the issue, feel free to suggest … .

klippy.log (111.8 KB)