Question about host command stream during homing

I have a question about what Klipper is doing during homing of an axis. I am looking at the command/response traffic during G28 X and I see something that doesn’t make sense after the endstop is triggered.

Here is the stepper configuration:

[stepper_x]
step_pin: PD7
dir_pin: PC5
enable_pin: PD6
endstop_pin: ^PC2
rotation_distance: 40
microsteps: 32
full_steps_per_rotation: 200
position_min: 0
position_max: 300
position_endstop: 0

Here is the beginning of the homing sequence:

request[104898760]: trsync_start {'oid': 4, 'report_clock': 108844428, 'report_ticks': 900000, 'expire_reason': 4}
request[104898760]: stepper_stop_on_trigger {'oid': 2, 'trsync_oid': 4}
request[104898760]: trsync_set_timeout {'oid': 4, 'clock': 111844428}
request[104898760]: endstop_home {'oid': 3, 'clock': 108844428, 'sample_ticks': 180, 'sample_count': 4, 'rest_ticks': 15001, 'pin_value': 1, 'trsync_oid': 4, 'trigger_reason': 1}
request[106094985]: queue_digital_out {'oid': 14, 'clock': 108870636, 'on_ticks': 1}
request[106094985]: set_next_step_dir {'oid': 2, 'dir': 0}
request[106094985]: queue_step {'oid': 2, 'interval': 108937719, 'count': 1, 'add': 0}
request[106679908]: queue_step {'oid': 2, 'interval': 48920, 'count': 2, 'add': -14923}
request[106679908]: queue_step {'oid': 2, 'interval': 27183, 'count': 3, 'add': -3119}
request[106679908]: queue_step {'oid': 2, 'interval': 19378, 'count': 4, 'add': -1247}
request[106679908]: queue_step {'oid': 2, 'interval': 15004, 'count': 59, 'add': 0}
request[106692665]: get_clock {}
response[106692665]: clock clock=106692665
response[107252360]: analog_in_state oid=12 next_clock=110852360 value=31296
response[107252360]: analog_in_state oid=17 next_clock=110852360 value=31296
request[107623106]: queue_step {'oid': 2, 'interval': 15001, 'count': 40, 'add': 0}
request[108227142]: queue_step {'oid': 2, 'interval': 15000, 'count': 40, 'add': 0}
request[108827770]: queue_step {'oid': 2, 'interval': 15000, 'count': 40, 'add

All of that makes sense. TRSYNC notifier is setup, the stepper is enabled, the move direction is set, and then a set of step commands are sent. Then, the MCU performs the move while the endstop is periodically checking the pin state.

When the endstop is hit, I see what I would expect:

response[589806154]: trsync_state oid=4 can_trigger=True trigger_reason=0 clock=589806154
request[590009910]: trsync_set_timeout {'oid': 4, 'clock': 592806154}
response[590099148]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=0
request[590099148]: trsync_trigger {'oid': 4, 'reason': 2}
response[590099148]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=0
request[590099148]: endstop_home {'oid': 3, 'clock': 0, 'sample_ticks': 0, 'sample_count': 0, 'rest_ticks': 0, 'pin_value': 0, 'trsync_oid': 0, 'trigger_reason': 0}
request[590099148]: trsync_trigger {'oid': 4, 'reason': 2}
response[590099148]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=0
request[590099148]: stepper_get_position {'oid': 2}
response[590099148]: stepper_position oid=2 pos=1073704062
request[590099148]: endstop_query_state {'oid': 3}
response[590099148]: endstop_state oid=3 homing=False next_clock=590050382 pin_value=True
request[590099148]: reset_step_clock {'oid': 2, 'clock': 0}

This also makes sense.

However, right after that, Klipper send another set of step commands also in the negative direction without setting up the TRSYNC.

request[590099148]: set_next_step_dir {'oid': 2, 'dir': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 594922958, 'count': 1, 'add': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 48919, 'count': 2, 'add': -14921}
request[590099148]: queue_step {'oid': 2, 'interval': 27183, 'count': 3, 'add': -3119}
request[590099148]: queue_step {'oid': 2, 'interval': 19378, 'count': 4, 'add': -1247}
request[590099148]: queue_step {'oid': 2, 'interval': 15000, 'count': 419, 'add': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 15000, 'count': 400, 'add': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 15000, 'count': 400, 'add': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 15000, 'count': 400, 'add': 0}
request[590099148]: queue_step {'oid': 2, 'interval': 15000, 'count': 400, 'add': 0}
request[590111932]: queue_step {'oid': 2, 'interval': 15000, 'count': 400, 'add': 0}

What is going on here? My expectation would be that after the endstop trigger, Klipper retracts, which would have to be in the positive direction. Yet, the next step direction is still in the negative direction. This sequence of step comands continues without Klipper checking the stepper position.

Answering my own question:

The tool that I wrote to simulate the MCU did not return the correct trsync_state responses causing Klipper to go into a funny state.

Even with the corrected responses, Klipper is still sending an infinite stream of queue_step commands. I’ve corrected my tool and now the generated responses are:

request[731405522]: queue_step {'oid': 2, 'interval': 15000, 'count': 40, 'add': 0}
request[731405522]: trsync_set_timeout {'oid': 4, 'clock': 734329498}
response[731710209]: analog_in_state oid=12 next_clock=735310209 value=31296
response[731710209]: analog_in_state oid=17 next_clock=735310209 value=31296
request[732003245]: queue_step {'oid': 2, 'interval': 15000, 'count': 40, 'add': 0}
response[732231912]: trsync_state oid=4 can_trigger=True trigger_reason=0 clock=732231912
request[732308720]: trsync_set_timeout {'oid': 4, 'clock': 735231912}
request[732601582]: queue_step {'oid': 2, 'interval': 15000, 'count': 40, 'add': 0}
response[732957668]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=732957668
request[732957668]: trsync_trigger {'oid': 4, 'reason': 2}
response[732957668]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=0
request[732957668]: endstop_home {'oid': 3, 'clock': 0, 'sample_ticks': 0, 'sample_count': 0, 'rest_ticks': 0, 'pin_value': 0, 'trsync_oid': 0, 'trigger_reason': 0}
request[732957668]: trsync_trigger {'oid': 4, 'reason': 2}
response[732957668]: trsync_state oid=4 can_trigger=False trigger_reason=1 clock=0
request[732957668]: stepper_get_position {'oid': 2}
response[732957668]: stepper_position oid=2 pos=1073693822
request[732957668]: endstop_query_state {'oid': 3}
response[732957668]: endstop_state oid=3 homing=False next_clock=732909013 pin_value=True
request[732957668]: reset_step_clock {'oid': 2, 'clock': 0}

which seem to be exactly what the MCU firmware code is also doing.

@koconnor do you mind taking a look? I am a bit stuck on my project because of this issue. Thank you!

I’m only able to answer questions about the unmodified upstream Klipper code. I can’t help once modifications to that code are made.

As you’ve described, Klipper homes by configuring the endstop and trsync to halt the stepper on a trigger, it then sends a series of stepper movements, then on a trigger event it’ll reset its internal state (stop sending new steps, calculate where the toolhead triggered and where the toolhead stopped) and then reset the stepper motor mcu state.

Cheers,
-Kevin

This is Klipper without modifications (except to print the messages being sent by the serial queue so I can see the communication).

It turned out that the reason for the issue was an incorrect stepper position value returned by the stepper_get_position command.

Now, it is trying to home but in some of the attempts, Klipper errors out because the trsync_state response returns a clock value that is past home_end_clock, despite Klipper extending the expire timeout with trsync_set_timeout:

(This is the command/response dump from the “MCU” side)

response[1111698199]: trsync_state oid=10 can_trigger=1 trigger_reason=0 clock=1111698199
request[1111749202]: trsync_set_timeout {'oid': 10, 'clock': 1114698199}
request[1111749202]: trsync_trigger {'oid': 10, 'reason': 3}
stepperZ[8] total step count for move: 4751, queued: 11474
response[1111761961]: trsync_state oid=10 can_trigger=0 trigger_reason=3 clock=1111761961
response[1111761961]: trsync_state oid=10 can_trigger=0 trigger_reason=3 clock=0
request[1111761961]: trsync_trigger {'oid': 10, 'reason': 2}
response[1111761961]: trsync_state oid=10 can_trigger=0 trigger_reason=3 clock=0
request[1111761961]: endstop_home {'oid': 9, 'clock': 0, 'sample_ticks': 0, 'sample_count': 0, 'rest_ticks': 0, 'pin_value': 0, 'trsync_oid': 0, 'trigger_reason': 0}
request[1111761961]: trsync_trigger {'oid': 10, 'reason': 2}
response[1111761961]: trsync_state oid=10 can_trigger=0 trigger_reason=3 clock=0

Now, I trying to understand how the home_end_clock is computed.