Bug found with gcode_button

Basic Information:

Printer Model: A8 & Custom Printer
MCU / Printerboard: A8, and Octopus
klippy.log


Describe your issue:


klippy.log (82.4 KB)

I think I am seeing a bug with gcode_button, or at the very least unexpected behaviour. (I am using the buttons to fire MQTT messages and/or commands via moonraker, but am just using M118 messages in this example for simplicity.)

Taking the following simple gcode_button example (pin omitted for brevity):

[gcode_button test_button]
press_gcode: M118 Button Pressed
release_gcode: M118 Button Released

This works as expected when the printer is idle or moving (e.g. G0 X100 F100), i.e. I can toggle the pin High and Low, and the appropriate Gcode message will be displayed when the button is pushed or released.

However: if I run a “G4 P10000”, for example, and then push (and release) the button while the “Delay” is running, what happens is:

  1. Nothing happens while the G4 timer is running - no “Button Pressed” message is displayed.

  2. As soon as the timer is finished, we get a “Button Pressed” message, BUT no “Button Released”. Thus, Klipper is no longer in the correct state - it thinks the button is “pushed down still”, but it is not.

  3. If, now that the timer has finished, I now push down (and hold) the button, again nothing happens, the press_gcode is not called. I believe this is because Klipper incorrectly thinks the button is already pushed, as it never registered the earlier “Release…”. Thus, the appropriate “Button Push” actions never occur…

So, there seem to be three problems here:

  1. If you push a button while a G4 delay is running, the “Button Pushed” method is not fired until after the G4 actually finishes. (Perhaps this is by design?) If a pin goes high/low during a G4, isn’t it reasonable for those actions to run immediately? For example, let’s say the button is an Emergency Stop button…

  2. Additionally, Klipper now thinks the button is still in the “Pushed” state, but in reality it has long since been released. The release_gcode is never actually called at all, so expected functionality never occurs, i.e. IF a G4 is running, then Klipper never calls the appropriate commands once the button/pin changes state to “Released”…

  3. Additionally, if a person (or electronic sensor) now “pushes this button”, after the G4 has finished, then the press_gcode is never fired, as Klipper thinks the button is already pushed down…

Issue 3 is especially bad - because it means the physical button will not have any action IF it is pushed AFTER it was pushed during a G4 command…

If one pushes a button, and there is no G4 or any other commands running, then one would of course assume the press_gcode actions should occur…

AFAIK this is a bug in klipper, and easily reproducible - simply create a gcode_button linked to a physical button, and push it while running a G4 command.

Can anyone please advise how to proceed from here? Or confirm this is a bug?

Thank you.

#1 is a consequence of the way that gcode commands are processed. Klipper is not an interactive system. Manipulating the button enqueues the M118 command. That command can’t run until anything that’s in the queue in front of it runs. E.g. a G4 that might be running has to finish first. There is no “run my urgent GCode in parallel with whatever else is going on”.
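
To illustrate the queueing described above, here is a toy Python model. It is not Klipper's code: `run_queue` and its arguments are names made up for this sketch, and it assumes injected commands take no time and always join the back of the queue.

```python
from collections import deque


def run_queue(commands, injections):
    """Toy model of a strictly sequential command queue (hypothetical helper).

    commands:   list of (name, duration_s) already queued at time 0
    injections: list of (time_s, name) events, e.g. button presses, each
                enqueueing a zero-duration command at that time
    Returns a list of (start_time_s, name) in execution order.
    """
    queue = deque(commands)
    pending = sorted(injections)
    clock = 0.0
    log = []
    while queue or pending:
        # Everything injected up to "now" joins the back of the queue.
        while pending and pending[0][0] <= clock:
            _, name = pending.pop(0)
            queue.append((name, 0.0))
        if not queue:
            # Idle: skip ahead to the next injection.
            clock = pending[0][0]
            continue
        name, duration = queue.popleft()
        log.append((clock, name))
        # The running command is never interrupted; time just advances.
        clock += duration
    return log


log = run_queue(
    [("G4 P10000", 10.0)],
    [(2.0, "M118 Button Pressed"), (3.0, "M118 Button Released")],
)
for start, name in log:
    print(f"t={start:4.1f}s  {name}")
```

In this model both M118 commands start at t=10 s, right after the G4 ends, even though the button was pushed at t=2 s and released at t=3 s. Note that both messages still appear, in order, so the delay by itself does not explain the missing “Button Released”.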

This ‘gotcha’ comes up a lot because this is not how Marlin or RRF handle user input. They will try to get your input out to the hardware in the next step pulse if they can.

I had a swing at fixing this a while ago because I wanted to use baby-stepping while a macro that’s printing a test pattern runs. A macro is a “gcode command” for the purposes of the queue, so it has to finish before the baby-stepping commands can run. I was not successful with my experiment.

The only truly interactive thing you can do with klipper is Emergency Stop.

#2 and #3 sound more like bugs. The events should be coming back from the MCU to trigger further events. But there is some debounce code in there that you may be getting caught up in.

Looks like the same issue as in these bug reports: Macro for GPIO pin not releasing

As indicated above, the first item is by design - only one g-code command runs at a time - if new commands are issued (and gcode_button issues commands) then they are appended to a queue and that queue is run in order.

I don’t know why button releases would not be reported correctly. There was a similar report at “QUERY_BUTTON unreliable”.

-Kevin

Just something which comes to my mind:

Which MCU is used on the A8 mainboard?

An STM32, by any chance?

As someone pointed out, I experience the same problem on my (modified) Voron.
If it turns out that the problem in both cases happened on STM controllers, I might as well turn my printer upside down once again and wire the specific pin to the LPC board I also have down there. That might or might not give some insights.

then they are appended to a queue and that queue is run in order.

Kevin, a question related to this: is there a reason the system behaves that way, rather than like all the other firmwares out there? It seems less like an architectural limitation than an arbitrary choice that interactive GCodes are appended to the tail of the queue rather than the head. RRF processes things in a queue as well; it just happens to differentiate between interactive and non-interactive gcodes (i.e. anything on a console input is immediate, vs. things being spooled from a file).

Given the frequency of problems that crop up because of Klipper’s behavior in that regard, it seems at least making it an option to have interactive commands jump the queue makes sense.

Things like gcode_button have essentially no use if the button is a “well, when you get around to it, go ahead and do something”. Just like it’s weird you have to do a print-ending emergency stop if you catch something happening on the printer, vs. just pausing and fixing it. It’s such a huge problem with Klipper that only the lack of multi-MCU support in other firmware has kept me from switching back to something else.

It’s not really different from any other event queuing mechanism. The messages being sent between components in Windows, or X Windows, are also queued, but your UIs would be sluggish and unresponsive messes if interactive ones didn’t jump the queue.

My understanding is that injected gcodes (such as from a button) are appended to the head of the queue. The issue arises when the currently-running gcode command takes some time to complete, since that’s how long it will take before the next command (i.e. the injected one) is executed. Emergency stop is the only command that will immediately interrupt any running command. So if you use a button to inject a gcode while (for example) a G4 is running, that gcode won’t be executed until after the G4 has finished. The same thing will happen if a button is pressed while a macro is running. Even though a macro is a sequence of other gcodes, the command queue sees the macro as a single command so it likewise won’t be interrupted by an “injected” gcode (other than emergency stop).
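
As a rough sketch of why the head/tail question matters less than the running command, here is a toy calculation in Python. It is not based on Klipper's source: `injected_start_time` is a made-up helper, and it assumes the running command is never interrupted and queued commands run strictly one after another.

```python
def injected_start_time(running_s, waiting_s, inject_at_s, at_head):
    """Toy model: when does an injected command start? (hypothetical helper)

    running_s:   duration of the command already executing at t=0
    waiting_s:   durations of commands already waiting in the queue
    inject_at_s: time at which the button injects its command
    at_head:     True to insert at the head of the queue, False for the tail
    """
    if not 0 <= inject_at_s < running_s:
        raise ValueError("this sketch only covers injection during the running command")
    # Assumption: the running command always finishes first.
    start = running_s
    if not at_head:
        # At the tail, everything already waiting runs first as well.
        start += sum(waiting_s)
    return start


# A 10 s G4 (or a long macro) is running, two 1 s commands are waiting,
# and the button is pushed at t=2 s.
head = injected_start_time(10.0, [1.0, 1.0], 2.0, at_head=True)
tail = injected_start_time(10.0, [1.0, 1.0], 2.0, at_head=False)
print(f"head of queue: starts at t={head}s")
print(f"tail of queue: starts at t={tail}s")
```

With these numbers the injected command starts at t=10 s from the head and t=12 s from the tail. Jumping the queue only saves the time of the commands that were waiting; the 10 s of the running command is paid either way.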

Other systems do more than just put the command at the top of the queue. They actively run commands simultaneously.

E.g. babystepping runs concurrently with the current move. The Z move is applied to the next step with no accel/decel. If you don’t do this, then a calibration print that has long slow moves would complete those moves before babysteps were applied. This is not what I expect as a user of Marlin/RRF. Putting the commands at the top of the queue doesn’t fix this.

Klippy is running ahead of where the machine actually is in time, so its movement queue can actually be processed before your command is even issued. This can make klipper “feel unresponsive” when you make an adjustment during a print. How responsive klipper feels is largely down to how long the moves are that are executing at the time.

So you really need:

  • A separate user input queue that can run simultaneously with the print queue
  • Customization of many commands to run concurrently with the existing movement queue in a way that users expect (but that also doesn’t break things).
  • Possibly reaching all the way to the MCU to make changes immediately (babystepping, speed adjustments, flow multiplier, etc.).

That’s very hard to implement. I’m not saying that we shouldn’t do hard things, we should. But the fix is not simply tweaking which end of a queue the commands go into.

(I very naïvely thought I could fix this once, and I was wrong about how hard it’s going to be.)


I was able to reproduce this issue. I have put up a proposed fix in Klipper3d/klipper Pull Request #6440 on GitHub, “Fix possible ordering issue if a callback blocks in button handler”.

Thanks,
-Kevin

