Suggestion for improving PROBE_CALIBRATE usability

Maybe I’m just lucky. :man_shrugging:

Maybe! :slight_smile:

To be fair, i’m not saying everyone is definitely gonna dent their bed or bend their heatbreak if we don’t do something, only that it has happened too many times, when it could have been prevented (or the damage minimized at least), and it will happen again in the future.

As i described in my last comment the problem is multi-faceted, the potential for damaging hardware is just one out of several concerns.


I’ve crashed my print head into my bed more times than I can recall, but I’m not sure the blame could be attributed to me leaving this setting at -5mm.

I think just general stupidity is to blame :rofl:


Ideally it should stop and throw errors at Z0, avoiding bad crashes. With position_endstop: -5 it won’t complain until it’s gone 5mm too far. It’s not unreasonable to assume you would’ve been less unfortunate without it in your config.


TBH, most of my crashes have been due to a Bed probe mount coming loose - no setting is going to save you in that scenario.

Thanks. I think I understand your proposal now, and I think I understand the source of the earlier confusion.

You’ve described a coordinate system that is relative to the bed. However, I look at the system as having two separate coordinate systems - a physical coordinate system defined by the rails/carriages and a “virtual g-code coordinate system” that resides within it.


In the above picture, the grey squares around the outside represent the “physical coordinate system”, while the plane and red frame represent the “virtual g-code coordinate system”.

When viewed from the physical coordinate system, the g-code coordinate system is bumpy (bed_mesh), skewed (skew_correction), tilted (bed_tilt), and changing (z_thermal_adjust). Interestingly, though, if one were viewing the physical coordinate system from the perspective of the g-code coordinate system, then the g-code coordinate system would seem perfectly rectangular while the physical coordinate system would appear bumpy, skewed, tilted, and changing.

For what it is worth, your earlier messages were describing goals relative to the g-code coordinate system using terminology I associate with the physical coordinate system, and that had me confused.

In any case, I have some high-level suggestions:

  1. If you pursue this work, I recommend you try to find a way to accomplish your goals without changing the kinematic specific code. It’ll be very challenging to implement changes to all the different kinematics and then test them all with two different types of homing hardware. It would also make it harder to add new kinematics in the future (as it is already a challenge to add new kinematics, and testing two different types of homing hardware would add to that challenge).
  2. Consider that if your current proposal was implemented and a user issued a G1 Z-2 they would still have a high chance of damaging their bed/nozzle. This may seem odd when viewing the system from the “g-code coordinate system” perspective. However, I think the reasons are a little more clear when thinking in the physical coordinate system. Consider the case of a tilted bed where one side is 3mm lower than the other and a bed_mesh has been used to compensate for that. Should a user traverse to the low side and issue a G1 Z0 then the bed_mesh code could translate that to a physical z=0. Should the user then traverse along the XY axis to the high side of the bed, the bed_mesh code will translate that to a physical z=3. Should the user then issue a G1 Z-2, the bed_mesh will translate that to a physical z=1, it will pass the proposed position_min checks, and move the nozzle 2mm into the bed.
  3. As a suggestion, if you want to protect the nozzle/bed, then it may be necessary to implement those checks on a coordinate system that is relative to the bed. It seems challenging to me to implement the checks on a coordinate system that is relative to the frame. The kinematics code and the calibration tools (eg, BED_MESH_CALIBRATE, TEST_RESONANCES, QUAD_GANTRY_LEVEL) are all implemented on the “physical coordinate system”. I’d argue that these tools really need to be implemented using the physical coordinate system.
  4. Consider that it may ultimately be more productive to alert designers and “power users” to the “two coordinate systems” and that maybe we need to do a better job distinguishing between them. Sure, it can be super confusing for a casual user, but ultimately we may want to strive for a state where casual users don’t need to modify their printer.cfg.
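
To make the arithmetic in point 2 concrete, here is a minimal Python sketch. It is not Klipper code: `mesh_offset`, `to_physical_z`, and `passes_position_min` are hypothetical helpers, the 200mm bed width is an assumed value, and the mesh is reduced to a linear tilt that rises 3mm from the low side to the high side.

```python
BED_WIDTH = 200.0  # assumed bed size in mm, for illustration only
TILT = 3.0         # high side sits 3mm above the low side


def mesh_offset(x):
    """Hypothetical mesh: bed surface height (physical Z) at position x."""
    return TILT * x / BED_WIDTH


def to_physical_z(gcode_z, x):
    """Translate a g-code Z into a physical Z, as a mesh transform would."""
    return gcode_z + mesh_offset(x)


def passes_position_min(physical_z, position_min=0.0):
    """A range check applied in the physical coordinate system."""
    return physical_z >= position_min


low_side = to_physical_z(0.0, x=0.0)           # G1 Z0 on the low side
high_side = to_physical_z(0.0, x=BED_WIDTH)    # same g-code Z, high side
plunge = to_physical_z(-2.0, x=BED_WIDTH)      # G1 Z-2 on the high side

# How far below the local bed surface the nozzle is being asked to go.
depth_into_bed = mesh_offset(BED_WIDTH) - plunge

print(low_side, high_side, plunge)
print(passes_position_min(plunge), depth_into_bed)
```

In this toy model the G1 Z-2 lands at physical z=1, which is above a position_min of 0, so the check lets it through even though the target is 2mm below the bed surface at that spot.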

Finally, as a request, it would help me if you could give some further context on the failure cases. So far we’ve used G1 Z-5 as a failure scenario, but I’ve never seen a user accidentally issue a command like that. So, it would help me if I could better understand the real world scenarios that are leading to collisions. I also don’t understand the references to position_min: -5 - that seems to me to be an attempt to “completely disable the safety check”. So, sure, if one completely disables the check then one may then need to reenable it, but perhaps we’d be better off not recommending they completely disable it in the first place? So, what situations are requiring such an extreme 5mm delta on the Z carriage (relative to the z endstop)?

Cheers,
-Kevin


Yes, that’s how i look at it too, but i’m trying to explain what it looks like to the user. You and i know that position_min/max/endstop define the physical limits, that the virtual g-code coordinate system runs within those boundaries, and that you can translate (SET_GCODE_OFFSET) and transform how the virtual coordinate system maps to the physical one by using modules such as bed_mesh, skew, z_thermal_adjust, and the like. I.e. the virtual coordinate system is a runtime mutable transform.

Explaining the basic concept of virtual vs physical to the user is not difficult; the hard part is all the “side effects” associated with it. It’s not always transparent to the user how the virtual system is offset and/or transformed relative to the physical coordinate system, which enforces the limits. It’s not immediately apparent when, why and how the boundary between the physical and virtual systems is crossed. They’re mixed more often than not, in both directions (z_offset is the simplest example).

When you’re using a probe as a virtual endstop for an axis, the physical boundaries should be derived from the virtual one, until we have a complete physical representation of the system. Currently that’s not the case. The user is required to describe the physical system first, even though that’s not actually possible to do in any meaningful way (hence the position_min: -5 we keep coming back to). Without knowing the distance between the probe’s trigger point and the nozzle (z_offset) you don’t know the position_endstop, and thus position_min is not an accurate representation of anything. All we really know, and the most accurate definition we can give of the physical coordinate system, is that position_min is somewhere between negative infinity and position_max. Until the z_offset has been determined - physically - and “saved” (apropos crossing the boundary between virtual and physical), position_min and position_endstop are in actuality unknown. That is the crux of the issue, in technical terms: a chicken and egg kind of situation.

For what it is worth, your earlier messages were describing goals relative to the g-code coordinate system using terminology I associate with the physical coordinate system, and that had me confused.

I understand, there’s just no position_min / position_max equivalent in the g-code coordinate system, what would you use instead?

Btw i think this part

if one were viewing the physical coordinate system from the perspective of the g-code coordinate system, then the g-code coordinate system would seem perfectly rectangular while the physical coordinate system would appear bumpy, skewed, tilted, and changing.

Is a good way to think of the user’s perspective - and why many of them get frustrated, as they don’t understand how the two relate to each other. It’s a pretty complicated, opaque mathematical model requiring multiple variables on a calibrated printer. I think going forward, thinking of the user as living in “gcode-land” could be helpful.

  1. Good point.
  2. I don’t understand. These situations aren’t affected by my proposed changes, are they? My proposal disables the physical boundary check at the endstop side of a single axis, exclusively during probe calibration, and only if that probe is used as a virtual endstop on that axis. Is there a mistake in my proposal somewhere that does not confine it to that specific situation? Does it help to define a flag that gets set when PROBE_CALIBRATE and ACCEPT are called? I must be missing something.
    I’m also confused how the scenario you’re describing is different from how the system works today. Except that G1 Z-2 wouldn’t be allowed because the user forgot an arbitrary position_min: -5 in his config that we forced them to add, for the only purpose of running a calibration routine, it would have to be explicitly set for another reason - for example to get bed_mesh to work (ie, the minimum Z value of the bed mesh has to be greater than position_min).
  3. I agree with your sentiments, but it’s also far outside the scope of issues i’m trying to solve. I’m not trying to prevent hardware damage in general - i’m trying to remove one source of it (the requirement of setting an arbitrary position_min that conflicts with the physical properties of the printer, to run PROBE_CALIBRATE), while simultaneously improving the user experience of PROBE_CALIBRATE.
  4. I agree with the sentiments here, we absolutely should look into what can be done. However, it doesn’t actually solve any of the points i laid out in my description of “the problem” in my post.
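
As a rough illustration of how narrowly the exemption in my point 2 is meant to be scoped, here is a Python sketch. `ZLimits`, its `calibrating` flag, and `allowed` are hypothetical names made up for this post, not existing Klipper structures, and the real kinematics code is organised differently.

```python
from dataclasses import dataclass


@dataclass
class ZLimits:
    """Hypothetical model of the Z axis limits; not a Klipper class."""
    position_min: float
    position_max: float
    probe_is_virtual_endstop: bool
    # Assumed flag: true only for the duration of a probe calibration.
    calibrating: bool = False

    def check_move(self, z: float) -> None:
        # The lower bound is skipped only while calibrating a probe that
        # also acts as the virtual endstop for this axis.
        exempt = self.calibrating and self.probe_is_virtual_endstop
        if z < self.position_min and not exempt:
            raise ValueError(f"Z={z} is below position_min={self.position_min}")
        # The upper bound is always enforced.
        if z > self.position_max:
            raise ValueError(f"Z={z} is above position_max={self.position_max}")


def allowed(limits: ZLimits, z: float) -> bool:
    """Convenience wrapper: True if the move passes the checks."""
    try:
        limits.check_move(z)
        return True
    except ValueError:
        return False


normal = ZLimits(0.0, 250.0, probe_is_virtual_endstop=True)
during_cal = ZLimits(0.0, 250.0, probe_is_virtual_endstop=True, calibrating=True)
real_endstop = ZLimits(0.0, 250.0, probe_is_virtual_endstop=False, calibrating=True)

print(allowed(normal, -1.0), allowed(during_cal, -1.0), allowed(real_endstop, -1.0))
```

Outside calibration, and on printers with a real Z endstop, the check behaves exactly as it would with position_min: 0.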

So far we’ve used G1 Z-5 as a failure scenario, but I’ve never seen a user accidentally issue a command like that.

Doesn’t have to be directly user initiated, any macro that moves relatively can cause this, any macro that relies on position_min as a variable can cause it. Clicking one too many times on the z button in a frontend can cause it. Forcing an incorrect definition of the physical properties of the printer leaves any option open really - you’re effectively disabling the kinematic check. G1 Z-1 may be enough in itself to cause damage. Consider an all metal toolhead, unsupported heatbreak and a very flat and rigid bed.
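
The frontend-jog case can be sketched in a few lines of Python. `jog_z` and `jog_allowed` are hypothetical helpers that mimic a range check against position_min; this is not how Klipper implements the check, and the numbers are only illustrative.

```python
def jog_z(current_z, delta, position_min):
    """Return the new Z after a relative jog, or raise if it leaves the range."""
    target = current_z + delta
    if target < position_min:
        raise ValueError(
            f"Move out of range: Z={target} < position_min={position_min}"
        )
    return target


def jog_allowed(current_z, delta, position_min):
    """True if the relative jog would be accepted by the range check."""
    try:
        jog_z(current_z, delta, position_min)
        return True
    except ValueError:
        return False


# Nozzle sits 0.5mm above the bed and the user clicks "-1mm" once too often.
with_placeholder_min = jog_allowed(0.5, -1.0, position_min=-5.0)
with_accurate_min = jog_allowed(0.5, -1.0, position_min=0.0)
print(with_placeholder_min, with_accurate_min)
```

With the placeholder position_min: -5 the move to Z=-0.5 is accepted; with position_min: 0 it is rejected before anything moves.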

that seems to me to be an attempt to “completely disable the safety check”

Bingo! Yet PROBE_CALIBRATE requires it, unless you want to run it 4 times on some probe setups (euclid on EVA3 is one example).

I also don’t understand the references to position_min: -5

It’s the current default i use because it encompasses the z_offset of both bltouch and 8-12mm inductive probes. However it is not enough to calibrate a euclid or a klicky probe on the toolhead designs i’ve seen - there may be some out there where it’s not an issue. Default z_offset is zero (assuming anything else is dangerous). PROBE_CALIBRATE should be used to find the truth. Once z_offset is calibrated, ideally position_min should be set to zero (but we can’t for other reasons - bed_mesh and so on, it could however be set lower - but the klipper docs don’t tell us to). You can read it as position_min: -2 since that’s what’s in the docs.

Do you see how the boundary between physical and virtual becomes fuzzy as it relates to a non-flat surface (that you inherently think of as physical) and an undetermined/intangible virtual endstop position?

So, sure, if one completely disables the check then one may then need to reenable it, but perhaps we’d be better off not recommending they completely disable it in the first place?

Completely agree, hence my proposal! :smiley: to be absolutely clear, i never once suggested disabling the check in general (only in the very situation where it’s invalid in the first place - PROBE_CALIBRATE on a virtual endstop), i do however claim that it’s effectively the result of what the Klipper docs are currently doing:

So, what situations are requiring such an extreme 5mm delta on the Z carriage (relative to the z endstop)?

I think this is why we’re talking past each other a bit. There is no z endstop. There’s only an uncalibrated probe. There is nothing physically determined to relate to. That’s the specific case i’m talking about (i haven’t personally dealt with printers that function differently for the past 4 years).

I’d like to comment as a user. The only time I have damaged my new Xmax-3 was when I was distracted by fixing some other error in OpenSCAD and left the bed plate on the kitchen sink, cooling after I removed a failed first layer. After fixing the code, I did all the other steps to restart a new print with the same name, which means clearing the “job” listing AND the “.cache” listing to ensure it uses the newly sliced version. I then restarted the print, forgetting the bed plate was still cooling on the sink. The initial G28 then drives the nozzle into the magnetic sheet, raising a small mountain I have to shave off before I can get the next valid bed mesh.
I would propose that, if a previous bed mesh is available, the minimum reached by that previous mesh be used as the effective limit for the next PROBE_CALIBRATE, possibly with an additional 0.05. With the automatic motor powerdown now commonly used, that might not be possible. That could be solved by stepper/servo motors, as they can be left powered up while generating only 5% of their run heat when sitting idle… That works very well: I’m doing that on my CNC machines in the garage, and I often can’t tell that a motor is powered by checking its temperature until I try to move an axis by hand and cannot. This potential loss of home can be disabled by disconnecting the drivers’ enable line, since the default is on.
This also prevents the dreaded layer shift, as those drivers will hit the motor with everything the PSU has, for about 0.001 seconds, if they hit an obstruction. If that occurs, the driver shuts the motor off and notifies LinuxCNC, which stops everything it tracks within the next 0.001 seconds by removing all motor power. That power-down reset is a hard requirement for re-enabling these drivers. This is well tested with LinuxCNC, but in 2 years it has not actually happened while running g-code. I’m mid-process of doing this to the XYs of 2 other, even bigger printers right now. Thanks for reading this far. Take care, stay warm, dry and well.
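
The mesh-derived limit proposed in the post above could be sketched like this in Python. Everything here is an assumption for illustration: `effective_z_min` is a hypothetical helper, the mesh is represented as nested lists of probed Z values, the 0.05 is read as millimetres, and the margin is applied downward (allowing travel slightly below the lowest probed point).

```python
def effective_z_min(mesh_points, margin=0.05):
    """Lowest probed Z in a previous mesh, lowered by a small margin.

    mesh_points: rows of probed Z heights (assumed to be in mm).
    margin: extra travel allowed below the lowest point (assumed meaning).
    """
    lowest = min(min(row) for row in mesh_points)
    return lowest - margin


# Made-up mesh values, not measurements from a real printer.
previous_mesh = [
    [0.10, 0.02, 0.05],
    [0.04, -0.03, 0.01],
]

limit = effective_z_min(previous_mesh)
print(round(limit, 3))
```

Such a limit would only be trustworthy while the Z position that the mesh was measured against is still valid, which is the concern about motor powerdown raised in the same post.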

I agree that is a challenge. But, maybe it would be simpler if we tell users to leave position_min: 0 and start with a z_offset: 25? That is, if we are homing with the probe, we don’t know the z_offset, we can’t find the z_offset with position_min=0 and z_offset=0, then we can temporarily increase z_offset so that we can descend enough to determine the actual z_offset?

Okay - I missed that. I did not realize you were only interested in this very limited use case.

-Kevin

I agree that is a challenge. But, maybe it would be simpler if we tell users to leave position_min: 0 and start with a z_offset: 25 ? That is, if we are homing with the probe, we don’t know the z_offset, we can’t find the z_offset with position_min=0 and z_offset=0, then we can temporarily increase z_offset so that we can descend enough to determine the actual z_offset?

I’d prefer it wasn’t required to edit the config and add pseudo-values at all to run PROBE_CALIBRATE in the first place. z_offset: 25 would crash during homing with any homing routine that uses a z_hop where the sum of z_hop and the “real” z_offset is less than 25. Seems to me like it just moves the problem to a different variable? I do agree however that it’s better than the current situation, since it’s temporary and would be resolved after a successful PROBE_CALIBRATE.
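
The arithmetic behind that crash, as a simplified Python sketch. It assumes that after homing on the probe the firmware believes Z equals the configured z_offset while the nozzle really sits at the true offset above the bed, and that the homing macro then moves to an absolute Z equal to its z_hop. `height_after_absolute_move` is a hypothetical helper, and the 2mm true offset and 10mm z_hop are made-up figures.

```python
def height_after_absolute_move(target_z, configured_z_offset, real_z_offset):
    """Real nozzle height above the bed after an absolute move to target_z.

    After homing, the believed Z is configured_z_offset while the real
    height is real_z_offset, so every absolute target is off by the
    difference between the two.
    """
    error = configured_z_offset - real_z_offset
    return target_z - error


z_hop = 10.0        # assumed absolute Z the homing macro moves to
real_offset = 2.0   # assumed true trigger-to-nozzle distance

# Placeholder z_offset: 25 -> 10 + 2 < 25, so the nozzle is sent below the bed.
with_placeholder = height_after_absolute_move(z_hop, 25.0, real_offset)

# A guess only a few mm too high leaves the nozzle above the bed.
with_close_guess = height_after_absolute_move(z_hop, 5.0, real_offset)

print(with_placeholder, with_close_guess)
```

A negative result means the commanded position is below the bed surface, which matches the condition that z_hop plus the real z_offset is less than the configured value.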

Okay - I missed that. I did not realize you were only interested in this very limited use case.

I’m very interested in the broader discussion of the effects of the physical vs virtual coordinate systems, it’s just a much more complicated domain with no simple “solution”. But i’ll be happy to chime in, in a thread dedicated to that. I thought this one was a nicely contained problem with a confined scope that wouldn’t have any impact on the broader operation of klipper - easy to implement - easy to review, it’s a common cause of confusion/frustration for new users since it’s kind of a mandatory process when provisioning modern printers.

I understand. For what it is worth, a user would have to put something in z_offset in order to start Klipper. Admittedly, a value of 25 is too high. I also understand your point about homing macros that move to an absolute z position after a z home. If the user roughly knows the Z position, I guess they could increase the z_offset just a few mm. I do understand the concerns.

I’m not sure I have any good alternate suggestions. I’d recommend against changing all the low-level kinematic code (due to high dev/testing cost and maintenance). It’d certainly be preferable to concentrate this logic into a module (either PROBE_CALIBRATE itself or some new Z_PROBE_AS_ENDSTOP_CALIBRATE tool) - though admittedly it looks challenging to keep the low-level kinematic code unaware of this special case. I suppose tool code could be written to use the internal equivalent of SET_KINEMATIC_POSITION to bypass the kinematic checks on descending, but this won’t work correctly on rotary_delta and cable_winch kinematics (where Z height changes would impact XY positions as well).

Not sure.

-Kevin

Noted, i get the concerns about the low level kinematic changes, if i think of something i’ll post here. Won’t pursue this further for now. Thank you for the feedback, Kevin!

I have a mark in my bed where i was at z=8 and fat-fingered the -Z travel when the axis move increment was set to 10. This was all at the bed center and z-home position.

Sure, there are arguments that klipper can’t know everything and that the number of kinematic unknowns is very high. But i’d argue at the very least it should not be possible for a user to command a z crash after Z is homed (in the area where z has been homed). By extension, if you have a z-home and a valid bed mesh it should also not be possible to have a z crash anywhere within the bed mesh.


I agree, that should be the goal. It will most likely take some substantial changes to get there though. Maybe we should start a thread about the low level physical vs virtual coordinate systems and boundaries and get that discussion started, it’s gonna be a long one!

If no one else has, i will start a thread once i’m on the other side of the current RatOS sprint, i have a few thoughts. Then we can discuss what we’d like the future to look like from high to low level and how we might possibly get there.

Just as a point to consider: Kevin’s assessment that it is a very limited use case is IMO not the real world we work in. Few if any of those who have improved their machines, either with prox switches or bltouch and its clones, ever leave the z home switch active; in fact I have subbed a prox switch for the mini-micro the bed hits. So my view is that this isn’t very limited, but probably pretty universally done.

If it is left hooked up, it should be treated as a limit switch to prevent bed damage when the flex plate is still on the kitchen sink when the next print is started. I’ll plead guilty to that and have had to very carefully dress the dimple in the mag sheet to make it flat again.

My $0.02. Take care & stay well all.

Perfect example of why this needs to be fixed (timestamped): The 3D printer with no belts: The Peopoly Magneto X uses closed-loop linear motors! (youtube.com)

Peopoly’s default position_min on Z is -35 (source). God knows why it’s that low, but regardless, they should never have had to set it below 0 (or lowest expected value in bed_mesh range) in the first place.
