Load Cell Probing Algorithm Testing

leadinglights · August 14, 2023, 10:11am

@D4SK I am very interested in your results and would like to know more. In particular, I would like to see an indication of the scales on your graphs - how many milliseconds for the X axis and how much deflection/load/ADC count for the Y axis.

I have put a picture of a set of my own plots of three underbed sensors below. There are seven plots superimposed on a representation of my print bed (hexagon pattern) Each plot is placed approximately over the point of nozzle contact where the respective plot was recorded.

A particular point to note is that, for any value of Y (ADC Count) there can be peaks that may be hit or may be missed, and that this in turn will give a poor bed mesh.

It was this test that caused me to abandon underbed sensing as having feet of clay - I had previously regarded underbed piezo sensing as “best in class”

I can’t agree with you about the print bed acting like a mass between two springs consisting of the print bed mounts. In almost any case, the springiness is mostly in the print bed itself. Where there are three underbed mount/sensors on a square bed the unsupported corners will have substantial movement.

In the drawing below, the total compliance of all of the parts from the nozzle to the sensor (piezo disk) is much less than the compliance of the plate (6mm 6082 T6 aluminium in my case) One noteworthy point in this drawing is that the heavier and more rigid the plate is, the more pronounced is the effect of the centre of mass - and the greater the cancellation of the two piezo voltages.

A bit more info on this at Report on under bed sensor problems

EDIT: I have changed the first diagram as it was too poor a resolution to read the text

Mike

leadinglights · August 14, 2023, 10:23am

@garethky, Quote

I don’t have an Oracle to tell me the exact contact time.

I have used a clean nozzle or dummy nozzle and a contact plate fixed firmly to the bed. These should be wiped before each contact and can be silver-plated to get the best consistency. Even with a piece of 0.4mm copperclad FR4 and a brass dummy nozzle, I could get a repeatable contact point to well better than 1 micron.

At the risk of being chastised by the forum bot for repeating a link, you can find more detail at Report on under bed sensor problems

Mike

D4SK · August 14, 2023, 1:32pm

@leadinglights

I guess my point about the inertia of the printbed mostly applies when probing in the center, and not for all printers. I think there might be other factors contributin nonlinear behaviour, but it explains the sharp elbow that garethky is seeing on his printer.

The zoomed in plot shows seconds in the x axis, I think the deflection is 0.3-0.5 mm, 1.8mm/s.

I think for under bed sensors load cells, and relatively slow probing will work, but piezos may be much more tricky.

@garethky anything below 10um deviation will be perfect for real world printing, I think its just important that it is robust, and works with a bit of molten filament on the nozzle as well. Generally it’d be good to use as many samples as possible so it works with noisier setups as well.

Using the retraction graph as well could be nice although I have seen a slight drift after the load cell has been under tension. Maybe this won’t affect repeatability if the retraction move starts at a consistent time.

garethky · August 15, 2023, 5:26am

I see that level of precision in some results, but not consistently. My worst results are 35um. On my printer we should better than 10um.

I would be very happy if we could detect that and just call a script you configure to clean the nozzle. I haven’t figured out how yet. But I was thinking about the elbow finder tonight and using the typical time from 50g back to the collision to try and guess if an elbow is just extraordinary noise or a partial reading of a collision. If I knew the collision was good, then I could exclude points that are too far away from the trigger time. But maybe also if the elbow is more than 2 samples away from the expected location of the elbow given the time it crosses the 50 gram mark we could say that the collision is rejectable as too soft?

I have an idea of how to extract some better data from the 50 sample set but its complicated enough that I cant write it down in code just yet.

A hint came when I subtracted 1 from all of the elbow index values I hand selected and plotted the force at the elbow:

You would expect that to be a flat plot, because the point before the elbow should contain no force, just noise. But that plot has a clear rise at the end. So I’ve selected some elbows that are too late. Maybe the last 13 or so? So if I could “roll” the set to the left on the timeline I’d probably have a better picture of the true Z coordinate. Question is, could I do that automatically?

garethky · August 17, 2023, 8:11am

I had an error in my math for calculating absolute z positions. This resulted in things looking much worse than they actually were. I was trying to work out what the absolute Z position of a particular time was with:

    delta_time = trigger_time - t0
    return trigger_z - (delta_time * speed)

It should have been this:

    delta_time = trigger_time - t0
    return trigger_z + (delta_time * speed)

t0 is before trigger time, resulting in a positive value for the delta. The z value decreases as you go right on the time axis. i.e. as the machine is moving towards z=0 as it probes. We want to move away from z=0, so + not -.

I figured this out because I worked on a new way to visualize whats going on in all 50 probes simultaneously. When I saw the output I couldn’t make any sense of how tightly correlated the collision time estimates were. Now that the math is correct the discontinuities also disappeared:

Screenshot 2023-08-17 at 12.31.00 AM

The dot markers are the selected elbow points, these were selected by Kneedle.
The | markers are the selected collision time, using the force ratio algorithm.
The x marks the sample that triggered.
Each plot is shifted downwards so that they don’t all sit on top of each other. So force axis is force, its just that the absolute values are not true.
The plots are aligned on the x axis such that time 0.000 is the same Z coordinate for all of them. So they are correlated in Z space.
The probes are sorted by the z coordinate of the elbow. Farthest to the right being the smallest value to largest at the bottom.

This graph looks correct to me. The fact that the | marks are more or less aligned vertically is evidence that the algorithm is working.

To get this on the machine you would have to train it with 50 probes so it learns the range of force expected at the elbow. This is only 3x worse than the switch probe.

I need to move to testing this with higher sample rates to see if these results still hold up or if we get round elbows and fuzzy results.

need sleep , have work in the AM.

garethky · August 18, 2023, 9:58pm

I collected 50 samples at 2mm/s / 400SPS:

The last 3 probes in the batch have badly selected elbows. Without those its just as repeatable as the switch probe, 0.001mm. Some more work on elbow selection should pay off.

This technique is so effective I’m ready to give up on extracting any information from after the elbow. (I checked with linear regression and it actually gets worse with more data points: 0.007 @ 5mm/s to 0.01mm @ 2mm/s)

I have so many follow up questions and things to try. I’d like to validate this on the printer and see how it works at different bed locations.

garethky · August 19, 2023, 6:11pm

One thing that concerns me is the force range up to the elbow varying significantly across the bed. I could be over-fit to one point on the bed. So I need to try this at 9 locations across the bed (corners, edges, center) to see if it’s consistent.

garethky · August 21, 2023, 1:41am

Well, its not as consistent as you might wish for. The timing for the reporting of the triggering sample seems to be better overall. Max force changer over the 9 test points varies from 11g to 18g.

Here is a case were it seems to improved a particularly messy set of data:
// probe accuracy results: maximum 1.519965, minimum 1.507569, range 0.012396, average 1.515408, median 1.516719, standard deviation 0.003436
I brought that down to 0.00345mm of range:
Screenshot 2023-08-20 at 6.05.32 PM

But in the average case its added 0.001mm to the range. In the worst case it added 0.006.

Here is a git gist: probes_2mm_400sps.ipynb · GitHub

garethky · August 22, 2023, 6:54pm

I did a deep dive on what Prusa is doing in InterpolateFinalZCoordinate:

    float InterpolateFinalZCoordinate(Features &features) {
        float zDecompressionEnd = features.riseLine.GetY(features.decompressionEndTime);
        float loadAtDecompressionEnd = features.decompressionLine.GetY(features.decompressionEndTime);
        float middleTimestamp = features.decompressionLine.GetTime(loadAtDecompressionEnd - 120 + 70);
        float zDecompressionMiddle = features.riseLine.GetY(middleTimestamp);
        return (zDecompressionEnd + zDecompressionMiddle) / 2;
    }

They are working out the time at the elbow and the time at -50g from the elbow and averaging the two. The elbow time is taken directly from the elbow sample. The -50g time is taken from a linear regression line that has all points before the elbow. This seems weird, they have the linear regression line but they are not solving for y=0. Maybe they have overshoot as the toolhead lifts up and they are compensating for this? Maybe this is some sort of extra “squish” factor? In my testing it seems to move the collision time by about 0.02-0.03mm. It also slightly improves the range over linear regression to y=0.

Whats more interesting is the dedicated move they have for collecting the riseLine/decompressionLine data. Its only 0.09mm long at 0.33mm/s. I think you can see it in some of the videos, the toolhead looks like it pauses and I thought it was a hiccup in their movement queue. But this is where they are waiting for data from the toolhead boards and doing all the computation causing the pause. The move takes 270ms and the pause is another 140ms on the MK4, 20ms longer for canbus communication delay on the XL. No one is going to sit still for their printer to do an entire probe at 0.33mm/s. But because this move is short its hardly noticeable.

0.33mm/s / 320sps = 0.001mm per sample of resolution

I’m going to try and replicate this in klipper. I’m just going to hard code the move into the endstop for now. If that works well and the data I collect improves things we can discuss where the movement code should go (homing.py?). I can pick the movement speed such that it always results in 0.001mm per sample with any sensor chip. On the HX711 you would be looking at 0.08mm/s and a glacially slow 1.1s move time. The 0.09mm is going to be related to the overshoot. Its going to be less at 2mm/s vs 10mm/s. Maybe there is a way to scale that length value based on the homing speed to save some time.

Its possible this technique would improve switch probes as well. You could just invert the trigger condition of the endstop and do a slow move back until it triggered. Other probe technology might not benefit due to hysteresis.

D4SK · August 23, 2023, 12:41am

Maybe they measured a -50g force/deflection during printing of the first layer

garethky · August 23, 2023, 1:42am

Interesting idea. So this is compensating for plastic pushing back on the nozzle on layer 1?

For them -force is a collision with the bed and +force is extrusion force. The average of the elbow and -50g should be -25g.

Now that I know what I’m supposed to be looking at I can see it moving slower on the way back up. I don’t see any real overshoot.

I count maybe 8 samples between -50g and a positive reading, so -25g would put the average at about -0.004mm.

garethky · August 23, 2023, 4:53am

My hack was not righteous, when I call toolhead.wait_moves() it throws an exception.

b'stepcompress o=18 i=0 c=2 a=0: Invalid sequence'
b'stepcompress o=18 i=0 c=2 a=0: Invalid sequence'
b'stepcompress o=18 i=0 c=2 a=0: Invalid sequence'

Looks like the stepcompress code is angry because (!move.interval && !move.add && move.count > 1), the move interval is 0, move.add is 0 and move count is 2.

I’m doing this:

toolhead.manual_move([None, None, 0.09, None], 24.)
toolhead.wait_moves()

Which seems sane? G1 z0.09 F24 certainly works just fine. My guess is this must somehow be due to drip_moves()? That or I have to call toolhead.set_position() before the next move? There be

D4SK · August 23, 2023, 8:44pm

Great find, using the retraction really seems like the way to go. There is no collision dynamics, and any filament blob is already compressed.

garethky · August 23, 2023, 9:56pm

Success! I had to put the code in homing.py after it sets the current toolhead position to haltpos.

Weirdly similar to their plot

Screenshot 2023-08-23 at 2.30.14 PM

You see that artifact with the red circle around it, looks like the force levels off for 1 sample? Its in their plot!!?? Its in all of mine too! I have no idea what that is.

edit: pretty sure its an artifact of the 60Hz noise. They used a order 1 filter to try and eliminate their 50/60Hz power line noise. But I have access to more powerful filtering in SciPy, specifically a Notch filter and bi-directional filtering:

I tried to get this notch filter stuff to work before and had no luck. You can see over on the left its not as effective and that’s all I ever saw.

garethky · August 25, 2023, 9:17pm

I think I finally understand this filtering stuff enough to get a usable results. Before now I had failed to get results that improved the situation and I decided to come back to it later.

Here is what I have settled on, it generates two notch filters and combines them so they can be applied in one filtering operation:

import scipy.signal as signal

# filter design for 50Hz and 60Hz power noise
sample_frequency = 400
quality = 2.
uk_power = 50.0
us_power = 60.0
b1, a1 = signal.iirnotch(uk_power, quality, sample_frequency)
b2, a2 = signal.iirnotch(us_power, quality, sample_frequency)
b = signal.convolve(b1, b2)
a = signal.convolve(a1, a2)
power_notch_filter = signal.tf2sos(b, a)

# filter application
filtered_force = signal.sosfiltfilt(power_notch_filter, force)

Results:

This is the best tradeoff that I have found so far. It has enough filtering power to convincingly eliminate the power noise, but it creates the smallest distortions at the elbows. It covers 50Hz and 60Hz power so the user doesn’t have to tell us what the power frequency is in their country. Using filtfilt does two passes on the signal and eliminates the shift on the time axis usually associated with filtering. This does add a small distortion at the initial collision but none is apparent at the elbow of the pullback move.

The key to good results was setting the quality to a low value. The lower the value the wider the bandwidth of the notch. We have data that’s not captured at exactly 400Hz (because its polled) and the power frequency can fluctuate slightly. So our data is messy and a wider notch does a more consistent job. Quality = 1.0 produces noticeable time delays at the elbows, but 2.0 seems to eliminate this. Higher settings add more wobble before the initial collision.

I tried a bandstop Butterworth filter as an aternative but it creates larger distortions at the elbows.

Here is a git gist with example graphs of all of the things I have tried: Filter Investigation.ipynb · GitHub

Python/Scipy FTW

Interestingly, if you try the Butterworth highpass filter settings that are in the comment in Prusa’s code you get nothing like their actual published plots. It totally filters out the collision because its at such a low frequency. It doesn’t do a good job of filtering power noise either. Maybe the C implementation works differently? This was one of the things that had me really lost originally:

garethky · August 28, 2023, 5:29am

I’m stumped. The pullback move works but the position reported for the toolhead after it retracts keeps increasing. It goes up by 0.03 to 0.04. I didn’t think that was possible because the move happens after toolhead.set_position(haltpos). So it should just work like a regular move, just like the probe retraction which is the next thing that happens.

The probe retracts relative to the trigger position. Near as I can tell the trigger position impacts nothing when probing (its only critical when homing a rail). The pullback move is 0.09mm. The trigger position and halt position are only about 0.01mm apart. I cant get 0.04 from those numbers. I have clearly missed something.

Here is a video of the pullback move in action:

Sineos · August 28, 2023, 6:12am

FWIW, there are reports that this happened with probing before, albeit in the opposite direction: Probe accuracy is drifting all of a sudden

No idea if it is related, but maybe some undiscovered underlying issue that can lead to such behavior.

garethky · August 28, 2023, 4:22pm

It seems to be speed related. Doing a longer move, say 0.5mm still causes the issue. Increase the speed to 2.0mm/s and the issue goes away. I tried a few speeds:

2.0mm/s - OK
1.0mm/s - OK
0.75mm/s - OK
0.5mm/s - FAIL
0.4mm/s - FAIL

Update:

I made a list of probable causes and found it. The Z stepper was in stealthchop mode, stealthchop_threshold = 1. Removing that setting from the config fixed it. That means it was a true positional inaccuracy caused by the stepper drivers, nothing to do with the code.

I guess somebody is going to have fun with that information.

Update #2:
stealthchop_threshold = 1 would put the machine in stealthchop only up to 1mm/s. (I wrote this config maybe 2 years ago now, I guess I was trying to turn it off completely?). The doc notes that if the driver switches modes while moving it may produce “confusing results”. But at 0.4mm/s it should have stayed in Stealthchop for the whole move. And we don’t expect it to loose steps if it stays in one mode or the other.

Either way, not my problem to solve. I’ll get dumps of some probes with the pullback move and get a number for range.

garethky · August 29, 2023, 4:22am

Ok, it looks promising: I’m getting 0.003mm of range over 50 probes using linear regression.

Just using the endstop trigger time at 2mm/s the range is 0.0076mm. That’s the bar, any worse than that number and I wasted my time.
Picking an elbow from the pullback move doesn’t really improve on that: 0.00697mm
Applying the power filter and picking an elbow does: 0.00496mm (34% better) That’s still 5 samples wide. That means there is no “critical sample”.
Using linear regression based on that elbow pick reduces it further to 0.00305mm (60% better)

So its not quite a switch probe, but this is a lot tighter than that endstop based number:

A single collision analysis looks like this:

I’m actually getting both elbows in 1 pass with Kneedle.

Since the elbows are more rounded and noisy I’m doing what Prusa is doing and discarding 3 points either side of the elbow before doing linear regression.

The power filter helps, but after going through linear regression its not as much as you might expect. I’ll likely include the filter, but make the code optional so no one has to install SciPy.

Despite the 5x increase in sampling resolution we don’t see a 5x improvement in range vs the endstop (we could have expected results like 0.0015). Still I suspect this will be more consistent across the bed based on probing a few test points. I need to do the whole bed next…

D4SK · August 30, 2023, 3:51pm

Have you tried increasing the force threshold?
Maybe the regression should be over a minimum amount of 50hz periods, lets say 8 (160ms)
In this case the power line noise should not affect the result of the regression less.

Also the start (high force) side of the included values now contains the accceleration phase, which could be noisy, and have some ringing.

Just did some testing with my old implementation @ 1.5mm/s

sd is usually 20-30um with the collision start finding algorithm wich ends up including 7-10 samples in the regression

if i lock it to use 11 samples (with the known end of the probing move)
sd is 8-20um

This is worse than i expected. It explains the usable, but not great performance for regular printing though.

It also shows that my idea to dynamically set the start of the included samples based on noise conditions, or filament blobs doesn’t work due to the nonlinear force graph. So locking the amount of samples increases accuracy.

Topic		Replies	Views
Testing request: load cell homing/probing + hx711 adc #6018 Developers	8	1563	April 11, 2023
Probe offset for non cartesian kinematics Developers	2	388	September 18, 2021
TEST_RESONANCES on Z-Axis Features	8	1116	June 22, 2024
Strain Gauge/Load Cell based Endstops Developers	511	38970	June 12, 2025
Capture Z Position and Acceleration Developers	4	540	December 7, 2023

Load Cell Probing Algorithm Testing

Related topics