Load Cell Probing Algorithm Testing

Also, the cutoff for including samples is kind of harsh (especially for sensors with lower sample rates).

So we could do a weighted linear regression, where the weight (importance) of the first 10% of samples is ramped in, and the last 10% of samples are ramped out gradually.

I’m at 3-5um with this approach. That’s better than the force-based endstop, which is 7-10um @ 2mm/s or ~100um @ 5mm/s. You can get that same 3-5um at any probing speed because the pullback move is always at the same speed.

I’m hoping this is “good enough” but I’m interested in lowering it further if we can.

Every other probing technology requires some additional nozzle offset measurement. E.g. the switch probe on this machine is good to 1.1um but I have to check both the nozzle offset and the switch body positions against another switch. Taking 3 measurements at 1um should add up to 3um of range. So maybe this is already just as good.

The noise filter should take care of this because it’s able to look at many more 50/60Hz periods than just the ones over the collision. Noise gets removed from the entire signal.

I’m seeing a +/-1um change with noise filtering. It’s not always an improvement, which makes me believe that linear regression is taking care of the power fluctuations on its own. I have some other noise issues in my test rig (those wires to the load cell are long and unshielded!) which may be the root cause. I need to switch over to ethernet cable or something.

At 1um per sample and a pullback move length of 0.1mm, I am getting ~50 points in each line segment. I don’t think we are shy on data points.

I tried discarding different numbers of points at the elbow and start. I saw improvement up to 3 points but none really after that.

The implementation selects the speed of the pullback move based on the sample rate: samples/second * 0.001mm = speed. So at 80SPS you get a speed of 0.08mm/s. The number of samples in the pullback move will be the same for all sensors. Discarding 6 of 50 points doesn’t seem too aggressive.
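That selection rule is just a product; a one-line sketch (the function name is mine, not from the implementation):

```python
def pullback_speed(samples_per_second, mm_per_sample=0.001):
    # One sample per micron of travel: speed (mm/s) = SPS * mm-per-sample
    return samples_per_second * mm_per_sample

# An 80SPS sensor pulls back at 0.08mm/s, a 2400SPS sensor at 2.4mm/s
```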

Maybe we should discard based on the phase of the power signal?

True, we can go a lot slower now, which I haven’t really tested. I think 0.08mm/s will be too slow though.

We could discard based on the phase of the power signal, but then we’d be throwing out random samples in a nonlinear part of the graph. I think ramping the weight of the weighted linear regression over 20ms accomplishes the same thing.
I used ChatGPT to generate this demonstration:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate the sine wave data
num_samples = 96
x = np.linspace(0, 12 * np.pi, num_samples)  # 6 periods
y = np.sin(x)

# Calculate the linear regression
X = x.reshape(-1, 1)
reg = LinearRegression().fit(X, y)
y_pred = reg.predict(X)

# Generate the weights for weighted linear regression
weights_ramp = np.concatenate((
    np.linspace(0, 1, num_samples // 6),
    np.ones(num_samples // 6 * 4),
    np.linspace(1, 0, num_samples // 6)))

# Calculate the weighted linear regression
weighted_reg = LinearRegression().fit(X, y, sample_weight=weights_ramp)
y_weighted_pred = weighted_reg.predict(X)

# Plot the results
plt.figure(figsize=(10, 8))

plt.subplot(3, 1, 1)
plt.plot(x, y, label='Sin Wave')
plt.plot(x, y_pred, label='Linear Regression')
plt.title('Sin Wave and Linear Regression')
plt.legend()

plt.subplot(3, 1, 2)
plt.plot(x, y, label='Sin Wave')
plt.plot(x, y_weighted_pred, label='Weighted Linear Regression')
plt.plot(x, weights_ramp, label='Weights')
plt.title('Sin Wave and Weighted Linear Regression')
plt.legend()

plt.subplot(3, 1, 3)
plt.plot(x, weights_ramp, label='Weights')
plt.title('Weights')
plt.legend()

plt.tight_layout()
plt.show()

I want to re-test polyfit; we saw that it had benefits when the underlying signal was non-linear.

I’ll get some data posted soon for you to play with.

The weighted linear regression performs better than standard linear regression when no filter is applied. This seems like a good substitute for the power filter; if we can get it implemented in numpy we wouldn’t need the extra library. But it’s not a clear improvement over the power-filtered data.

Polyfit made things slightly worse.

Here is the notebook with data:

(graphs are at the bottom of the page if you just want to look :eyes:)

I re-worked the ramp idea so you can specify the ramp-in and ramp-out as fractions. It’s also now using only numpy; polyfit with order=1 does linear regression:

def weighted_linear_regression(x, y, ramp_in=1. / 6., ramp_out=1. / 6.):
    num_samples = len(x)
    ramp_in_samples = int(num_samples * ramp_in)
    ramp_out_samples = int(num_samples * ramp_out)
    mid = num_samples - (ramp_in_samples + ramp_out_samples)
    weights = np.concatenate((
        np.linspace(0, 1, ramp_in_samples),
        np.ones(mid),
        np.linspace(1, 0, ramp_out_samples)))
    p = np.polyfit(x, y, 1, w=weights)
    return (p[0], p[1])  # slope, intercept

Ramping 20% near the elbow and 50% at the ends seems to work really well. This biases the linear regression to the points near the elbow.
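As a self-contained sanity check of that weighting scheme (20% ramp-in, 50% ramp-out, using np.polyfit with the w= argument as above; the synthetic line data here is purely illustrative):

```python
import numpy as np

# A clean synthetic line: the weighted fit should recover slope and intercept
x = np.arange(50, dtype=float)
y = 0.5 * x + 3.0

n = len(x)
ramp_in, ramp_out = n // 5, n // 2  # 20% ramp-in, 50% ramp-out
weights = np.concatenate((
    np.linspace(0, 1, ramp_in),
    np.ones(n - ramp_in - ramp_out),
    np.linspace(1, 0, ramp_out)))

slope, intercept = np.polyfit(x, y, 1, w=weights)
```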

Combining that with the power filter produces the best results, maybe 0.5um better than the power filter alone. I have to try this out with a lot more sample data, but I will be very happy if I get this down to 2.5um.

Thanks @D4SK!

I built a switch probe tool to see if the range would be worse with this offset from the toolhead:

Results:
5mm/s
// probe accuracy results: range 0.010625, standard deviation 0.002890
2mm/s
// probe accuracy results: range 0.005460, standard deviation 0.001414

So that would mean the pullback move is as good or better than the switch probe when mounted with a similar offset.

The switch impact measures ~65g of force. This is pretty close to the spec of 75g. The load cell will stop at 2-3x that depending on the setup.

(IDK why, but retracting more slowly seems to have a positive impact on the range. If I retract at 5mm/s but probe at 2mm/s I get the same results as probing at 5mm/s. It’s only when I retract at 2mm/s that I get the improvement. Backlash?)

I also tried to re-wire the ADS1263 board to be on the toolhead and this made all of the electrical gremlins come back (noise, resets). I’ll work on using a longer shielded/twisted cable for the load cell instead.

I re-wired the ADS1263 board with a shorter connector and used twisted pair (cat5) cable to connect the load cell. I have a big capacitor (330uF) across the 5V / Ground leads, now with shorter wires. The result is that I can’t really see the 60Hz noise anymore!

The thing that shocked me is the wiggle in the pullback move. Previously I ascribed that to 60Hz ripple, but clearly that can’t be right; it has to be some real artifact of the system. It could still be electrically induced, like something to do with micro-stepping. Or it could be some physical phenomenon I haven’t considered.

With this improved electrical setup, using just the elbow finder beat everything for consistency: it got 0.00307mm of range. The other methods, both filtered and unfiltered, were between 0.004 and 0.0055. That result agrees with previous experiments on the collision elbow, which showed linear regression only made a good elbow section worse, on average.


Ok, it’s mass results time. Each point on these graphs is the range across 50 samples taken with PROBE_ACCURACY. Lower is better.

First up a comparison of the DF2 switch probe tool vs the endstop in the load cell at 2 and 5 mm/s. The load cell endstop triggers at 50g:

The load cell is more consistent point to point but overall reports a larger range than the switch. Even at 2mm/s it’s just in the ballpark of the switch. At 5mm/s it’s above 0.01mm everywhere, which is basically unusable.

Now testing with the load cell using the pullback move at 0.001mm / sample and the probe move at 2mm/s:

We get a variety of results with different methods winning out at different sample points. But the good news is basically any of the linear regression methods beat the switch (blue) and they all handily beat the endstop trigger (orange).

And with the pullback move at the same 0.001mm / sample and the probe move at 5mm/s:

Here again the load cell is mostly below the switch (blue line). For whatever reason the 5mm/s switch numbers were actually better than the 2mm/s numbers (maybe speed helps with triggering consistency?). Any of the linear regression lines here are good. Only the two elbow selection lines show significant deviation.

So overall this seems more consistent than the switch (at least this particular DF2 switch). I don’t know which exact algorithm we should ship. Maybe we need to have toggles for the power filter and the weighting. Each one has a moment when it reports the best results.


Digging into the data a bit more, I wanted to understand why the rear left and front right of my bed have a larger range.

Rear Left Corner

There is a clear trend in the collected data. We don’t expect trends if the underlying system is static.

So I think that corner of the bed is shifting in Z due to the force applied by the load cell probe. The E3D Tool Changer bed is only supported by 1 rail and a large cantilever structure, and it’s totally possible for the whole thing to tilt under load. It’s also possible something in the stackup with the bed is moving around. I’m comfortable calling this an issue with the printer.

Experiment: Try a lighter triggering force so the max force on the bed is in line with what the switch probe produces. (Prusa uses 40g, my setup could go as low as 10g but others may not have that luxury)

Front Right @ 2mm

In one probe the elbow is very different from the others, and that’s responsible for the bulk of the range:

The higher Z value means the probe stopped early. Based on the force plot it doesn’t look like an electrical transient:

I can’t make it go away with any math tricks; discarding additional points doesn’t get rid of the spike. Not sure what happened there. Maybe a gnat :fly: got caught under the probe? I also don’t have any idea how to replicate an event that happened once in about 900 probes. It only showed up in the 2mm/s data and not the 5mm/s data.

Experiment: No idea, anyone have suggestions?

Conclusions

I think 0.003mm is an admirable result in conditions with good printer structure. Looking at those wobbles, they are 3-5 samples wide. I don’t think we are going to get down to 0.001 without fully understanding them. Also it seems likely my printer’s bed isn’t repeatable at 0.001 across the whole surface.

Experiment: Try different microstepping settings and see if the wobbles change frequency

I took a large collection of samples testing several different variables:

  • 2mm/s vs 5mm/s probing speed
    • A lower probing speed results in less overshoot after the trigger
  • 400SPS vs 2400SPS sample rate
    • Will a higher sample rate enable higher probing speeds?
  • 20g vs 50g triggering force
    • A lower triggering force results in less overshoot after the trigger
  • pullback move at 0.4mm/s vs 2.4mm/s
    • At the higher 2400SPS rate will we get the same repeatability at higher speeds?

Each variable was changed independently, and a full sweep of the bed was conducted with 50 samples taken at each of 9 points. So that’s a total of 12 series and 5,400 individual sample probes (north of 300MB uncompressed). I worked up a way to load this into Jupyter with multiprocessing so it can process everything in just a few seconds and start plotting charts.

The single best determinant of good performance is the speed of the pullback move. Slower is better regardless of the sample rate:

If I highlight just the fast pullback moves, it’s hot garbage:

This would seem to point to the underlying physical system getting noisier the faster we move it. That generally agrees with results from other types of probes in 3D printers.

I don’t know what speed is “optimal”; that’s something you could investigate for every printer. But generally you need a sensor that’s able to keep up with 1 sample per micron at your pullback speed. I guess the good news here is we don’t need a super fast sample rate: if 1mm/s turned out to be “optimal”, the required sample rate would only be 1kSPS. Maybe if you have fancy linear motors you’ll be able to push this much higher, but for the budget hardware we use in 99% of 3D printers it’s not going to matter.

The 20g triggering force didn’t perform better than 50g; in fact, in some cases it was worse:

It’s likely that the 2mm/s probing speed causes the largest reduction in overshoot. Both the 30g reduction in triggering force and the higher sample rate are less effective than the reduction in travel speed from 5mm/s to 2mm/s. Klipper’s trsync architecture is slower at bringing the machine to a halt than Prusa’s monolithic firmware, and if you use CANBUS / multi-MCU homing it will be even worse.
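The overshoot reasoning here is just distance = speed × halt latency. A toy sketch (the latency value is hypothetical for illustration, not a measured trsync number):

```python
def overshoot_mm(probe_speed_mm_s, halt_latency_s):
    # Extra nozzle travel between the trigger firing and the steppers stopping
    return probe_speed_mm_s * halt_latency_s

# With a hypothetical 2ms halt latency, dropping from 5mm/s to 2mm/s cuts
# the overshoot from 0.010mm to 0.004mm
```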

Here is a zip file with the notebook and the data: load_cell_data.zip - Google Drive

I have a lot more questions and investigations that I would like to pursue, but I’m not sure if they’d be finding out things about load cells or about my personal printer. I’m going to focus on integrating the algorithm into the probing and homing code and pushing an update. Here is what I’m going to do:

  • The pullback move needs to ship. It’s essential for keeping the probing cycle short while performing the sensing move at a slow enough speed. I may try to make it configurable, but it looks like all stepper-motor based 3D printers would benefit from it, and that’s 99.99% of klipper users.
  • Bad tap detection: I want to make this user configurable via macro or python plugin. Every printer is different and each is going to need different configuration, maybe even a different algorithm. I can do all the hard math and pass the results to the plugin. Prusa’s good/bad check is simple enough that it could be implemented in a macro, but the numbers they used were prepared for a specific printer/toolhead combination and likely can’t be re-used on other printers. The default will be to assume all taps are good.
  • Bad tap action: I think this should just be a macro call. Again, different printers will have different means of clearing a nozzle of crud. There could be a wiper on a servo, a brush at a fixed location, etc. Prusa’s “tap and wipe” should be doable in a macro, but it has the downside of possibly damaging the print surface over time. The default action will be to do nothing.

Pushing these problems into user space should make it faster to wrap this up.


Trying to isolate the performance of the sensor vs the performance of my machine’s Z axis, I calculated the sample-to-sample variance. So instead of the range across 50 probes, we get the largest variance between any 2 successive probes. This means if the underlying system is moving due to the probing force, that movement tends to cancel out. Here are the two most encouraging candidate sets (the others are excluded so we can get a sub-micron scale on the Y axis):

So at many points the worst sample-to-sample variance is less than 1 micron. Meaning there isn’t a lot of point in taking more than 1 reading with this sensor for an application like bed meshing. That’s really what I was hoping for!
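The metric is just the largest absolute difference between successive probe results; a minimal sketch (the function name and the sample data are mine, for illustration):

```python
import numpy as np

def max_successive_delta(z_values):
    # Largest absolute difference between any 2 successive probe results;
    # slow bed movement under probing force largely cancels out of this
    # metric, unlike the full range across all probes
    z = np.asarray(z_values, dtype=float)
    return float(np.max(np.abs(np.diff(z))))

# Illustrative data: a bed drifting under load shows a trend, so the range
# (0.004) is much larger than the worst successive delta (0.001)
probes = [1.000, 1.001, 1.002, 1.003, 1.004]
worst = max_successive_delta(probes)
full_range = float(np.ptp(probes))
```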

Standout values on this variance metric also show cases where there may have been some kind of disturbance in the physical system. Here I’m highlighting the faster pullback moves, as they were all worse:

You can see the fat band of gray at the bottom that’s mostly under 2.5 microns. Your average is going to be better than this.

I’ve turned my attention towards good taps vs bad taps.

Prusa has 2 main types of tests in their code. One is a way of iteratively inspecting the sharpness of an elbow, which I think I understand. The other is an angle test: they inspect the angle between two lines in the tap and expect that angle to be in a specified range that they calculated using a pile of data collection and a model.

But looking at the angles, their numbers are all over the place. Usually when you calculate the angle between 2 intersecting lines you get a number between 0 and 90. But their code is comparing with numbers like -52 and 233!

Their code is doing 2 strange things.

#1 It includes a “normalization factor” of 250:

So they divide the slope of each line by this before using the usual formula for the angle between 2 lines: atan((slope2 - slope1) / (1 + slope1 * slope2)). The result is that this function behaves very differently than the standard one.

#2 The return line makes the function order dependent: the same 2 lines in different orders produce different “angle” results. Normally you would take the absolute value of the angle, but here they add 180, so an angle of -1 becomes +179.

Anyone have any insights into what they are doing here? Seen this before somewhere else? I loathe blindly copying things I don’t fully understand.

I think I get it. They are trying to fix the angles coming out weird because the scale of the ‘graph’ is 1 gram = 1 second. What they are doing here is equating 1 gram to 0.004 seconds, re-scaling the x axis.

If a human, or any graphing software, plots this data it will stretch the x to fill the screen. We are looking at a few hundred ms on x and maybe 400g in y. So a 45 degree line on that graph is ~1 gram per millisecond. If you squashed the graph such that 1 gram = 1 second, the graph would show a lot more time, but the tap event would just be a small vertical line, making all of the angles close to vertical.
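A quick numeric check of that rescaling idea, using the standard angle formula from above (the slope values are made up for illustration):

```python
import math

def angle_between(slope_1, slope_2):
    # Standard formula for the angle between 2 intersecting lines
    return math.degrees(
        math.atan((slope_2 - slope_1) / (1 + slope_1 * slope_2)))

# A slope of ~1 gram per millisecond is 1000 on a 1 gram = 1 second scale,
# so against a flat baseline the raw angle is nearly vertical
steep, flat = 1000.0, 0.0
raw = angle_between(flat, steep)  # ~89.94 degrees

# Dividing the slopes by 250 first spreads the angles into a usable range
rescaled = angle_between(flat / 250.0, steep / 250.0)  # ~75.96 degrees
```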

I still don’t understand why they added 180 instead of using abs()

I decided that I’d rather know if the rotation direction was clockwise or anti-clockwise between the vectors. So I’m going to use atan2 to calculate the rotation angle between the two vectors:

rotation_angle = math.degrees(math.atan2(vector_slope_1, 1) - math.atan2(vector_slope_2, 1))

This gives concrete meaning to positive and negative angles. Positive is a clockwise rotation and negative is anti-clockwise. Users (me!) might have to enter these numbers in a config and I think this form is easier to understand.

e.g. at the point where the probe breaks contact we expect a clockwise rotation. So we might say that an angle between +25 and +145 is valid. Or at the point when the homing move stops we expect an anti-clockwise rotation so -60 to -110 might be valid.

Using this I can perform basic validation of the tap. The angles in sequence should be: [clockwise, anti-clockwise, anti-clockwise, clockwise] (or the inverse of that if the probe polarity is switched). If it doesn’t go in that sequence, it’s not a ‘tap’ shape.
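A minimal sketch of that validation, using the atan2 form above (the helper names and the five-segment assumption are mine):

```python
import math

def rotation_angle(vector_slope_1, vector_slope_2):
    # Positive = clockwise rotation, negative = anti-clockwise
    return math.degrees(
        math.atan2(vector_slope_1, 1) - math.atan2(vector_slope_2, 1))

def is_tap_shape(segment_slopes):
    # Five fitted line segments give four rotation angles between them
    angles = [rotation_angle(a, b)
              for a, b in zip(segment_slopes, segment_slopes[1:])]
    signs = [a > 0 for a in angles]
    expected = [True, False, False, True]  # cw, acw, acw, cw
    # Accept the inverse sequence too, for switched probe polarity
    return signs == expected or signs == [not s for s in expected]
```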

I’m also going to make the default scale 1 gram / millisecond (1K grams per kg, 1K milliseconds per second, makes sense in metric). That means the exact numbers in Prusa’s code will need some converting (convert the time basis from 250 to 1000, convert from their angle formula to +/- rotations). I don’t know if they were going to be all that useful anyway, as we will be running different physical machines and different load cell designs.

How to summarize this… :thinking:

Investigation

This python notebook is a deep dive into Prusa’s use of Butterworth bandpass filters in their endstop implementation. I’ve established that mkfilter, MATLAB and SciPy all work the same and can generate the same filter coefficients as seen in their code. They also filter data the same way: the outputs from the mkfilter code and SciPy’s lfilter are identical.

Prusa couldn’t come up with 1 filter design to run on all their printers. They have 3: one for the MK4, one for the XL when probing in Z, and one for the XL when probing in X & Y.

Their filters are more narrow than I expected:

| Printer | Low Frequency Cutoff | High Frequency Cutoff |
| --- | --- | --- |
| MK4 | 0.8Hz | 12Hz |
| XL | 11.2Hz | 17.6Hz |

As a plot of frequency response:

I understand the reason for the low frequency cutoff, that’s about rejecting drift. But I don’t understand the reasoning behind such a narrow passband.

When applied to a tap I see 3 potential problems with these filters:

  1. The lag. The filtered lines are shifted to the right of the real force plot. This means the real force is significantly higher than the reported filtered force when they trigger. So 40g isn’t really 40g; it’s more like 200g.

  2. The XL graph doesn’t even break the 40g trigger threshold (red horizontal lines).

  3. Lastly all that wiggling at the start of the graph is the filter settling. That’s a long time to wait.

I can’t pick a single filter design for all printers, I want faster triggering, and I want that settling time gone.

So after much experimentation I have this:

  1. The “Proposed Filter” is a highpass filter with the same low frequency cutoff as the corresponding Prusa filter, combined with a notch filter at 60Hz. Using a highpass filter keeps all the high frequency signal components and this results in the filters tracking very closely with the blue/true force line.

  2. The filters are immediately flat. Using sosfilt_zi, the filter’s steady-state initialization vector is calculated and then multiplied by the first sample. This results in no settling time.
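A sketch of that design in SciPy (the sample rate, cutoff, notch frequency and Q below are illustrative stand-ins, not the final values):

```python
import numpy as np
from scipy import signal

fs = 400.0  # sample rate in SPS (illustrative)

# Highpass for drift rejection plus a notch at the mains frequency
sos_hp = signal.butter(2, 0.8, btype='highpass', fs=fs, output='sos')
b, a = signal.iirnotch(60.0, Q=30.0, fs=fs)
sos = np.vstack([sos_hp, signal.tf2sos(b, a)])

# Initialize at steady state, scaled by the first sample, so there is no
# settling wiggle at the start of the output
force = np.full(200, 350.0)  # constant 350g input
zi = signal.sosfilt_zi(sos) * force[0]
filtered, _ = signal.sosfilt(sos, force, zi=zi)
# DC is rejected from the very first sample: the output is flat at ~0
```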

All of these filters perform the same at rejecting low frequency drift in the test data:


That is “continuous taring”. The proposed filters are just a little bit noisier than Prusa’s, but still less noisy than the raw force line. I very much like that tradeoff.

Implementation

I settled on the Second Order Sections (sos) filter format. It’s not quite as memory-compact as the b, a format, but the code to execute it is very simple (right out of the SciPy source code) and it’s more resistant to 32 bit floating point errors:

def filter_step(self, x: float) -> float:
    x_cur: float = x
    for section in range(len(self.sos)):
        x_new = self.sos[section][0] * x_cur + self.zi[section][0]
        self.zi[section][0] = (self.sos[section][1] * x_cur
                               - self.sos[section][4] * x_new
                               + self.zi[section][1])
        self.zi[section][1] = (self.sos[section][2] * x_cur
                               - self.sos[section][5] * x_new)
        x_cur = x_new
    return x_cur

Translating that to pure c was not too difficult.
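Wrapped in a minimal class for illustration (the class name is mine; the single two-coefficient section below is a trivial FIR just to show the update in action):

```python
class StreamingSosFilter:
    # Direct form II transposed, one section at a time (as in SciPy's sosfilt)
    def __init__(self, sos):
        self.sos = sos
        self.zi = [[0.0, 0.0] for _ in sos]

    def filter_step(self, x: float) -> float:
        x_cur = x
        for n, (b0, b1, b2, a0, a1, a2) in enumerate(self.sos):
            # a0 is assumed normalized to 1
            x_new = b0 * x_cur + self.zi[n][0]
            self.zi[n][0] = b1 * x_cur - a1 * x_new + self.zi[n][1]
            self.zi[n][1] = b2 * x_cur - a2 * x_new
            x_cur = x_new
        return x_cur

# Single FIR section y[n] = 0.5*x[n] + 0.5*x[n-1]:
f = StreamingSosFilter([[0.5, 0.5, 0.0, 1.0, 0.0, 0.0]])
out = [f.filter_step(x) for x in (1.0, 1.0, 1.0)]  # -> [0.5, 1.0, 1.0]
```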

I know that this is a lot of math and theory. I’ve committed a filter_workbench notebook to the branch that would be part of a PR. Hopefully this helps printer designers work out a good filter for their printer.

I’d like to offer regular users an “easy mode” config to express filters like this:

[load_cell_probe]
    endstop_filter: {
              'highpass': 0.8,
              'notch': 60.0
       }

This would allow a lot of flexibility and readability. But using this config would require SciPy be installed to do all the math to convert this into a filter and initialization vector.

SciPy can be installed just like NumPy:

# for python 2.7
sudo apt-get install python-scipy
# for python 3
sudo apt-get install python3-scipy

For manufacturers and printer designers that need more control, I could take in the raw filter design:

[load_cell_probe]
    raw_endstop_filter: {
        'sos': [[ 0.92487988,  1.84975976,  0.92487988,  1.,  1.86554535,  0.87403728],
                [ 1.,         -2.,          1.,          1., -1.97845941,  0.97868912],
                [ 0.60876834, -0.50578302,  0.60876834,  1., -0.50578302,  0.21753668]],
        'zi': [[ 0.06440686,  0.06020638],
               [-0.98928674,  0.98928674],
               [ 0.,          0.        ]]
       }

This is very flexible and avoids a dependency on SciPy, but it’s a potential footgun, because who knows what that block of numbers does? (You can find out via SciPy’s sosfreqz, but who will do that?) Users blindly copy config files they don’t understand, and that scares me.

I’d like to hear what you all think of this idea.

Installing SciPy with pip in the klippy-env is annoying. You have to install a fortran compiler first:

sudo apt-get install gfortran

Then install SciPy

~/klippy-env/bin/pip install scipy

And that takes at least 30 minutes to run.

Is there no way to just reference the global package in the virtual env without having to re-compile everything?

I’m still very disinclined to start copying large swaths of code out of another open source project just so klipper users don’t have to install a library. I don’t think we should be responsible for maintaining forked code like that (especially when some of it is in c).

For what it is worth, I’m a little confused about the problem that these filters are trying to solve.

If I understand it correctly, you want to define an mcu side check on the ADC values from the load cell so that the mcu can rapidly stop the steppers (via a trigger signal to the trsync code). This is desired so that reasonable speeds can be used when homing the Z axis, so that homing is still reasonably fast if the toolhead is far away from the bed.

Once this initial signal is detected (and the Z motors are halted), then the high-level python code can proceed with a more intense “tap detection” to determine a precise Z position.

Did I understand the above correctly?

If so, why are sophisticated filters necessary? Is the load sensor really drifting so much in a few seconds that one can’t distinguish between a kilogram of force? If so, I’m curious how such load cells could be used for anything…

I guess I must be missing something.
-Kevin

You have it right. The goal is to keep the threshold triggering code on the MCU close to where the signals are generated so the trsync can be triggered with minimal latency. Python does the post collision analysis of the tap.

There is a lot of drift, 100-300g over a full Z travel, but it’s not a fault of the sensor. It’s the printer. The toolhead is being torqued by external forces: bowden tubes and cable chains.

Real example: on the test rig I had QGL fail (trigger too soon) repeatedly with a threshold of 40g and just 10mm of travel @ 5mm/s. The noise is something like +/-5g so 40g is plenty of threshold. With no bowden tube it worked fine.

If I set the threshold at 300g to cover the worst case, that would be a lot of force if the travel is short. But if the travel is long, the amount of force in the tap might be very small. This is bad for consistency: we want a consistent tap “depth” regardless of the travel move length.

I was not familiar with digital filters when I started working on this. But now that I have learned, I think they’re the right tool for the job. It’s not smoothing, like a moving average, and it’s not a heuristic. A filter is able to drop out the drift signal and leave the collision signal behind, which is exactly what we are trying to do.