HX711 support (load cell sensor)

Support for the HX711 load cell sensor chip has been requested a number of times already. I have written a preliminary module which is able to read out data from the HX711. It works without modifying the MCU code, but there is a catch: It does not obey the specifications in the HX711 data sheet.

The HX711 has a unidirectional SPI-like interface with a clock and a data pin. The host is expected to read 24 bits after the HX711 has sent a data ready signal by pulling the data pin low. After the transfer, it should send 1-3 additional clock pulses to configure the gain for the next conversion.

My module simply reads 32 bits periodically (currently at ~10 Hz, as my HX711 module is apparently configured to that frequency). The additional clock pulses seem to be ignored (although the data sheet explicitly says to send no more than 27 clock pulses per conversion interval). The gain will then presumably be fixed at 64 (lowest setting).

More problematic is actually ignoring the data ready signal. With a logic analyzer I have found that the HX711 actually revokes the data ready signal if the data has not been read out before the end of the conversion interval. If the host attempts to read the data during that time it probably will read simply -1 (all bits high).
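To make the 32-bit readout concrete, here is a minimal Python sketch of how such a frame could be decoded. It is not the module's actual code: the function names are made up for illustration, and it assumes the first three bytes carry the 24-bit two's-complement sample (MSB first) while the fourth byte is just padding from the extra clock pulses.

```python
def decode_hx711_frame(raw: bytes) -> int:
    """Return the signed 24-bit sample from a 4-byte (32-clock) read.

    Assumption: bytes 0..2 are the sample, MSB first; byte 3 is padding
    produced by the 8 extra clock pulses and is ignored.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    # int.from_bytes with signed=True performs the sign extension for us.
    return int.from_bytes(raw[:3], byteorder="big", signed=True)


def looks_like_missed_sample(raw: bytes) -> bool:
    """True if every bit is high, i.e. the data pin was never pulled low."""
    return all(byte == 0xFF for byte in raw)
```

An all-ones frame decodes to -1, so a real reading of -1 cannot be told apart from a missed sample with this approach.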

Another issue is that the SPI implementation in Klipper always seems to expect all 4 pins to be configured.

Anyway, this is a first starting point I wanted to share with the community. Maybe some people have ideas how to improve this without the need for special microcontroller code (which in principle would be fine for me, but I think it would be nice to avoid it).

So my first goal to improve this would be to read out the data ready signal. Is it possible to read a pin that is configured as a soft-SPI MISO pin additionally as a plain GPIO input pin? If yes, is it realistic to poll this pin at a frequency (much) higher than 80 Hz (the fastest possible HX711 readout frequency) and react to it in time by initiating a transfer?

In the meantime, I got some ideas of my own on how to improve that, at least without writing HX711-specific microcontroller code. This would require some new features in the microcontroller and communication protocol though:

  • Scheduled SPI communication, i.e. the ability to control at which time an SPI read shall take place. Ideally, several reads can be queued at the same time to achieve 80 Hz read rate.
  • Soft-SPI with arbitrary bit width, i.e. transfer e.g. 25 bits (rather than 24 or 32).
  • Pin change interrupt recorder with dead time, i.e. allow enabling the pin change interrupt for a specified pin and transporting the time stamp of each such interrupt to the Python code. A configurable dead time should prevent the interrupt from being recorded again within the dead time period after it has already been recorded (this limits the rate).
  • Allow duplicate assignment of pins in certain situations, specifically the MISO pin of a soft-SPI must be used for the pin change interrupt recorder at the same time (the dead time will prevent recording the SPI data as pin changes). I guess, the pin change interrupt recorder should simply be usable for all inputs, even if already taken by some other driver.
  • Allow soft SPI without specifying CS and MOSI pins.

The idea is that the HX711 operates at a fixed conversion interval (either 10 Hz or 80 Hz, selectable via hardware pin - usually this is done as a soldering option for off-the-shelf HX711 modules). The exact interval merely needs to be synchronised to the Klipper main clock. If that is achieved, the transfers can simply be scheduled at the right time. The synchronisation of the interval can be done via the pin change interrupt. We have to observe at which time the data pin is pulled to low (outside an SPI transfer). The transfer should be scheduled shortly after that time. Since the interval is roughly known, it should be enough to adapt the phase of the newly scheduled transfers whenever the pin change interrupt is seen by the Python code.
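The phase adaptation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not existing Klipper code: the function name, the use of integer clock ticks, and the `delay` margin are all assumptions.

```python
def next_transfer_time(last_edge, interval, now, delay):
    """Return the next scheduled read time strictly after `now`.

    last_edge: time stamp of the most recent data-ready falling edge
    interval:  nominal conversion interval (e.g. 1/10 s or 1/80 s)
    now:       current time on the same clock
    delay:     safety margin between the expected edge and the read
    All values share one unit (clock ticks in the example below).
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Reads happen at last_edge + k * interval + delay for k = 0, 1, 2, ...
    # Pick the smallest k whose read time is still in the future.
    k = max(0, (now - last_edge - delay) // interval + 1)
    return last_edge + k * interval + delay
```

Whenever a new falling edge is reported, the host would simply call this again with the updated `last_edge`, which shifts the phase of all subsequently scheduled transfers.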

I think these additional features might be useful for other things as well. I think I can implement this within a few weeks or so, but I would like to hear an opinion first, if it sounds reasonable. I could also implement it as a special driver in the microcontroller (just like it is done for the ADXL345), but this would be special code which cannot help other projects.

Scheduled transfers would also help the load cell-based probe algorithm to speed up the measurements. My current hardware does not use a HX711 there, instead I have an I2C-attached ADC. Hence I might try to implement scheduled I2C transfers along with the SPI.

FWIW, I suspect a simple sensor_hx711.c module will be easier to implement and maintain than an “arbitrary pseudo-spi interface”. My experience has been that all of these “quirky spi” devices are sufficiently different from the next that no appreciable code reuse is possible.

It’s already possible to use SPI without a CS pin (both soft-spi and regular hardware SPI). There have been a few requests to be able to implement SPI without a MOSI or MISO pin (the former for the MAX6675 temperature sensor and the latter for the UC1701 and similar displays). The suggestion in the past was to introduce a low-level DUMMY mcu gpio pin that ignores updates. No one has gotten around to implementing this though. I suspect most people just designate an otherwise unused gpio pin to be used as a dummy pin.

-Kevin

Ok, I will aim in that direction then. Surely it will be easier to implement, I just wasn’t sure if more chip-specific code in the microcontroller code would be welcome…

Hi, where is the code of your implementation?

I’m considering doing something related but RP2040-specific, where the actual HX711 interface code is implemented by the 2040 state machine - which would be very fast. The code for that already exists.
It would only run on the 2040, but then, these things are cheap.

WDYT?

I have not yet written an MCU-based implementation. Someone else has already started this anyway, see this pull request:

I have improved my temporary (!) Python-side implementation though, and it seems to be pretty reliable right now. It uses out-of-spec tricks though, so maybe different batches of the chip might work differently. Also it is limited to maximum gain (because the standard SPI implementation cannot do 25, 26 or 27 bits - I just always read 32 which is already out of spec). To synchronise with the sampling interval of the chip, I rely on receiving 0 if the readout is started before the new sample is available. This is also out-of-spec, and I do not know if it works under all conditions, but so far I have seen no issues. This of course means the value 0 can never be seen, but the readout at maximum gain is so noisy anyway that this usually does not matter much :wink: Finally, the code probably only works with the 10 Hz configuration. That code can be found here:

I’m considering doing something related but RP2040-specific, where the actual HX711 interface code is implemented by the 2040 state machine - which would be very fast.

I don’t think it is worth using the RP2040 state machine for this purpose. Speed is not an issue. We have at most 80 SPS. The serial interface is synchronous, so bit banging is no problem. People may want to connect the HX711 to a normal 3D printer board directly. Adding another microcontroller just to read out the HX711 is IMO not nice. Finally, I see potential to do better things with the state machine in a 3D printer (has anyone ever thought of writing a special peripheral to generate the motor steps super precisely and fast?)… :wink:

Hmmm ok, but notice that in order to mount the bed on load cells, I’d need to query 3 HX711s

This is still not a big deal. Especially if you anyway connect them to a dedicated MCU, you can probably read out 100s of HX711 without problems on a modern microcontroller. The RP2040 has only 2 programmable IOs, so you can’t even query 3 HX711 with that (unless you want to implement support for multiple HX711 in a single PIO).

Besides: You can connect load cells in parallel to the same HX711, if you are only interested in the average force. I also have two load cells connected in parallel.

I’m looking at implementing this today and I think I finally understand it.

References:

The issue with bit banging is timing. If the clock pin is held high for more than 60 µs, the chip will reset itself. If you have seen “odd behavior” with this chip, it’s probably due to undesired chip resets. While the chip clock is high we need exact timing of at least 200 ns, and interrupts need to be turned off. While the chip clock is low, it’s OK to service interrupts. I don’t think any of the implementations do this correctly, other than Prusa’s. This PR is close, but it’s doing an irq_poll(); in its wait function, which checks interrupts.

There is one thing in the codebase that works like this already: Neopixels. I’ll work from the neopixel timing code to get this right.

What really has me scratching my head now is how to organize my code? I have both SPI and bit bang based chips that read load cell data. These are supposed to be optionally compiled based on platform capability. I assume I’ll need to move each sensor chip into its own module so it can be conditionally compiled. Then have a load_cell module that depends on them all but uses the CONFIG_HAVE_GPIO_SPI and CONFIG_HAVE_GPIO_BITBANGING flags to access the correct modules conditionally.

My goal is to keep the sensor’s code very small with no timer reading loops. Doing the read loop in one place keeps the timing tight with the endstop. It also reduces the complexity of supporting a new ADC chip.

I found a Prusa MK4 specific wrinkle: Prusa uses the ‘A’ channel on the ADC to read the load cell. They also use the ‘B’ channel to read the hot end temp with a PT100.

So when it’s reading the load cell it can’t be reading the hot end temp; it’s an input multiplexer. When the machine is probing they have a “high precision” mode that locks the ADC to channel A. Otherwise they pull a temp sample from channel B every 13 samples:

    if (!(loadcell.IsHighPrecisionEnabled()) && sample_counter % 13 == 0) {
        next_channel = hx717.CHANNEL_B_GAIN_8;
    } else {
        next_channel = hx717.CHANNEL_A_GAIN_128;
    }

The channel that will be read is selected by adding a specific number of pulses to the previous sample. That’s what next_channel is about: they have to know what’s going to be needed on the next read. So the previous channel value is what you get a sample for, not the newly selected channel.
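The one-read lag between selecting a channel and getting its sample is easy to get wrong, so here is a small Python model of it. It only mirrors the selection rule from the snippet above; the function name, the channel labels, the counter starting at 0, and channel A being selected before the first read are assumptions for illustration.

```python
def plan_reads(num_reads, high_precision=False, temp_every=13):
    """Return a list of (sampled_channel, next_channel), one per read.

    sampled_channel is what the current read actually returns;
    next_channel is what the extra clock pulses select for the NEXT read.
    """
    current = "A"  # assumption: channel A is selected before the first read
    reads = []
    for sample_counter in range(num_reads):
        if not high_precision and sample_counter % temp_every == 0:
            next_channel = "B"
        else:
            next_channel = "A"
        reads.append((current, next_channel))
        current = next_channel  # takes effect on the following read
    return reads
```

With these assumptions, read 0 selects channel B but still returns a channel A sample; the channel B sample only arrives with read 1.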

All of this feels like it flies in the face of klipper’s command and control architecture.

IDK how I want to handle this. I’m certain that 1 process in C needs to handle the actual reads so the timing can be tight and consistent. No Python should ever be able to directly read from the sensor or it would become a thread safety issue. Both the heater and the load cell have to accept missing samples. The hot end in particular needs to be OK with missing up to a few seconds of data. Maybe it needs a notification of “high precision mode” so its power output is capped.

@koconnor sorry to bother you. I don’t think there is a clear precedent in the code base for how to handle this kind of resource sharing, I’d like some guidance please.

Maybe it is not necessary to reproduce this behaviour in exactly that way, at least at first. You could e.g. take every second sample from the other channel, so you have basically 40 SPS per channel (configurable via a switch in the driver/configuration, resulting effectively in two ADC interfaces towards the other Klipper components). This would probably require reducing the probing speed, but otherwise the functionality could be identical. Once that works I would consider adding some switch for the “high precision” mode.

In any case, this needs to be done with extreme care, since you can only stop reading the hot end temperature when the heater is off. Otherwise it might get dangerous (even fire hazard). This is easy if you write a specific firmware for that printer. Klipper is generic, so the HX711 driver should not need to care about the state of a hotend heater.

I am not saying it cannot be done, but I strongly recommend to cross that bridge only when you are there. To my knowledge we don’t even have a well working driver for the HX711 that works without such wrinkles (honestly, I generally consider such things bad engineering).

As much as I want to give Prusa the side eye for this ‘value engineering’ choice…

I know we have people in the community that want to take their toolhead and mount it on machines that run Klipper. I feel responsible at least considering this use-case. I also wouldn’t be shocked if we get a bunch of Chinese clones of that toolhead floating around over the next 12 months.

I also want to agree with Kevin’s point here:

But I’d hate to see big blocks of C code coming into the code base just to run 1 toolhead. I want to keep the un-reusable code to a minimum. The sensor’s read() function is the key artifact. In my branch I have those split out into different modules: hx71x.c and ads1263.c.

But most of the code that people have been wrapping around that read() method in their submissions is reusable:

  • Start/Stop reading the ADC
  • Calling an ADC at a regular rate
  • Sending ADC counts data back to the Host
  • Sending error information back to the host
  • Keeping track of sample sequence number and sample time

That’s what ‘load_cell.c’ does now. But really it’s no longer a sensor_load_cell, it’s really just an ADC wrapper. Let’s imagine adding multiplexer operation to that list of reusable features. Maybe we call it an adc_multiplexer.

We could send in a byte array when sampling starts that indicates which channels should be read at what cadence. Each byte indicates the number of samples to take in a row from that input index.

If you want to just sample continuously from input #1, you send this:

    uint8_t sample_plan[4] = { 0x01, 0x00, 0x00, 0x00 };

If you want to sample the way Prusa is doing it, 12 samples from input 1 and 1 sample from input 4, you send this:

    uint8_t sample_plan[4] = { 0x0C, 0x00, 0x00, 0x01 };

I can pass the current_channel and next_channel values to the read() method so the sensor doesn’t have to keep track of that info. Iterating that array isn’t a complex algorithm.
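As a rough illustration of how simple that iteration is, here is a Python sketch of walking such a plan. The real thing would live in the C code; the generator name is hypothetical, and input #1 corresponds to index 0 here.

```python
from itertools import islice


def iterate_sample_plan(sample_plan):
    """Yield the input index to read for each successive sample, forever.

    sample_plan[i] is the number of consecutive samples to take from
    input index i before moving on to the next input.
    """
    if not any(sample_plan):
        raise ValueError("sample plan selects no input at all")
    while True:
        for index, count in enumerate(sample_plan):
            for _ in range(count):
                yield index


# The Prusa-style plan: 12 samples from input #1, then 1 from input #4.
prusa_cycle = list(islice(iterate_sample_plan([0x0C, 0x00, 0x00, 0x01]), 14))
```

The current_channel and next_channel values for read() are then just consecutive items of this sequence.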

There are some additional issues:

  • Most sensors have a settling time after switching inputs. The HX717 data sheet gives a settling time of 4/320 Hz (12.5 ms). It’s not clear if the sensor withholds the next sample for this additional time or not. The ADS1263 also has a channel switching settling time. I think this can be handled with a constant configurable time delay after a channel switch in the adc_multiplexer.
  • Different gain values may be needed per input. The HX71x series combines the gain and the channel setting, but other sensors (ADS1263) need them separated. So we might need an additional gain-per-input array as configuration.
  • If the ADC has 4 inputs, can it drive 3 load_cell_endstops and a hot end PT100? In theory you could build a printer like that; it seems like the ‘klipper’ thing to do to support what’s possible.

Well, if you want my 2 cents, I’d echo @mhier’s comments. Prioritize keeping the API simple and keeping the code simple.

Cheers,
-Kevin


Ok. Let me get some of the python code cleaned up first. Then I can do a commit with this idea and we can look at how much of a difference it makes, like on a separate branch.

We would need a custom Heater object that shuts itself off if it doesn’t get a reading for some period of time.

I’m now about 99.999% sure my version is correct: sensor_hx71x.c

    $ TARE_LOAD_CELL
    // Load Cell tare weight value: 14857827
    $ CALIBRATE_LOAD_CELL GRAMS=193
    // Load Cell Calibrates. counts per gram: 71981, max weight: 29834g
    $ READ_LOAD_CELL
    // Load Cell reading: raw average: 13774877, weight: 191.368236g,
     min: 189.489990g, max: 193.018033g, noise: +/-1.764021g,
     standard deviation: 0.975799169225g, trigger weight: 4.87899584613g,
     samples: 1000

Data coming back to the HX711 load_cell:


In my day job, I would intervene at this point because this creates too many dependencies between things that should have nothing in common. We (at my job) are too often in the unfortunate situation that such bad hardware decisions force us to some compromise at the software side. Experience has shown that it is very important to use every possibility (sometimes including those with acceptable impact on functionality) to keep the implementation “clean”, i.e. separate logically-independent parts from each other, even if the hardware realisation seems to dictate something else. Otherwise this might (will?) become a maintenance hell later.

Hence I strongly recommend trying the simplest possible solution first. To me, that means adding support to the HX711 driver for reading out the two ports separately, such that the interface looks like two mostly independent ADCs to the rest of Klipper (other multiplexed ADCs are already supported, since typically the built-in ADCs of the MCUs are multiplexed, so you are often forced to use the same conversion rates on different channels etc.).

That brings me to one obstacle I “hit” a while ago when trying to get my code reviewed in a pull request: Klipper does not yet have a well-defined interface for ADCs. Back then there was some discussion about whether such an interface is wanted. @koconnor was kind of against it, as he argued that the differential (= signed) ADCs used for load-cell measurements etc. would be fundamentally different from the to-date exclusively supported single-ended (= unsigned) ADCs used e.g. for temperature measurements. This example shows that such a distinction is not helpful at all. Hence I believe the first step should be to develop a generic interface within Klipper for reading/receiving data from ADCs. This interface should be defined both on the Python side and on the C side. “Develop” in this context means agreeing on the interface and maybe refactoring the existing code to follow it. IMO this has to come before attempting to integrate anything new into Klipper.

Does this sound reasonable for everyone?

Yes to interfaces! I will endeavor to evolve my code towards that ideal. My Python in particular is far from that right now, and I have a task to split the load cell and probe functionality completely away from the ADC functionality.

But:

At work we call this the “Big Bang Refactor” and when you predicate the success of your project on this you usually fail to ship. Over the last decade I have become very comfortable with the idea that 2 ways to do something is not bad. I think there are several good reasons why these sensors are different from the existing ones and can be a different interface. Things like:

  • The custom C implementations for reads
  • Read methods that return no results (or duplicate results that are useless!)
  • The need to sample continuously at high frequency (400Hz)
  • The need for high timing fidelity (low jitter between sample times)
  • The need to send data to the load_cell_endstop on the MCU
  • Multiplexed inputs to a single delta-sigma sensor with only 1 output

I will try to ship my code as something that can be easily extended to new sensors that have these requirements, but I’m staying away from altering the existing implementation.

Instead, let’s think about how we might make an adapter from this new thing to the existing ADC system. Maybe a cached temp value that goes to None when the sensor stops sending data on that channel? I’m sure we can come up with a way to do it that’s not too disruptive.
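A minimal Python sketch of such a cached value could look like the following. This is not existing Klipper code; the class name, method names, and the timeout parameter are assumptions chosen for illustration.

```python
class CachedChannelValue:
    """Holds the latest value of one ADC channel; goes stale after `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds a value stays valid
        self._value = None
        self._sample_time = None

    def update(self, value, sample_time):
        """Called whenever the sensor delivers a sample for this channel."""
        self._value = value
        self._sample_time = sample_time

    def get(self, now):
        """Return the cached value, or None if it is missing or too old."""
        if self._sample_time is None:
            return None
        if now - self._sample_time > self.timeout:
            return None
        return self._value
```

A heater built on top of this would treat None as "no valid temperature" and switch off, which covers the case where the ADC is locked to the load cell channel.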


(There is a chance there is no PT100 on the shipping boards. It’s defined in the code but I can’t find a reference to that feature online. We won’t know for sure until the board schematics are released or someone takes a multimeter to it.)

I understand and kind of agree, but I have also had the opposite experience. Bad APIs pretty much always cause trouble later, and if corrected too late the “big bang” will get bigger and bigger, up to the point where you can only throw the code away and rewrite it. I guess the middle ground should be found :wink:

I agree, but only if there is a legitimate reason for this. Doing the same thing in multiple ways (typically it does not stay at 2) in the same project just because nobody cared enough to identify commonalities is a huge pain in the ass. Code is harder to understand and hence to maintain, and new features are sometimes a nightmare to implement if they are in the unlucky situation of having to deal with several of these different ways simultaneously. I think we have right now identified exactly such a situation (using an HX71x ADC for both load cell and temperature measurements).

I believe it should not even be very difficult to find an interface which works for almost all use cases. None of the things you list there really contradict each other in any way. I can already derive one requirement from all points: We should use “push-type” data transfers (at least optionally), i.e. the ADC driver should decide when samples are passed on to the algorithms for further processing (in contrast to the algorithms calling some read function).

For algorithms which don’t care about timing so much, there could be a function to return the latest sample, which would be enough e.g. for temperature controllers. If nevertheless the time stamp is available, a temperature controller could even do some check to detect if the ADC is not shipping data any more (because the ADC or multiplexer has been configured differently) to switch off the heater. This would be perfectly abstracted then from any details of the ADC chip.

For the HX71x, this is anyway the only clean way to implement the driver, since the hardware decides when the conversion is ready.
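To illustrate what I mean by “push-type”, here is a very rough Python sketch. Nothing like this exists in Klipper today as far as I know; the class and method names are invented for the example.

```python
class PushAdc:
    """Raw ADC interface: the driver pushes samples, consumers subscribe."""

    def __init__(self):
        self._callbacks = []
        self._latest = None  # (sample_time, counts) of the newest sample

    def subscribe(self, callback):
        """Register callback(sample_time, counts) for every new sample."""
        self._callbacks.append(callback)

    def on_sample(self, sample_time, counts):
        """Called by the driver when the hardware reports a conversion."""
        self._latest = (sample_time, counts)
        for callback in self._callbacks:
            callback(sample_time, counts)

    def latest(self):
        """Newest (sample_time, counts), or None before the first sample."""
        return self._latest
```

A load cell probe would subscribe and see every sample, while a temperature controller would only call latest() and compare the time stamp against the current time.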


FWIW, I’m still unsure of trying to merge in “general purpose adc code” into the existing “temperature adc” code. The temperature code has a few quirks (eg, the temperature conversion and PID updates are done in a background thread in the host software) and it isn’t really scalable to bulk measurements.

If it came to refactoring, I’d probably first look at the current bulk sensor code - the sensor_adxl345.c, sensor_angle.c, and sensor_mpu9250.c have a bit of overlap (in both mcu and host).

Refactoring aside, the load cell sensors seem to have more overlap with the “bulk sensors” code than they do with the “temperature adc” code.

Cheers,
-Kevin

I don’t think this rules out a common ADC interface. I call it an ADC interface, not a sensor interface, because it would only be about reading raw ADC values. Every other aspect, like converting the values into physical units, is of course a different story.

If you think about the example of a temperature sensor connected to a HX71x, how would you solve this without a common interface?

I have not yet looked into the code so deeply. In any case, in my experience such big refactoring is best not done in a single step. We could first agree on a general purpose ADC interface and start implementing new stuff based on that interface. Then we could refactor existing code step by step, maybe even only on an as-needed basis.