MPU9255 issues on Raspberry Pi 4

I’m getting hit-and-miss readings from the MPU9255 accelerometer (ACCELEROMETER_QUERY returns incorrect readings, and MEASURE_AXES_NOISE returns extremely large noise values).

Sometimes I get good readings, sometimes the X/Y/Z values are swapped around in a cycle, and sometimes I get complete nonsense. This is with the latest master as of yesterday (April 29).

I tracked the issue down to the accelerometer’s FIFO overrunning, as detected in sensor_mpu9250.c at line 209:

    // Detect if a FIFO overrun occurred
    uint8_t int_reg[] = {AR_INT_STATUS};
    uint8_t int_msg;
    i2c_read(mp->i2c->i2c_config, sizeof(int_reg), int_reg, sizeof(int_msg),
                &int_msg);
    if (int_msg & FIFO_OVERFLOW_INT)
        mp->limit_count++;

After testing it a bunch of times (~20), it seems that whenever a nonzero limit_count is reported to klippy the values are garbage, and whenever a zero limit_count is reported the values seem OK.

The issue also seems to “resolve” itself when there is a background process using up 100% of the CPU time (such as the C/C++ tools when developing remotely with Visual Studio). That seems weird; if anything, I would expect background CPU load to make the problem worse.

I tried lowering rest_ticks more and more, but even

rest_ticks = self.mcu.seconds_to_clock(.1 / self.data_rate)

(instead of 4. /)

did not reliably resolve the issue.
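For rough context (assuming the 4 kHz sample rate, 6-byte samples, and a 512-byte FIFO), the polling intervals compare to the FIFO fill time roughly as:

$$t_{\text{poll,default}} = \frac{4}{4000\,\text{Hz}} = 1\,\text{ms}, \qquad t_{\text{poll,reduced}} = \frac{0.1}{4000\,\text{Hz}} = 25\,\mu\text{s}, \qquad t_{\text{fill}} \approx \frac{512/6}{4000\,\text{Hz}} \approx 21\,\text{ms}$$

so even the default interval should leave a ~20x margin before overflow, and the fact that a 40x shorter interval still did not help suggests the wake-ups are being delayed by far more than the nominal rest period.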

I can attach the logs but they are probably not very helpful (and also I added a bunch of my own logging to get the value of reported limit_count and to dump raw bytes).

Good morning! I rewrote the MPU code recently so I am really hoping I did not mess anything up in doing so.

In general, if the FIFO has overrun at some point then it increments that limit_count variable, which is what you are seeing. This, of course, should not happen during normal use and will cause corruption - the X, Y and Z axes together are six bytes of data, so when a few bytes are lost the decoded values turn into weird numbers etc.
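To make that concrete, here is a rough sketch (not the actual Klipper code; it just assumes the usual MPU-9250 register order of X/Y/Z with the high byte first for each axis) of how one 6-byte FIFO sample decodes:

    #include <stdint.h>

    struct accel_sample { int16_t x, y, z; };

    // Decode one 6-byte accelerometer FIFO entry into signed 16-bit axes.
    static struct accel_sample decode_sample(const uint8_t d[6])
    {
        struct accel_sample s;
        s.x = (int16_t)((d[0] << 8) | d[1]);  // ACCEL_XOUT_H, ACCEL_XOUT_L
        s.y = (int16_t)((d[2] << 8) | d[3]);  // ACCEL_YOUT_H, ACCEL_YOUT_L
        s.z = (int16_t)((d[4] << 8) | d[5]);  // ACCEL_ZOUT_H, ACCEL_ZOUT_L
        return s;
    }

There is no sync marker in the stream, so if a byte or two is lost the same code keeps decoding, just with the sample boundaries in the wrong place.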

Could you let me know what you are using as an I2C host, share a photo of your setup/wiring, and share the logs so I can see your configuration and if there is anything else on the bus or interfering? Also have you been able to measure continuity and pull-up resistances of your bus?

Fingers crossed that you are running I2C at 100 kHz by accident, as 400 kHz is necessary to keep up. On that note, can you share your /boot/config.txt, as you seem to be running on a Pi?

Hi,

Sure, I am indeed running a Pi 4B as the host (“Raspberry Pi 4 Model B Rev 1.4”), wired to the board. I tried with both long wires and short wires; that did not seem to make any difference at all (right now I’m back to the long wires). Here’s what the short wires look like:

(I ran with those wires straight to the board earlier, not just through the extender.) It is wired in the recommended three-pairs way.

My understanding is also that there’s no automatic re-transmit or other error correction in I2C, so it can’t really slow down due to wiring?

What does reproducibly make a difference is a background process hogging 100% of a CPU core on the Pi - with that running, it works better.

Regarding your changes: I originally tried the version from before them and it wouldn’t work at all (it complained about a clock in the past, or something like that).

I’ve attached config.txt; I have the line

dtparam=i2c_arm=on,i2c_arm_baudrate=400000

config.txt (2.1 KB)

uname -r reports “6.1.21-v8+” and it is a 64-bit OS.

The pullup resistors on the board itself are 10K; I’m not sure what the safest way is to test the ones on the Pi.

I tried a few other things. Since the board hasn’t got level shifters, I tried powering it from 5 V instead (just in case regulator dropout is somehow a problem); it didn’t make any difference.

As a next troubleshooting step, I was thinking I would report the amount of time spent sending vs. receiving, to see whether data is being received too slowly, sent too slowly, or whether the wake-ups are simply too late.

The weird thing to me is that it improves when there is background CPU load, which makes me think it is a wake-up issue, or something with the CPU governor.

By the way, is there any way I can test this without having to power on the whole printer? Does it work if I make a config file with just the accel tester? (I’ll try on the weekend.)

A few follow-ups:

For the I2C run from the Pi to the device, how long are “short” and “long”?

Are there no other I2C devices on the bus?

Pullups:
Research suggests that the Pi 4 (despite its schematic not showing them) has 1.8K pullups. To confirm this, can you measure GND to SDA and GND to SCL? With the printer and Pi powered off and wired to the MPU, you should be able to measure without risk.

I am also running a 64-bit kernel. Is your kernel custom or supplied by the RPi Foundation? (This is a long shot.)

Notes:
I2C speed setting seems just fine.
On testing: sadly the printer has to be on (AFAIK). Klipper is designed to look for a printer controller and will error out without it, and the controller is typically wired to the printer PSU.

The CPU loading issue is similar to some problems I had in trying to make Linux behave properly. I do not currently know why it is being a pain about waking up in enough time to work properly.

Additionally, reducing the multiplier in rest_ticks = self.mcu.seconds_to_clock(.1 / self.data_rate) should cause the Linux MCU to wake more rapidly, preventing the overflow. Sigh, I don’t have a Pi 4 to test with, and I don’t think Klipper has a loan program for me to try one.

Hi,

Thanks for the pointers.

The short run is 7 cm; the long one is 90 cm total (an 83 cm extension on the 7 cm). I couldn’t see any difference using either. I checked the resistance from 3.3 V to SCL and SDA and got 1.66 kΩ, which matches having a ~2K pullup on the Pi in parallel with the 10K pullup on the device.
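For reference, assuming those nominal values (2K on the Pi, 10K on the board), the parallel combination works out to about what I measured:

$$\frac{2\,\text{k}\Omega \times 10\,\text{k}\Omega}{2\,\text{k}\Omega + 10\,\text{k}\Omega} \approx 1.67\,\text{k}\Omega$$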

I think it’s the RPi Foundation kernel, judging by the uname string; I haven’t really changed anything, it’s just a vanilla Debian-based install. The only unusual thing is that I have it on a USB 3 drive rather than on an SD card.

I’m going to look at it when I have the opportunity; I’ll try reporting the total time, the sum of transmit durations, and the sum of receive durations.

The more I think about it, the more I think that it working well under 100% CPU load (from another process, that is) has got to be the clue.

Some sort of power-saving feature is preventing it from working correctly. I don’t know if there’s some core hopping happening as well, although that doesn’t seem like it should be a significant enough issue. The sample rate is 4 kHz, so the data rate should be about 192 kbit/s. So there is some room on a 400 kHz bus, but it’s cutting it close enough.
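Rough numbers (assuming 6 bytes per sample and about 9 bus clocks per byte, ignoring register-address and start/stop overhead):

$$4000\,\tfrac{\text{samples}}{\text{s}} \times 6\,\text{bytes} \times 8\,\tfrac{\text{bits}}{\text{byte}} = 192\,\text{kbit/s}, \qquad \frac{400\,\text{kHz}}{9\,\tfrac{\text{clocks}}{\text{byte}}} \times 8\,\tfrac{\text{bits}}{\text{byte}} \approx 356\,\text{kbit/s}$$

so the payload alone is already more than half of what the bus can realistically move.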

How does the fake serial port work, by the way? Edit: i.e., is it definitely non-blocking?

A quick test with an MPU on an RPi Pico gave exactly the results I expect from this printer. I noticed nothing strange.


Yeah, this sounds like it’s entirely a Raspberry Pi side issue: with all the non-realtimeness of Linux thrown in, it can’t keep up for some reason, except when there’s a background process hogging at least one core - then it works fine.

I bet this is due to aggressive power saving in the CPU governor. The 100% CPU is preventing these low power states and therefore the timing in the MCU is being kept.

Rebuilding the Linux MCU process to use the guaranteed scheduling system, SCHED_DEADLINE (see the Wikipedia article), is probably the right “silver bullet” for this problem. The only issue is that the current Linux MCU process and its sleeping strategy are the wrong gun for that bullet. SCHED_DEADLINE is the only scheduler that actually guarantees a percentage of CPU within a selectable period of time.

Perhaps it would be worth giving up sleeping while polling the MPU, which seems wasteful but would probably be way more deterministic?
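Something like the sketch below, just to illustrate the idea (this is not how the Klipper code is structured, and the helper name is made up): spin on the clock until the deadline instead of asking the scheduler to wake us.

    #include <time.h>

    // Busy-wait on the monotonic clock until a deadline; burns a core,
    // but the wake-up time no longer depends on the scheduler or governor.
    static void spin_until(const struct timespec *deadline)
    {
        struct timespec now;
        do {
            clock_gettime(CLOCK_MONOTONIC, &now);
        } while (now.tv_sec < deadline->tv_sec ||
                 (now.tv_sec == deadline->tv_sec &&
                  now.tv_nsec < deadline->tv_nsec));
    }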

I’m surely no developer, but:

  • I have not seen any reports (except this one) that the MPU IMUs are causing an issue on the RPi
  • I tested the original code on a RPi 3B as well

So if your changes did not affect the CPU scheduling, why should this be an issue now?


Just to add to this:
Before we go barking up the wrong tree: maybe it makes sense for @Dmytry to check out a commit from before the code change and see if the error reproduces. It could be a faulty chip or some other issue in the end.

@Dmytry did mention in MPU9255 issues on Raspberry Pi 4 - #3 that he checked out a previous version but hit the “Timer too close” bug my changes were supposed to address.

It certainly could be a faulty chip of some kind; the MPUs have been known to ship with refurbished chips etc. causing issues. @Dmytry, do you happen to have an alternate MPU to try?

I agree on not rushing in to blame the power saving; however, each embedded system has some manufacturer-specific behavior, especially around the CPU governor.

@Dmytry, does the reliability situation change if you force the performance governor instead of the typical ondemand?

 sudo apt install cpufrequtils
 sudo cpufreq-set -g performance

[Raspberry Pi Documentation - Raspberry Pi hardware]


I think that may have done it! Thanks a lot!

MEASURE_AXES_NOISE works, and my exceedingly verbose logging seems to show 100% perfect packets (I have the accelerometer placed at an angle so that only the X axis is horizontal, so all the packets start with a zero).

I’m going to test some more. I had also changed “if (mp->fifo_pkts_bytes >= data_space)” to a “while” earlier, which wasn’t helpful at all, and I forgot to change it back before doing this (successful) test with the performance governor. Edit: it’s probably better with an “if” anyway; I assume the reason it’s not a while is timing-related errors and shutdowns?

I’m going to make it set that on boot; it seems like an all-around better setting to have on a printer, just in case. It’s not like the Pi is a significant power consumer compared with everything else that’s going on.

Let me know how it goes, and if there are any other gotchas. I wonder if there is a way to change the CPU governor while the MPU measurement is happening?

It is probably the Pi 4’s more advanced DVFS and fine-grained power saving introducing a lag or latency before running tasks that lies at the root of this. That, and the Linux MCU process does not currently tell Linux what it needs.

Basically the Linux MCU process completes a task and tells the Linux scheduler “wake me up if serial traffic comes in from Klippy, or after at least 0.X seconds have passed so I can run my next task”. However, like most non-driver ‘normal’ programs on any OS, the sleep()-type calls available are a minimum, not a guarantee; e.g. sleep(10) will sleep for at least 10 s, but there is no guarantee it will wake before 11 s, and 12 s is just fine. As I said above, “after at least 0.X seconds have passed”.
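A throwaway test program (my own sketch, nothing to do with the Klipper sources) shows that “at least” behavior directly: ask for 500 µs and see how late the wake-up actually is.

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        // Request a 500 us relative sleep repeatedly and print the actual
        // elapsed time; the overshoot is the scheduler/governor wake-up lag.
        struct timespec req = { .tv_sec = 0, .tv_nsec = 500 * 1000 };
        for (int i = 0; i < 10; i++) {
            struct timespec a, b;
            clock_gettime(CLOCK_MONOTONIC, &a);
            clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL);
            clock_gettime(CLOCK_MONOTONIC, &b);
            double us = (b.tv_sec - a.tv_sec) * 1e6 +
                        (b.tv_nsec - a.tv_nsec) / 1e3;
            printf("asked for 500.0 us, got %.1f us\n", us);
        }
        return 0;
    }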

With the ondemand strategy, Linux optimizes for power while managing the task list. So it takes opportunities to underclock and shut down as much of the SoC as it can, including cores etc., while keeping the tasks moving - effectively it periodically ‘overclocks’ itself by rapidly increasing the voltage, upclocking, enabling cores, and then running tasks in parallel for as short a time as it can, and then it turns everything off again. It is constantly ‘overclocking’ and ‘underclocking’ itself - almost on demand, you could say :slight_smile:.

At the moment the Linux MCU process has not told the scheduler “I MUST have an opportunity to execute for 0.Y seconds every 0.X seconds” (SCHED_DEADLINE). Instead it has told Linux “when you consider which task to run, give this program priority over all normal operations” (SCHED_FIFO). So the scheduler thinks it is doing a fine job by just waiting that bit longer to save more power, and it fails to keep up with the MPU.

It is not simple to program for SCHED_DEADLINE as it will require some re-architecting of how the Linux MCU process works, and at the moment the code is almost identical to what runs on a controller board.
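For what it’s worth, the setup call itself is small. Below is a hedged sketch using the raw syscall (glibc has no wrapper); the runtime/period numbers are entirely made up, the struct is copied by hand from the kernel uapi, and it needs the right privileges (CAP_SYS_NICE). The hard part remains restructuring the event loop so each activation does its work and yields.

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #define SCHED_DEADLINE 6      /* value from <linux/sched.h> */

    struct sched_attr {           /* mirrors the kernel uapi layout */
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;
        uint32_t sched_priority;
        uint64_t sched_runtime;   /* guaranteed CPU time per period (ns) */
        uint64_t sched_deadline;  /* runtime must be granted by this (ns) */
        uint64_t sched_period;    /* activation period (ns) */
    };

    int main(void)
    {
        /* Ask for up to 200 us of CPU in every 1 ms period. */
        struct sched_attr attr = {
            .size           = sizeof(attr),
            .sched_policy   = SCHED_DEADLINE,
            .sched_runtime  =  200 * 1000,
            .sched_deadline = 1000 * 1000,
            .sched_period   = 1000 * 1000,
        };
        if (syscall(SYS_sched_setattr, 0, &attr, 0) != 0) {
            perror("sched_setattr");
            return 1;
        }
        /* ... periodic work goes here; sched_yield() ends an activation ... */
        return 0;
    }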

@Dmytry - I have been reading more; can you try the schedutil governor and see if it fixes the issue? In theory this governor is integrated with the scheduler itself (unlike ondemand or performance), so it should be sensitive to when tasks want to execute.

It should save more power than ondemand with better latency. I will have to try it on my Pi3, and perhaps recommend it for Klipper if its claims are true.

I tried testing a bit; schedutil doesn’t seem to work reliably - I can get readings most of the time if I just check noise or do a single query, but if I try to measure resonances I see that the data gets corrupted.

I get an occasional “Timer too close” crash with either governor.

Unrelated to the sensor itself - is there any interest in having sensible units in the resonance measurements?

Right now they are not related to the excitation signal in any way, i.e. the values get bigger if you set a greater accel_per_hz, etc. For input shaping I think you would want the transmissibility to be as close to 1 as possible, rather than as low as possible. It’s probably fine for typical machines, but it is not very usable for machines incorporating a level of mechanical damping that is significant relative to the frequency.

Thanks for testing the scheduler. I guess I should not have held out too much hope, especially as we never actually tell Linux what is needed.

I should do some proper deep profiling before trying any changes, including tracing kernel syscalls, to see what it is really up to and where it is spending its time. If I can catch it often enough I might be able to get a trace of its sleeping/waking behavior as well.

As for the units, that is a non-trivial question I have not looked into. Starting from the perspective that we are sampling acceleration with low-cost, non-calibrated sensors, this might be most similar to a power spectral density (see Spectral density - Wikipedia).
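For reference, the quantity I have in mind is something like the PSD of the measured acceleration, which would give units of (m/s²)²/Hz:

$$S_{aa}(f) = \lim_{T \to \infty} \frac{1}{T} \left| \int_0^T a(t)\, e^{-2\pi i f t}\, dt \right|^2$$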

The mpu9255 doesn’t drop whole samples on an overflow? That’s certainly odd. I understand that samples would get lost, but it is strange that the x, y, and z data would get intermingled.

-Kevin

I’ve observed the corruption problem; however, I have not spent a lot of time analyzing it, so it is likely I described it inaccurately. Before the patches to improve the I2C bus usage it would happen with the accelerometer query commands fairly often. So what follows is better than a guess, but hardly full knowledge.

Known Facts:

  1. The MPU FIFO size is a power of two; the smallest is 512 bytes, but different models have larger ones, e.g. 1024.
  2. The FIFO has two user-selectable modes upon overrun:
    1. Overwrite oldest existing data [<- Klipper sets this, or perhaps fails to choose the alternative?]
    2. Discard latest reading
  3. The accelerometer data for X, Y and Z is six bytes, two bytes per axis, with no marker as to the start and end of any group.

Informed guess:

The FIFO is a ring buffer, and a whole number of six-byte packets cannot fit into it. When a FIFO overrun occurs, the overwriting appears as a transposition of some of the axes, effectively swapping the order of the X, Y and Z values.
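A quick way to see why it looks like swapped axes (assuming the smallest, 512-byte FIFO):

$$512 = 85 \times 6 + 2, \qquad 512 \bmod 6 = 2$$

so when the write pointer wraps, the six-byte sample boundaries land two bytes away from where the reader expects them, and high/low bytes from one axis get interpreted as belonging to another.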

Okay thanks.

I wonder if “discarding the latest reading” may be a bit more “sane” then.

-Kevin

What I don’t understand, though, is how the readings end up shifted by an odd number of bytes (which I have also observed).

I guess “overwriting the last entry in the queue” at the moment the queue is being read could cause something like that.

Maybe switching it to “discard” mode would fix that; I’ll look when I have a chance.

Still, we need to report errors, because any kind of “cut” in the data would mess up the resonance readings.