A few years ago I put together some experimental code to calibrate the FOPDT parameters for heaters. Unfortunately, the experimental code did not work well at all - it was unable to adequately respond to changes in the environment. However, I did find the calibration code (resulting from a new HEATER_CALIBRATE command) did a remarkably good job of reliably measuring the FOPDT parameters (even though the experimental control code could not effectively utilize it).
I spent a considerable amount of time in the past researching PID tuning techniques, initially prompted by the hot end temperature oscillations that I have seen associated with the (now reduced) first-order derivative filter time constant in Klipper. This progressed into manually tuning my heated bed constants because, unlike the hot-end, the Klipper built-in relay method did not produce optimal results, producing large-ish overshoot and somewhat long settling times.
Ultimately I ended-up performing a series of manual response step-tests, deriving the FOPD model parameters and defining the PID constants using the Cohen-Coon method. I now have rock solid and, dare I say, “beautiful” bed response with no more than 0.2 degree overshoot, fast response, and zero stability issues.
I keep meaning to document the method and to post it here, but unfortunately other priorities keep interfering.
I finally posted a “walk through” for one possible approach to manual PID tuning:
Please feel free to critique or correct any errors or omissions. Hopefully it will help some people, particularly in some of the corner cases that seem to exist where bed temperatures do not converge (although some of these cases may have other systemic causes).
(Note I edited the below because when I wrote the original content I was being very distracted by my 8 year old and some of it was totally out to lunch )
These values are obviously somewhat different from what I obtained manually. The overall process gain is quite close though!
There is a difference in the process time constant. My manual method only uses the heating “side” of the process, not the cooling side. On my printer the cooling of the bed occurs at a substantially slower rate than heating, because the bottom surface of the bed heater is insulated and the bed has a magnet and a flex plate on it. This creates a very asymmetrical “single-acting” system or process. Again, I have not looked at the code, but I noticed that the HEATER_CALIBRATE command was waiting for the bed to cool down before calculating the process constants.
I am not sure what to say about the delay_time calculation, the attached data shows about 2-3 seconds delay in the sensor response to the heater command, shorter on the heating side of course than on the cooling side. But even when playing with it manually, there is a large variability in the delay / dead time. This is likely not an issue anyway because the overall process time constant is two orders of magnitude larger than the dead time (on my printer).
Thanks very much again for all the time you devote to making Klipper the absolutely best 3D printer firmware! I also wanted to emphasize again that I am by no means a PID control loop expert. Not even close…
At first glance, it does seem the delay should be slightly higher and the gain slightly higher.
However, I suspect the reason for the mismatch is that your bed temperature sensor is close to your heater and the code is detecting that the sensor values are actually higher than the “bed temperature as a whole”.
That said, I’m not sure if the proposed values would be better or worse in practice.
Separately, yes, cooling is used during the determination of the time_constant. As far as I know, it is valid to do that.
Thanks very much Kevin, I like the plot as it illustrates very well the difference in heating and cooling slopes.
I tried looking through the relevant code a bit, but quickly got lost in the details. Primarily I wanted to confirm that the process gain is being expressed in degrees C over normalized power, rather than normalized temperature over normalized power like it is in the PID controller. Out of curiosity what mechanism does the code have to determine that the sensor is installed on or near the heater? Does it look at the differences in heating vs. cooling dead time perhaps?
Just for kicks, since I had your FOPDT branch active, I run the bed with FOPDT control using the auto calibrated values and then the process values that I obtained manually. This is what I got:
In both cases the control loop seemed to be a bit reluctant to zero-in on the target temperature. With the auto-tuned process coefficients it slightly overshot and then took several minutes to get onto 70.0, slowly approaching it from below. With the hand-tuned coefficients it arrived at 70.0 very quickly, but then a minute or so later drifted up to 70.1 and did not seem to want to go back down…
I understand from my research that pulse testing can be as valid as step testing in characterising the process. One aspect that goes beyond my understanding, though, is how the various methods are affected by the differences between single-acting (like heating only) vs. double-acting (like heating & cooling capability) processes. Particularly in cases of highly asymmetrical processes like my bed heating vs cooling slopes where heating is fast and is the only method to actively control the process, while cooling is much slower and “passive”.
Coincidentally, I came across an interesting paper on subject of FOPDT process curve fitting:
The gain is the number of degrees above the ambient temperature that the heater would approach if you left it on full power for a long time. It is a measure of how powerful the heater is relative to how much heat the bed dissipates to the environment. So, with an ambient temperature of 25 degrees, a bed with a gain of 220 would approach a temperature of 245 degrees if left full on. If set to 50% power, it would approach 135 degrees.
The time_constant is a measure of the “thermal mass” of the bed - a higher value means it takes more power to increase the bed temperature. A bed with a high “thermal mass” also requires a long time to cool.
The key formula (omitting delay) is: expected_temperature = ambient_temperature + heater_pwm * gain * (1. - exp(-heat_time / time_constant))
And, thus, if heat_time is large then that simplifies to: expected_temperature = ambient_temperature + heater_pwm * gain
I’m not sure I understand your comments about “heating vs cooling slopes”. Fundamental to the FOPDT model is the idea that a heater has a thermal mass that is independent of heating vs cooling. The HEATER_CALIBRATE code calculates that thermal mass using measurements during both heating and cooling. (The gain calculation only uses measurements during heating, as there is no gain when the heater is off.)
Thanks for simplifying the algorithm, it helps to understand the approach.
I do in fact understand quite well the FOPDT process constants, including gain, and how to derive them by hand by evaluating the process response to step input. My confusion with the gain use in Klipper was largely self induced and it was related to PID_PARAM_BASE in the code. I have previously looked at the code, obviously superficially, and I noticed that the PID constants stored in printer.cfg were being divided by 255 prior to use in the PID control loop. This led me to believe that the PID loop was using a non-dimensional value of temperature that was scaled on 255. On the other hand I noticed that the FOPDT code was using actual temperature in “engineering” units. I have since corrected my erroneous assumptions and also revised my manual derivation post. Coincidentally and for obvious reasons it made no difference in the final PID constant values, it just affected the point at which the 255 factor was applied in the approach.
Going back to gain, I think that the biggest reason for the difference between my manually derived step-based gain and the FOPDT pulse method originates in the fundamental approach. Specifically, the manual method calculates what I would call “local” process gain based on response to small input around the target operating point, in my case 70 deg.C. This value of gain will be different for the same step input at a different target operating point, becoming lower as that target is increased.
With respect to my comment about heating vs. cooling slopes I realize that I was not sufficiently clear. I was trying to simply highlight that when using a step test, the process time constant and perhaps dead time results could be different depending on whether a positive step or a negative step is used in a highly asymmetrical process. A pulse test would produce yet another somewhat different set of process constants, perhaps more representative of the overall process.
Out of these three widely used first-order process characterization approaches, which one would be the most representative and most applicable to a “single acting” process such a 3D printer heater system? Or perhaps any such differences would be trivial in nature. I smell another possible experiment coming when I find some free time.
I am not sure if the above makes sufficient sense to you.
Ah, yeah, the 255 is in there because other firmware (eg, Marlin) do that - when Klipper was first released it didn’t have the PID_CALIBRATE command and being somewhat compatible with other firmware was sometimes helpful. It’s an annoying legacy thing. It would be hard to change now as it would break everyone’s current config. (So, not worth the havoc.)
As far as I know, the fopdt HEATER_CALIBRATE tool in this experimental branch works as intended. It might be worth merging into the master branch. (As a tool to help users manually calculate PID parameters.)
I did notice that your manual method uses a power rate of 20%. That may be a better testing mechanism that the full-on system the tool above uses. (In particular, with high power / high thermal mass heaters like your bed.) It’s possible to get the same results by temporarily lowering the max_power setting to 0.20 in the heater config - but adding a command-line option might be worthwhile to implement.
I suspected that the 255 was there for commonality with Marlin, since the multiplication and division is occurring in the code on both sides of the configuration file access.
I think having an automated “advanced” command to obtain reasonably accurate FOPDT process parameters would substantially simplify any customisation of the PID constants. I think including it is a great idea. To that point, I have been experimenting with various methods of process characterization and I noted that obtaining a reasonably accurate process “dead time” value is a big challenge. The online pidtuner.com fitting often produces unpredictable results, and the dead time value is absolutely critical in defining the gain related PID constants.
I am currently evaluating the method described here: FOPDT Modeling that seems to be very promising and leans itself to reasonably easy automation. This could potentially be a much more accurate and repeatable method than my manual “step test”. If you have a moment (which I am sure you do not), it would make for an interesting read. I will post my results here once I have the complete data set from my experiments with the hot end PID tuning.
NB - my basic approach to the step test relies on first determining the average PWM drive associated with the desired temperature set point, then introducing a small step change to that PWM drive. As I suspected the process gain is the most affected by using this approach, since it produces gain specific to the desired operating point. The process time constant and dead time are not that significantly affected.
I’m not sure what you are referring to at that link - there seems to be several papers there. FYI, there is overlap between what is described there and what the HEATER_CALIBRATE command already implements. For example, in one of the papers it says This regression method seeks to fit the model to all data points (after the delay, actually N-minus-Ndelay number of data points), not just the selected several initial, final, and two intermediate points in a parametric fit. So, it better rejects noise and disturbances over classical reaction curve or parametric approaches. - which is a good description of how HEATER_CALIBRATE works.
Although HEATER_CALIBRATE might seem complex, it’s actually quite simple. The idea is, if we have a good model for the heater we should be able to predict the heater temperature just by knowing when and how much PWM power has been applied to the heater. That is, if we have an appropriate model (eg, FOPDT), good parameters for the model (eg, gain, time_consant, delay_time), then if we run a heating test the results from just the model parameters should closely match the measured values. That’s the key to how the HEATER_CALIBRATE command works - we take a guess at the fopdt parameters, plot out what the temperature would be with those parameters, and compare it to the actual measurements. We then try repeatedly tweaking the guesses until the model results closely match the actual measured temperatures. The process of continually tweaking parameters until we find the best is known as coordinate descent. It’s also known as an “optimization” or “minimization” process.
FWIW, I found that HEATER_CALIBRATE did a very good job of modelling real-world heaters. If you look at the graph I posted above, the red line is the temperatures based purely on the model and the green line is the actual measurements. The test lasted around 15 minutes, and the model was never more than a couple of degrees off from actual, at any time during the test. That’s pretty impressive!
My apologies, the point I was trying to make was insufficiently clear once again. What specifically peaked my interest was the skyline input function that the author claims “has advantages in operational duration, magnitude of upsets, and number of excitations over classical methods”. This is in the context of my difficulty obtaining accurate process dead time using one of those “classical methods”.
NB, I totally agree that the HEATER_CALIBRATE model that I tried from your experimental branch shows a very good correlation to actual data. I just tend to always be on a lookout for alternative analytical methods and techniques.
Coincidentally, one of my academic interest areas back in the early 90s was numerical optimisation, specifically multi-dimensional aviation propeller geometry optimisation. Back then I used the Davidon-Fletcher-Powell method with exterior penalty function constraints.
That is interesting. I agree that changing input (instead of just the basic “bump test”) could improve the results.
One of the things I had considered in the past was extending the existing PID_CALIBRATE “relay test” to also calculate the fopdt values. The “skyline” method has some similarities to the “relay” method (though relay may be too simplistic). The main reason I didn’t alter PID_CALIBRATE is that the current fopdt calculation math can take a long time to process.
Have you guys looked into how Duet/RRF is doing it? They’ve done a lot of stuff with FOPDT tuning in their latest release and it’s been working pretty darn great since the latest RRF 3.4 release (had some similar issues you guys are discussing before the last few fixes):
It also behaves pretty nicely on environmental changes (fan).