These are just my thoughts. In general, I think CPU congestion should not be a problem on modern Linux: there are tools that let Klippy get the CPU time it needs, regardless of the other services running on the SBC.
Test setup: RPi 5, Debian 13 (trixie).
Dummy service:
# /etc/systemd/system/klipper-test.service
# Stand-in for Klipper: stress-ng pins one CPU core forever (-t 0 = no timeout).
[Unit]
After=network-online.target
Wants=udev.target

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
User=user
WorkingDirectory=/tmp
ExecStart=/usr/bin/stress-ng --cpu 1 -t 0
SystemCallFilter=@known
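To load and start it, the usual systemd steps apply:

sudo systemctl daemon-reload
sudo systemctl enable --now klipper-test.service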
So, when I run it, I expect it to use 100% of one CPU core.
What happens when I run a CPU-intensive process as the SSH user? Systemd already splits CPU time between slices by weight, so we could expect that an occasional make -j4 should not crash Klipper with a “Timer too close” (TTC) shutdown.
Let’s run it as a systemd service instead; assume it is a webcam stream, KlipperScreen, or anything else.
$ sudo systemd-run stress-ng --cpu 4
Running as unit: run-p16511-i16811.service
Now they share CPU time.
Let’s renice it: sudo renice -n 5 -p 16513
(repeat for each stress-ng process)
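A convenience one-liner for the same thing, assuming pgrep matches only the stress-ng processes:

sudo renice -n 5 -p $(pgrep stress-ng)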
I’m a little surprised that it works, but okay.
Surprised because it may not have worked, depending on the autogroup placement:
/proc/16416/autogroup:/autogroup-4154 nice 0
/proc/16417/autogroup:/autogroup-4154 nice 0
/proc/16512/autogroup:/autogroup-4178 nice 0
/proc/16513/autogroup:/autogroup-4178 nice 0
/proc/16514/autogroup:/autogroup-4178 nice 0
/proc/16515/autogroup:/autogroup-4178 nice 0
/proc/16516/autogroup:/autogroup-4178 nice 0
The last time I worked around this, it almost never behaved predictably without disabling autogroups: sysctl kernel.sched_autogroup_enabled=0.
According to the autogroup design, CPU time should be split equally between the groups, but that is not what happens here.
Also, nice applied to a single process should only matter relative to other processes inside the same autogroup, so the renice should have changed nothing between groups. Yet right now it clearly does.
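For reference, sched(7) documents a way to deprioritize a whole autogroup at once: write a nice value into its /proc file. With the PIDs above, that would look like:

echo 5 | sudo tee /proc/16513/autogroup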
That sort of thing. On a desktop it is much more tricky - more services, more processes, more groups.
Let’s try another approach: the cgroup integration of systemd.
[Service]
...
CPUWeight=1000
The basic idea is simple: services have weights, and under contention CPU time is shared in proportion to those weights; the default weight is 100.
If the CPU is congested, the scheduler effectively sums the weights of the sibling cgroups at each level of the tree and gives each one a slice proportional to its share of that sum.
If it is not congested, a process can use whatever it wants (as long as no CPUQuota is set).
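As a rough sketch of the arithmetic, assuming klipper-test.service and the stress unit are the only runnable services in system.slice and every core is contended:

klipper-test.service: CPUWeight=1000 -> 1000 / (1000 + 100) ≈ 91% of CPU time
run-*.service:        CPUWeight=100  ->  100 / (1000 + 100) ≈  9%

In practice klipper-test only asks for one of the four cores, so the weight simply guarantees it gets that full core and the stress unit takes the rest.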
sudo systemctl stop run-p16511-i16811.service
# edit /etc/systemd/system/klipper-test.service and add CPUWeight=1000
sudo systemctl daemon-reload
sudo systemctl restart klipper-test.service
sudo systemd-run stress-ng --cpu 4
Running as unit: run-p17082-i17382.service
And the dummy service again gets enough CPU time to do its job.
To get an idea of what the tree looks like, systemd-cgtop provides a live view:
CGroup Tasks %CPU Memory Input/s Output/s
/ 223 400.6 445.4M - -
system.slice 58 398.9 - - -
system.slice/run-p17082-i17382.service 5 297.9 - - -
system.slice/klipper-test.service 2 99.9 - - -
user.slice 11 0.7 - - -
user.slice/user-1000.slice 11 0.7 - - -
user.slice/user-1000.slice/session-84.scope 9 0.7 - - -
system.slice/klipper.service 14 0.7 - - -
system.slice/moonraker.service 11 0.4 - - -
system.slice/klipper-mcu.service 1 0.0 - - -
system.slice/NetworkManager.service 3 0.0 - - -
init.scope 1 - - - -
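To double-check which weight actually applies to a unit, systemctl show prints the effective value:

$ systemctl show klipper-test.service -p CPUWeight
CPUWeight=1000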
So, we can expect services in system.slice to share their CPU time according to their weights.
The system slice itself has the same weight as the user slice (where the SSH user runs its commands), so if both try to use more than half of the CPU at the same time, CPU time is split equally between them:
$ grep . /sys/fs/cgroup/{user,system}.slice/cpu.weight
/sys/fs/cgroup/user.slice/cpu.weight:100
/sys/fs/cgroup/system.slice/cpu.weight:100
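Those slice weights can be adjusted the same way; for example, a sketch of deprioritizing interactive SSH work relative to services (the value 50 is my arbitrary pick):

sudo systemctl set-property user.slice CPUWeight=50

systemctl set-property applies the change at runtime and persists it as a drop-in.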
Because in the first scenario our dummy service only tries to use 25% of the CPU (one core of four), it never experiences congestion.
I expect Klipper to use less than one core’s worth of time under normal circumstances, so I hope it is a fair test.
My summary: I think it would be useful to set a weight on klipper.service in SBC setups where several heavy things can consume all available resources.
And if one dedicated SBC per printer is the expected setup, where the box runs Klipper plus some low-priority processes, it may make sense to set the weight to some high value by default.
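A minimal sketch of how that could look, as a drop-in override (the file name and the value 500 are arbitrary choices of mine):

# /etc/systemd/system/klipper.service.d/cpu.conf
[Service]
CPUWeight=500

sudo systemctl daemon-reload
sudo systemctl restart klipper.service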
Hope that helps someone.