There are no benchmarks comparing CFFI ABI mode vs API mode anywhere on the internet.
I’ve tried some strange performance “optimizations” in different places; this is just one of them, for the public record.
Generally, it seems a CFFI call costs ~120ns in ABI mode and ~90ns in API mode.
So, if the calls themselves are “long” enough, the returns diminish quickly.
TL;DR: there is almost no performance difference (Python 3.13).
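For context, the two modes differ in how the C function gets bound. A minimal sketch of both (the `abs` binding and the module name are illustrative examples, not from Klipper):

```python
import cffi

ffi = cffi.FFI()
ffi.cdef("int abs(int);")

# ABI mode: open an existing shared library at runtime.
# No C compiler involved; each call goes through libffi (slightly slower).
libc = ffi.dlopen(None)  # None = symbols already loaded into the process
print(libc.abs(-42))  # 42

# API mode (sketch, not executed here): generate and compile a C
# extension instead. Calls then go through compiled trampolines
# (slightly faster per call), but a C compiler is needed at build time.
# ffi.set_source("_example_cffi", "#include <stdlib.h>")
# ffi.compile()
```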
...
self.monotonic = chelper.get_ffi()[1].get_monotonic
+ start_time = time.perf_counter()
+ for _ in range(100_000):
+     result = self.monotonic()
+ cffi_duration = time.perf_counter() - start_time
+ logging.info("monotonic call %.9f" % (cffi_duration / 100_000))
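The same harness can be run standalone against any callable to get a baseline for comparison (this snippet is illustrative, not part of the branch):

```python
import time

def per_call_ns(fn, n=100_000):
    """Average wall-clock cost of one call to fn, in nanoseconds."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n * 1e9

# Baseline: calling time.monotonic itself (a C-implemented builtin).
print("time.monotonic: %.1f ns/call" % per_call_ns(time.monotonic))
```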
Raspberry Pi 5 with the CPU frequency fixed at 2 GHz.
# CFFI ABI
monotonic call 0.000000282
monotonic call 0.000000282
monotonic call 0.000000283
monotonic call 0.000000283
# CFFI API
monotonic call 0.000000244
monotonic call 0.000000261
monotonic call 0.000000255
monotonic call 0.000000261
monotonic call 0.000000262
monotonic call 0.000000256
monotonic call 0.000000258
monotonic call 0.000000259
API mode does compile the shared library with the Python interpreter's compilation flags, which can negatively impact the overall performance of the C code. In my case, batch mode on my laptop is ~3% slower.
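The flags in question can be inspected via `sysconfig`: a CFFI API-mode build reuses the toolchain settings CPython itself was built with. A quick check (not Klipper-specific):

```python
import sysconfig

# Flags CPython was compiled with; setuptools reuses these when
# building a CFFI API-mode extension, so they affect the generated .so.
for var in ("CC", "CFLAGS", "OPT"):
    print(var, "=", sysconfig.get_config_var(var))
```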
Just in case anyone is curious, the branch is here: Commits · nefelim4ag/klipper · GitHub
The only issue I probably didn’t solve is redirecting the GCC output to Klippy’s log file.
But it seems to work correctly when Klippy is started normally (not in batch mode).
Thanks.
Btw, I also tried `__slots__` for the Reactor objects. It seems to provide zero attribute-access performance benefit on recent (3.9+) Python versions.
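That observation is easy to reproduce with `timeit`; the class names below are made up for illustration:

```python
import timeit

class WithDict:
    def __init__(self):
        self.x = 1

class WithSlots:
    __slots__ = ("x",)
    def __init__(self):
        self.x = 1

d, s = WithDict(), WithSlots()
# Time raw attribute reads; on modern CPython the two are nearly identical.
t_dict = timeit.timeit("d.x", globals={"d": d}, number=1_000_000)
t_slots = timeit.timeit("s.x", globals={"s": s}, number=1_000_000)
print("dict: %.3fs  slots: %.3fs" % (t_dict, t_slots))
```

`__slots__` still saves memory per instance, but the attribute-access fast path in recent CPython largely erases the speed gap.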