Klipper Log Visualizer

I have published a complete rewrite of the Klipper log graphing tool:

  • Updated parser to hopefully now catch all edge cases that had been causing weird jumps in the timeline.
  • Improved session management.
  • Synchronized Zoom – this means when one graph is zoomed, all other graphs will follow. Note: On big logs this can be a major performance hog, as all graphs need to be fully recalculated. Best used with session filtering to concentrate on the data of interest.
    • To give a feeling for the scale: a 10 to 12 MB log file can contain around 10,000 stats lines with, depending on the number of MCUs or CAN nodes, around 80 values per line → 800,000 points.
  • Integrated CAN bus statistics.
  • More modern look.
  • Known issue: Double-clicks for resetting a zoom are not always picked up reliably. Unfortunately, I don’t have the slightest clue why. If anyone knows, I’d appreciate a hint.
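
To get a rough feel for the numbers in the estimate above, the following Python sketch counts stats lines and `key=value` tokens in a log. It assumes that stats lines start with `Stats ` and that every plotted value appears as one `key=value` token. The function name `count_stats_points` and the sample lines are made up for this illustration and are not part of the Visualizer.

```python
def count_stats_points(log_text: str) -> tuple[int, int]:
    """Return (number of stats lines, number of key=value tokens in them)."""
    stats_lines = 0
    values = 0
    for line in log_text.splitlines():
        # Assumption: klippy.log stats lines start with "Stats <timestamp>:"
        if not line.startswith("Stats "):
            continue
        stats_lines += 1
        # Treat every whitespace-separated token containing "=" as one value
        values += sum(1 for token in line.split() if "=" in token)
    return stats_lines, values


# Made-up sample, much shorter than a real stats line
sample = (
    "Stats 100.0: gcodein=0 mcu: mcu_awake=0.002 bytes_write=1200 bytes_read=4000\n"
    "Some unrelated config line\n"
    "Stats 101.0: gcodein=0 mcu: mcu_awake=0.003 bytes_write=1300 bytes_read=4100\n"
)
lines, points = count_stats_points(sample)
print(lines, points)
```

With roughly 10,000 stats lines and 80 values each, the same count lands at about 800,000 points.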

Same place: https://sineos.github.io/

Feedback on issues etc. is, of course, highly welcome.

Edit:

Pushed some smaller changes:

  • Added bytes_retransmit and bytes_invalid to the graphs.
  • Removed upcoming_bytes as it is not so relevant.
  • Added generic temperature sensors.
  • Switched to the Plotly basic library for faster initial loading.

This is fantastic, thank you very much!

Great work, well done!

I tried the huge klippy.log from “MCU randomly disconnect - #7 by erk” with your Visualizer in different browsers.

Firefox: very slow
Chrome: ok
Edge: ok

Firefox may be a bit (not very) slow, but it has fewer security concerns than the other two. Also, I have far more options to set it up to my needs.

Don’t get me wrong, Eddy. I think Sineos’ tool is awesome. I have used it almost every day since the first version.
What I wrote above was not very “engineering like”. It was just my quick impression on MY computer.

I have three kind wishes for future versions, if easily possible:

A field on the right here, where I can paste the “klippy.log time code value” (format: line number in klippy.log) I’m interested in, so that the marker jumps to that part of the graph and zooms in around that time code. I guess you would need to suggest suitable editors for viewing klippy.log.

I would like to have a correlation to the “klippy.log time code”. What about two time lines on the x axis?
Real time code and “klippy.log line number” below.

Make the older versions available to us on GitHub (Sineos/sineos.github.io); sometimes they are useful. A version number would also be helpful.

Yes, it is known that Firefox is the slowest in such tasks.

Well, with my coding “skills” (cough), nothing is easy. But I intend to extend it a bit, especially regarding the cross-reference to the actual log. Not sure if this will work, because after parsing, the log is basically discarded and only the graphing data retained. Otherwise, I’d need to keep all, also the “non-stats noise” like configs, etc.
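
One conceivable way to cross-reference chart points with the log without keeping the whole text is to store only the line number of each stats sample next to its timestamp. The Python sketch below is purely illustrative and not how the Visualizer works. `build_line_index` and `nearest_sample` are hypothetical names, and the sketch assumes that stats lines start with `Stats <timestamp>:`.

```python
import bisect


def build_line_index(log_text: str) -> list[tuple[int, float]]:
    """Return (line_number, timestamp) pairs for every stats line (1-based)."""
    index = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if not line.startswith("Stats "):
            continue  # config dumps and other "non-stats noise" are skipped
        try:
            # "Stats 123.4: ..." -> 123.4
            timestamp = float(line.split()[1].rstrip(":"))
        except (IndexError, ValueError):
            continue  # malformed stats line, ignore it
        index.append((number, timestamp))
    return index


def nearest_sample(index: list[tuple[int, float]], line_number: int):
    """Find the stats sample whose log line is closest to line_number."""
    if not index:
        return None
    numbers = [number for number, _ in index]
    pos = bisect.bisect_left(numbers, line_number)
    # Only the neighbours just before and at the insertion point can be closest
    candidates = index[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda item: abs(item[0] - line_number))


# Made-up five-line log: two stats lines with noise in between
demo_log = (
    "Stats 10.0: a=1\n"
    "noise\n"
    "noise\n"
    "Stats 11.0: a=2\n"
    "MCU 'mcu' shutdown: example\n"
)
demo_index = build_line_index(demo_log)
print(nearest_sample(demo_index, 5))
```

The index costs two numbers per stats line, so the memory overhead stays small compared to retaining the full log.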

Why would you want to actively retain the old version?

Updated version is online:

  • Introduces a Log Explorer tab with a log line viewer that displays context around selected chart points.
  • Refactors dashboard session filtering to use the web worker, adds middle-click support for chart points to open the log viewer, and updates styles for tabbed navigation and log viewing.
  • Optimizes log parsing and metric construction for performance and consistency.

Tests and feedback welcome.

Btw, since you are rewriting the tool:
A small, slightly strange suggestion: it should be possible to extract CANBUS_FREQUENCY and SERIAL_BAUD from the log, match them with the MCU, and from that guess the actually available bandwidth.
*Not something urgent or required, just an idea.

Thanks.

Not sure that I understand your request, @nefelim4ag. Can you elaborate?

The log parser assumes the bandwidth to be 250_000 bits per second, i.e. ~ 25 kbyte per second of usable data.

But in reality, it depends on the configuration of the board. The number could be guessed, but I hope it is good enough to roughly estimate it from the board configuration: 1/10 of the serial baud rate is the actual byte bandwidth, so 250_000 baud ~= 25_000 bytes per second. For CAN it is slightly more complicated, see klippy/chelper/serialqueue.c in the Klipper3d/klipper repository on GitHub.
Probably ~ 500_000 / (47 + 64) * 64 / 8 ~= 36_000 bytes per second (36 kB/s)
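
The two estimates above can be written down as a small Python sketch. The function names are made up for illustration. The factor 10 is taken from the "1/10 of the baud rate" rule (presumably one start bit, eight data bits, and one stop bit per byte), and the CAN figure uses the 64 data bits plus 47 overhead bits per 8-byte frame from the formula above.

```python
def uart_bytes_per_second(baud: int) -> float:
    """Usable bytes per second on a UART link: 10 bits on the wire per byte."""
    return baud / 10


def can_bytes_per_second(frequency: int) -> float:
    """Rough upper bound for CAN: 8 payload bytes per (64 + 47) bits on the bus."""
    return frequency / (47 + 64) * 64 / 8


print(uart_bytes_per_second(250_000))        # 250 kbaud UART
print(round(can_bytes_per_second(500_000)))  # 500 kbit/s CAN
```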

It is just a suggestion.

Thanks.

I like the proposal.
It seems challenging, as each session would need to be parsed individually. Would you agree that:

  • CANBUS_FREQUENCY can only be changed with a service restart → Log is reset
  • SERIAL_BAUD can change within one log, since it only needs a FIRMWARE_RESTART

Edit:
In addition wouldn’t we have to differentiate between

  • UART with 250_000 bits per second
  • UART over USB also with 250_000 bits per second
  • Native USB (likely 12Mbit/s full-speed USB as negotiated between host and device)

in order to get a meaningful scaling?

IIRC:
If you see baud rate in the log, it is fixed inside the MCU and should be accurate.
You can’t change the host config to a different speed because the baud rate is fixed on the MCU side.

I think there will be no difference if there is a USB2TTL device in between (we are still limited by the MCU).

$ cat out/klipper.dict | jq . | grep -i SERIAL_BAUD
    "SERIAL_BAUD": 250000,

In the case where the MCU emulates UART/CAN, there is no constant in the log, so no limit. The only tricky part is if the MCU works as the CAN BRIDGE.

$ cat out/klipper.dict | jq . | grep -i CANBUS_BRIDGE
    "CANBUS_BRIDGE": 1,

But in this case, the constant seems to be also absent, so probably RPI ↔ USB ↔ MCU have the same performance as general USB2Something (12Mbit).

Mmmm, maybe just limit from the above to 1Mbit? Assume USB 1.0 as the worst case.

I would also add that for UART, there is an independent TX and RX bandwidth.
(so 250k for TX and 250k for RX).
But for USB/CAN, it is a shared bus, so the bandwidth is shared between RX/TX data.

Hope that helps.

Thanks.

Thanks for thinking along @nefelim4ag

Well, we have the baud: setting in the config, so I’d assume the user can override it.

My current interpretation would be:

  • baud setting is applicable for direct UART and for MCUs that use a USB2TTL chip (make menuconfig → Serial)
  • On native USB connections (make menuconfig → USB), the limit is the negotiated OS-level USB connection, but I have no clue on the achievable net data rate. I guess there is some overhead and packet limit, as with CAN. Likely, we can assume “full speed USB” today.
  • CAN depends on the set CAN frequency
  • USB2CAN Bridge seems tricky, but I’d guess:
    • Host ↔ Bridge: USB as above
    • Bridge ↔ Node: CAN frequency

Example MCU config line of a USB2CAN bridge from a klippy.log:
MCU 'mcu' config: ADC_MAX=4095 BUS_PINS_i2c1_PA9_PA10=PA9,PA10 BUS_PINS_i2c1_PB6_PB7=PB6,PB7 BUS_PINS_i2c1_PB8_PB9=PB8,PB9 BUS_PINS_i2c2_PB10_PB11=PB10,PB11 BUS_PINS_i2c2_PB13_PB14=PB13,PB14 BUS_PINS_i2c3_PB3_PB4=PB3,PB4 BUS_PINS_i2c3_PC0_PC1=PC0,PC1 BUS_PINS_spi1=PA6,PA7,PA5 BUS_PINS_spi1a=PB4,PB5,PB3 BUS_PINS_spi2=PB14,PB15,PB13 BUS_PINS_spi2_PB2_PB11_PB10=PB2,PB11,PB10 BUS_PINS_spi2a=PC2,PC3,PB10 BUS_PINS_spi3=PB4,PB5,PB3 CANBUS_BRIDGE=1 CLOCK_FREQ=64000000 MCU=stm32g0b1xx PWM_MAX=255 RECEIVE_WINDOW=192 RESERVE_PINS_CAN=PD0,PD1 RESERVE_PINS_USB=PA11,PA12 RESERVE_PINS_crystal=PF0,PF1 STATS_SUMSQ_BASE=256 STEPPER_BOTH_EDGE=1

Would it be valid to state that an MCU net data rate can be calculated depending on the configuration in the klippy.log:

  • RESERVE_PINS_serial: (baud_setting_in_mcu || 250_000) / 10
  • RESERVE_PINS_USB: 12_000_000 / TBD
  • CANBUS_FREQUENCY: CANBUS_FREQUENCY / (47 + 64) * 64 / 8

Whereas for USB’s TBD, I have no clue.

As far as I know, the most accurate way to determine the available bandwidth is to follow what the klippy/chelper/serialqueue.c code does. It looks at the constants reported by the MCU (eg, MCU 'mcu' config: ADC_MAX=4095 ...) and utilizes the SERIAL_BAUD or CANBUS_FREQUENCY settings. (Only one of those two values may be present, but it is possible neither is present.)

If SERIAL_BAUD is present then the available application bytes per second is SERIAL_BAUD / 10, and that bandwidth is available simultaneously in both directions. The printer.cfg mcu baud setting is not authoritative (that option determines how Klipper tells Linux to configure the port, but various hardware devices can ignore that setting - in contrast, the MCU reported value is authoritative).

If CANBUS_FREQUENCY is set, then the maximum available application bytes per second can be estimated as 8 * CANBUS_FREQUENCY / (64 + 47). However, this can only really be an estimate - traffic to other nodes, response traffic from nodes, and canbus hardware “bit stuffing” bits all reduce the available data that any one node can utilize. (If the MCU reports a value here, though, then it is an authoritative upper limit on available application bytes.)

If neither is set then we don’t really know the available bandwidth. This will occur for USB connections, CANBUS bridge USB connections, “linux mcu”, beaglebone “pru” connections, and similar. In this case, if one really wanted to get a rough estimate, one could look at the Benchmarks page of the Klipper documentation. Note that these benchmarks are reported in “commands per second”; to get to “application bytes per second” one would use benchmark_number / 59 * 64.
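
The three cases above could be sketched in Python as follows. This is only an illustration of the decision logic, not the code from serialqueue.c or from the Visualizer. `parse_mcu_config` and `estimate_bytes_per_second` are hypothetical names, and the sketch assumes the constants appear as space-separated `KEY=VALUE` tokens after `MCU '<name>' config:`.

```python
import re
from typing import Optional

# Assumed line layout: MCU '<name>' config: KEY=VALUE KEY=VALUE ...
CONFIG_LINE = re.compile(r"^MCU '(?P<name>[^']+)' config: (?P<body>.*)$")


def parse_mcu_config(line: str) -> Optional[tuple[str, dict[str, str]]]:
    """Return (mcu_name, constants) for an MCU config line, else None."""
    match = CONFIG_LINE.match(line.strip())
    if match is None:
        return None
    constants = {}
    for token in match.group("body").split():
        key, sep, value = token.partition("=")
        if sep:
            constants[key] = value
    return match.group("name"), constants


def estimate_bytes_per_second(constants: dict[str, str]) -> Optional[float]:
    """Application bytes per second, or None if the MCU reports neither value."""
    if "SERIAL_BAUD" in constants:
        return int(constants["SERIAL_BAUD"]) / 10
    if "CANBUS_FREQUENCY" in constants:
        return 8 * int(constants["CANBUS_FREQUENCY"]) / (64 + 47)
    return None  # USB, CAN bridge over USB, linux mcu, pru, ...


# Shortened config line of a USB2CAN bridge: neither constant is present
name, constants = parse_mcu_config(
    "MCU 'mcu' config: ADC_MAX=4095 CANBUS_BRIDGE=1 CLOCK_FREQ=64000000 MCU=stm32g0b1xx"
)
print(name, estimate_bytes_per_second(constants))
```

Returning `None` for the unknown case keeps the decision about a fallback (for example a benchmark-based figure) out of the parser.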

FWIW, for USB connections, I suspect graphing the bytes received relative to the theoretical USB bandwidth may not be that useful. I suspect the resulting graph would basically always show a line under ~5%, and it may be visually difficult to determine when a lot of data is being sent/received. If an unreasonably large amount of data was sent for some reason, almost certainly something would fail long before one got close to the benchmarked rates.

Maybe that helps a little,
-Kevin

Thanks a lot, Kevin. This surely helps my understanding, but as for a more accurate scaling of the graph, I feel like the line from the famous (at least in Germany) “Faust” (1808) by Johann Wolfgang von Goethe, which roughly translates to:

And here I stand, with all my lore,
Poor fool, no wiser than before.

:sweat_smile:

I started testing under Windows.
Firefox, Chrome, and Edge don’t work with your new Log Explorer feature. It looks like your program doesn’t react to my middle-click.

It took me a while to try the above under Ubuntu, since I had no Linux machine handy.

Now I do. Works very well and stable with every browser mentioned above.

I would suggest another mouse-click. Maybe something like left-click + Alt or something. Nothing involving the middle mouse button.

Very cool feature. I’m curious what will happen with the new suggested features. Thank you.

edit:

I tried with a bigger log

There is a big performance difference between Firefox and Chrome compared to Edge.

As you stated above

My impression is, Firefox and Chrome don’t differ much. Edge performs best!

edit again:

There is another thing with the zooming feature which is funny. But I have to go to bed and will get back to it.

Thanks for testing. Appreciated.

Unfortunately, I cannot reproduce this behavior. It works for me on 3 different computers with all browsers.

Do you maybe have some special mouse driver installed or you are overriding the middle mouse button somehow? Anything in the browser’s console?

Is anyone else experiencing this?

Resumption.

I tried a larger klippy.log

Pretty harsh with my very old i3 with 4GB :joy:, takes forever.

Bacheshatonee stated this error:

MCU ‘mcu’ shutdown: Missed scheduling of next digital out event
I see this at line #51705

I wanted to find that line using your new Log Explorer feature.
Impossible with this log. These were the next points around line 51705.

Assembly!

When I mark the next left marker, I get line 50461 to 50661.
When I mark the next right marker, I get line 50465 to 50665 (did that one twice).
When I mark the next right marker, I get line 50462 to 50662.

I don’t understand the result from the right marker.

When I switch from the zoomed “Log Explorer” tab back to the “Dashboard” tab, the zoom is back to “Autoscale”. If possible, I would avoid that.

Your program runs very stable, sometimes “I had the impression” it hangs, but it was my fault. Users just need to “train” a bit :wink:

I don’t have another Windows machine here. I’ll try tomorrow at work, if my admins allow “https://sineos.github.io” and get back.

Works! My machine has a Dell driver installed and I can’t get rid of it.