Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix tools/ quiet build Makefile infrastructure that was broken when
working on tools/perf/ without testing on other tools/ living
utilities.
* tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools: Remove redundant quiet setup
tools: Unify top-level quiet infrastructure
Pull rv and tools/rtla updates from Steven Rostedt:
- Add a test suite to test the tool
Add a small test suite that can be used to test rtla's basic features
to at least have something to test when applying changes.
- Automate manual steps in monitor creation
While creating a new monitor in RV, besides generating code from
dot2k, there are a few manual steps which can be tedious and error
prone, like adding the tracepoints, makefile lines and kconfig, or
selecting events that start the monitor in the initial state.
Updates were made to try and automate as much as possible among those
steps to make creating a new RV monitor much quicker. It is still
requires to select proper tracepoints, this step is harder to
automate in a general way and, in several cases, would still need
user intervention.
- Have rtla timerlat hist and top set OSNOISE_WORKLOAD flag
Have both rtla-timerlat-hist and rtla-timerlat-top set
OSNOISE_WORKLOAD to the proper value ("on" when running with -k,
"off" when running with -u) every time the option is available
instead of setting it only when running with -u.
This prevents rtla timerlat -k from giving no results when
NO_OSNOISE_WORKLOAD is set, either manually or by an abnormally
exited earlier run of rtla timerlat -u.
- Stop rtla timerlat on signal properly when overloaded
There is an issue where if rtla is run on machines with a high number
of CPUs (100+), timerlat can generate more samples than rtla is able
to process via tracefs_iterate_raw_events. This is especially common
when the interval is set to 100us (rteval and cyclictest default) as
opposed to the rtla default of 1000us, but also happens with the rtla
default.
Currently, this leads to rtla hanging and having to be terminated
with SIGTERM. SIGINT setting stop_tracing is not enough, since more
and more events are coming and tracefs_iterate_raw_events never
exits.
To fix this: Stop the timerlat tracer on SIGINT/SIGALRM to ensure no
more events are generated when rtla is supposed to exit.
Also on receiving SIGINT/SIGALRM twice, abort iteration immediately
with tracefs_iterate_stop, making rtla exit right away instead of
waiting for all events to be processed.
- Account for missed events
Due to tracefs buffer overflow, it can happen that rtla misses
events, making the tracing results inaccurate.
Count both the number of missed events and the total number of
processed events, and display missed events as well as their
percentage. The numbers are displayed for both osnoise and timerlat,
even though for the earlier, missed events are generally not
expected.
For hist, the number is displayed at the end of the run; for top, it
is displayed on each printing of the top table.
- Changes to make osnoise more robust
There was a dependency in the code that the first field of the
osnoise_tool structure was the trace field. If that that ever
changed, then the code work break. Change the code to encapsulate
this dependency where the code that uses the structure does not have
this dependency.
* tag 'trace-tools-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
rtla: Report missed event count
rtla: Add function to report missed events
rtla: Count all processed events
rtla: Count missed trace events
tools/rtla: Add osnoise_trace_is_off()
rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads
rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads
rtla/osnoise: Distinguish missing workload option
rtla/timerlat_top: Abort event processing on second signal
rtla/timerlat_hist: Abort event processing on second signal
rtla/timerlat_top: Stop timerlat tracer on signal
rtla/timerlat_hist: Stop timerlat tracer on signal
rtla: Add trace_instance_stop
tools/rtla: Add basic test suite
verification/dot2k: Implement event type detection
verification/dot2k: Auto patch current kernel source
verification/dot2k: Simplify manual steps in monitor creation
rv: Simplify manual steps in monitor creation
verification/dot2k: Add support for name and description options
verification/dot2k: More robust template variables
...
Add osnoise_report_missed_events to be used to report the number
of missed events either during or after an osnoise or timerlat run.
Also, display the percentage of missed events compared to the total
number of received events.
If an unknown number of missed events was reported during the run, the
entire number of missed events is reported as unknown.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250123142339.990300-4-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add function collect_missed_events to trace.c to act as a callback for
tracefs_follow_missed_events, summing the number of total missed events
into a new field missing_events of struct trace_instance.
In case record->missed_events is negative, trace->missed_events is set
to UINT64_MAX to signify an unknown number of events was missed.
The callback is activated on initialization of the trace instance.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250123142339.990300-2-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
All of the users of trace_is_off() passes in &record->trace as the second
parameter, where record is a pointer to a struct osnoise_tool. This record
could be NULL and there is a hidden dependency that the trace field is the
first field to allow &record->trace to work with a NULL record pointer.
In order to make this code a bit more robust, as record shouldn't be
dereferenced if it is NULL, even if the code does work, create a new
function called osnoise_trace_is_off() that takes the pointer to a
struct osnoise_tool as its second parameter. This way it can properly test
if it is NULL before it dereferences it.
The old function trace_is_off() is removed and the function
osnoise_trace_is_off() is added into osnoise.c which is what the
struct osnoise_tool is associated with.
Cc: John Kacur <jkacur@redhat.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Cc: Eder Zulian <ezulian@redhat.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250115180055.2136815-1-costa.shul@redhat.com
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When using rtla timerlat with userspace threads (-u or -U), rtla
disables the OSNOISE_WORKLOAD option in
/sys/kernel/tracing/osnoise/options. This option is not re-enabled in a
subsequent run with kernel-space threads, leading to rtla collecting no
results if the previous run exited abnormally:
$ rtla timerlat top -u
^\Quit (core dumped)
$ rtla timerlat top -k -d 1s
Timer Latency
0 00:00:01 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
The issue persists until OSNOISE_WORKLOAD is set manually by running:
$ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if
available to fix the issue.
Cc: stable@vger.kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250107144823.239782-4-tglozar@redhat.com
Fixes: cdca4f4e5e ("rtla/timerlat_top: Add timerlat user-space support")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When using rtla timerlat with userspace threads (-u or -U), rtla
disables the OSNOISE_WORKLOAD option in
/sys/kernel/tracing/osnoise/options. This option is not re-enabled in a
subsequent run with kernel-space threads, leading to rtla collecting no
results if the previous run exited abnormally:
$ rtla timerlat hist -u
^\Quit (core dumped)
$ rtla timerlat hist -k -d 1s
Index
over:
count:
min:
avg:
max:
ALL: IRQ Thr Usr
count: 0 0 0
min: - - -
avg: - - -
max: - - -
The issue persists until OSNOISE_WORKLOAD is set manually by running:
$ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if
available to fix the issue.
Cc: stable@vger.kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20250107144823.239782-3-tglozar@redhat.com
Fixes: ed774f7481 ("rtla/timerlat_hist: Add timerlat user-space support")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
If either SIGINT is received twice, or after a SIGALRM (that is, after
timerlat was supposed to stop), abort processing events currently left
in the tracefs buffer and exit immediately.
This allows the user to exit rtla without waiting for processing all
events, should that take longer than wanted, at the cost of not
processing all samples.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250116144931.649593-6-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
If either SIGINT is received twice, or after a SIGALRM (that is, after
timerlat was supposed to stop), abort processing events currently left
in the tracefs buffer and exit immediately.
This allows the user to exit rtla without waiting for processing all
events, should that take longer than wanted, at the cost of not
processing all samples.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250116144931.649593-5-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, when either SIGINT from the user or SIGALRM from the duration
timer is caught by rtla-timerlat, stop_tracing is set to break out of
the main loop. This is not sufficient for cases where the timerlat
tracer is producing more data than rtla can consume, since in that case,
rtla is looping indefinitely inside tracefs_iterate_raw_events, never
reaches the check of stop_tracing and hangs.
In addition to setting stop_tracing, also stop the timerlat tracer on
received signal (SIGINT or SIGALRM). This will stop new samples so that
the existing samples may be processed and tracefs_iterate_raw_events
eventually exits.
Cc: stable@vger.kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250116144931.649593-4-tglozar@redhat.com
Fixes: a828cd18bc ("rtla: Add timerlat tool and timelart top mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, when either SIGINT from the user or SIGALRM from the duration
timer is caught by rtla-timerlat, stop_tracing is set to break out of
the main loop. This is not sufficient for cases where the timerlat
tracer is producing more data than rtla can consume, since in that case,
rtla is looping indefinitely inside tracefs_iterate_raw_events, never
reaches the check of stop_tracing and hangs.
In addition to setting stop_tracing, also stop the timerlat tracer on
received signal (SIGINT or SIGALRM). This will stop new samples so that
the existing samples may be processed and tracefs_iterate_raw_events
eventually exits.
Cc: stable@vger.kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250116144931.649593-3-tglozar@redhat.com
Fixes: 1eeb6328e8 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
rtla timerlat hist currently computers the minimum, maximum and average
latency even in cases when there are zero samples. This leads to
nonsensical values being calculated for maximum and minimum, and to
divide by zero for average.
A similar bug is fixed by 01b05fc0e5 ("rtla/timerlat: Fix histogram
report when a cpu count is 0") but the bug still remains for printing
the sum over all CPUs in timerlat_print_stats_all.
The issue can be reproduced with this command:
$ rtla timerlat hist -U -d 1s
Index
over:
count:
min:
avg:
max:
Floating point exception (core dumped)
(There are always no samples with -U unless the user workload is
created.)
Fix the bug by omitting max/min/avg when sample count is zero,
displaying a dash instead, just like we already do for the individual
CPUs. The logic is moved into a new function called
format_summary_value, which is used for both the individual CPUs
and for the overall summary.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20241127134130.51171-1-tglozar@redhat.com
Fixes: 1462501c7a ("rtla/timerlat: Add a summary for hist mode")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull tracing tools updates from Steven Rostedt:
- Add ':' to getopt option 'trace-buffer-size' in timerlat_hist for
consistency
- Remove unused sched_getattr define
- Rename sched_setattr() helper to syscall_sched_setattr() to avoid
conflicts
- Update counters to long from int to avoid overflow
- Add libcpupower dependency detection
- Add --deepest-idle-state to timerlat to limit deep idle sleeps
- Other minor clean ups and documentation changes
* tag 'trace-tools-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
verification/dot2: Improve dot parser robustness
tools/rtla: Improve exception handling in timerlat_load.py
tools/rtla: Enhance argument parsing in timerlat_load.py
tools/rtla: Improve code readability in timerlat_load.py
rtla/timerlat: Do not set params->user_workload with -U
rtla: Documentation: Mention --deepest-idle-state
rtla/timerlat: Add --deepest-idle-state for hist
rtla/timerlat: Add --deepest-idle-state for top
rtla/utils: Add idle state disabling via libcpupower
rtla: Add optional dependency on libcpupower
tools/build: Add libcpupower dependency detection
rtla/timerlat: Make timerlat_hist_cpu->*_count unsigned long long
rtla/timerlat: Make timerlat_top_cpu->*_count unsigned long long
tools/rtla: fix collision with glibc sched_attr/sched_set_attr
tools/rtla: drop __NR_sched_getattr
rtla: Fix consistency in getopt_long for timerlat_hist
rv: Fix a typo
tools/rv: Correct the grammatical errors in the comments
tools/rv: Correct the grammatical errors in the comments
rtla: use the definition for stdout fd when calling isatty()
The enhancements made to timerlat_load.py are intended to improve the script's exception handling.
Summary of the changes:
- Specific exceptions are now caught for CPU affinity and priority
settings, with clearer error messages provided.
- The timerlat file descriptor opening now includes handling for
PermissionError and OSError, with informative messages.
- In the infinite loop, generic exceptions have been replaced with
specific types like KeyboardInterrupt and IOError, improving feedback.
Before:
$ sudo python timerlat_load.py 122
Error setting affinity
After:
$ sudo python timerlat_load.py 122
Error setting affinity: [Errno 22] Invalid argument
Before:
$ sudo python timerlat_load.py 1 -p 950
Error setting priority
After:
$ sudo python timerlat_load.py 1 -p 950
Error setting priority: [Errno 22] Invalid argument
Before:
$ python timerlat_load.py 1
Error opening timerlat fd, did you run timerlat -U?
After:
$ python timerlat_load.py 1
Permission denied. Please check your access rights.
Cc: "lgoncalv@redhat.com" <lgoncalv@redhat.com>
Cc: "jkacur@redhat.com" <jkacur@redhat.com>
Link: https://lore.kernel.org/Q_k1s4hBtUy2px8ou0QKenjEK2_T_LoV8IxAE79aBakBogb-7uHp2fpET3oWtI1t3dy8uKjWeRzQOdKNzIzOOpyM4OjutJOriZ9TrGY6b-g=@protonmail.com
Signed-off-by: Furkan Onder <furkanonder@protonmail.com>
Reviewed-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Since commit fb9e90a67e ("rtla/timerlat: Make user-space threads
the default"), rtla-timerlat has been defaulting to
params->user_workload if neither that or params->kernel_workload is set.
This has unintentionally made -U, which sets only params->user_hist/top
but not params->user_workload, to behave like -u unless -k is set,
preventing the user from running a custom workload.
Example:
$ rtla timerlat hist -U -c 0 &
[1] 7413
$ python sample/timerlat_load.py 0
Error opening timerlat fd, did you run timerlat -U?
$ ps | grep timerlatu
7415 pts/4 00:00:00 timerlatu/0
Fix the issue by checking for params->user_top/hist instead of
params->user_workload when setting default thread mode.
Link: https://lore.kernel.org/20241021123140.14652-1-tglozar@redhat.com
Fixes: fb9e90a67e ("rtla/timerlat: Make user-space threads the default")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Most fields of struct timerlat_top_cpu are unsigned long long, but the
fields {irq,thread,user}_count are int (32-bit signed).
This leads to overflow when tracing on a large number of CPUs for a long
enough time:
$ rtla timerlat top -a20 -c 1-127 -d 12h
...
0 12:00:00 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
1 #43200096 | 0 0 1 2 | 3 2 6 12
...
127 #43200096 | 0 0 1 2 | 3 2 5 11
ALL #119144 e4 | 0 5 4 | 2 28 16
The average latency should be 0-1 for IRQ and 5-6 for thread, but is
reported as 5 and 28, about 4 to 5 times more, due to the count
overflowing when summed over all CPUs: 43200096 * 127 = 5486412192,
however, 1191444898 (= 5486412192 mod MAX_INT) is reported instead, as
seen on the last line of the output, and the averages are thus ~4.6
times higher than they should be (5486412192 / 1191444898 = ~4.6).
Fix the issue by changing {irq,thread,user}_count fields to unsigned
long long, similarly to other fields in struct timerlat_top_cpu and to
the count variable in timerlat_top_print_sum.
Link: https://lore.kernel.org/20241011121015.2868751-1-tglozar@redhat.com
Reported-by: Attila Fazekas <afazekas@redhat.com>
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
glibc commit 21571ca0d703 ("Linux: Add the sched_setattr
and sched_getattr functions") now also provides 'struct sched_attr'
and sched_setattr() which collide with the ones from rtla.
In file included from src/trace.c:11:
src/utils.h:49:8: error: redefinition of ‘struct sched_attr’
49 | struct sched_attr {
| ^~~~~~~~~~
In file included from /usr/include/bits/sched.h:60,
from /usr/include/sched.h:43,
from /usr/include/tracefs/tracefs.h:10,
from src/trace.c:4:
/usr/include/linux/sched/types.h:98:8: note: originally defined here
98 | struct sched_attr {
| ^~~~~~~~~~
Define 'struct sched_attr' conditionally, similar to what strace did:
https://lore.kernel.org/all/20240930222913.3981407-1-raj.khem@gmail.com/
and rename rtla's version of sched_setattr() to avoid collision.
Link: https://lore.kernel.org/8088f66a7a57c1b209cd8ae0ae7c336a7f8c930d.1728572865.git.jstancek@redhat.com
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Commit e9a4062e15 ("rtla: Add --trace-buffer-size option") adds a new
long option to rtla utilities, but among all affected files,
timerlat_hist misses a trailing `:` in the corresponding short option
inside the getopt string (e.g. `\3:`). This patch propagates the `:`.
Although this change is not functionally required, it improves
consistency and slightly reduces the likelihood a future change would
introduce a problem.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/20240926143417.54039-1-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull perf tools fixes from Namhyung Kim:
"Two fixes for building perf and other tools:
- Fix breakage in tracing tools due to pkg-config for
libtrace{event,fs}
- Fix build of perf when libunwind is used"
* tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dso: Fix build when libunwind is enabled
tools/latency: Use pkg-config in lib_setup of Makefile.config
tools/rtla: Use pkg-config in lib_setup of Makefile.config
tools/verification: Use pkg-config in lib_setup of Makefile.config
tools: Make pkg-config dependency checks usable by other tools
perf build: Warn if libtracefs is not found
When osnoise hist does not observe any samples above the threshold,
no entries are recorded and the final report shows empty entries
for the usual statistics (count, min, max, avg):
[~]# osnoise hist -d 5s -T 500
# RTLA osnoise histogram
# Time unit is microseconds (us)
# Duration: 0 00:00:05
Index
over:
count:
min:
avg:
max:
That could lead users to confusing interpretations of the results.
A simple solution is to report 0 for count and the statistics, making it
clear that no noise (above the defined threshold) was observed:
[~]# osnoise hist -d 5s -T 500
# RTLA osnoise histogram
# Time unit is microseconds (us)
# Duration: 0 00:00:05
Index
over: 0
count: 0
min: 0
avg: 0
max: 0
Link: https://lkml.kernel.org/r/Zml6JmH5cbS7-HfZ@uudg.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Reviewed-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
osnoise top performs background/font color formatting that could make
the text output confusing if not on a terminal. Use the changes from
commit f5c0cdad66 ("rtla/timerlat: Use pretty formatting only on
interactive tty") as an inspiration to fix this problem.
Apply the formatting only if running on a tty, and not in quiet mode.
Link: https://lkml.kernel.org/r/Zmb-yP_3EDHliI8Z@uudg.org
Suggested-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Reviewed-by: John Kacur <jkacur@redhat.com>
Reviewed-by: Clark Williams <williams@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Fix the following -Wformat-security compile warnings adding missing
format arguments:
latency-collector.c: In function ‘show_available’:
latency-collector.c:938:17: warning: format not a string literal and
no format arguments [-Wformat-security]
938 | warnx(no_tracer_msg);
| ^~~~~
latency-collector.c:943:17: warning: format not a string literal and
no format arguments [-Wformat-security]
943 | warnx(no_latency_tr_msg);
| ^~~~~
latency-collector.c: In function ‘find_default_tracer’:
latency-collector.c:986:25: warning: format not a string literal and
no format arguments [-Wformat-security]
986 | errx(EXIT_FAILURE, no_tracer_msg);
|
^~~~
latency-collector.c: In function ‘scan_arguments’:
latency-collector.c:1881:33: warning: format not a string literal and
no format arguments [-Wformat-security]
1881 | errx(EXIT_FAILURE, no_tracer_msg);
| ^~~~
Link: https://lore.kernel.org/linux-trace-kernel/20240404011009.32945-1-skhan@linuxfoundation.org
Cc: stable@vger.kernel.org
Fixes: e23db805da ("tracing/tools: Add the latency-collector to tools directory")
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The -t option has an optional argument.
The usual case is for a short option to be specified without an '='
and for the long version to be specified with an '='
Various forms of this do not work as expected.
For example:
rtla timerlat hist -T50 -tfile.txt
will result in a truncated file name of "ile.txt"
Another example is that the long form without the '=' will result in the
default file name instead of the requested file name.
This patch properly parses the optional argument with and without '='
and with and without spaces for the short form.
This patch was also tested using -t and --trace without providing a file
name both as the last requested option and with a following long and
short option.
For example:
rtla timerlat hist -T50 -t -u
rtla timerlat hist -T50 --trace -u
This fix is applied to both timerlat top and hist
and to osnoise top and hist.
Here is the full testing for rtla timerlat hist.
Before applying the patch
rtla timerlat hist -T50 -t=file.txt
Works as expected, "file.txt"
rtla timerlat hist -T50 -tfile.txt
Truncated file name "ile.txt"
rtla timerlat hist -T50 -t file.txt
Default file name instead of file.txt
rtla timerlat hist -T50 --trace=file.txt
Truncated file name "ile.txt"
rtla timerlat hist -T50 --trace file.txt
Default file name "timerlat_trace.txt" instead of "file.txt"
After applying the patch:
rtla timerlat hist -T50 -t=file.txt
Works as expected, "file.txt"
rtla timerlat hist -T50 -tfile.txt
Works as expected, "file.txt"
rtla timerlat hist -T50 -t file.txt
Works as expected, "file.txt"
rtla timerlat hist -T50 --trace=file.txt
Works as expected, "file.txt"
rtla timerlat hist -T50 --trace file.txt
Works as expected, "file.txt"
In addition the following tests were performed to make sure that
the default file name worked as expected including with trailing
options.
rtla timerlat hist -T50 -t
Works as expected "timerlat_trace.txt"
rtla timerlat hist -T50 --trace
Works as expected "timerlat_trace.txt"
rtla timerlat hist -T50 -t -u
Works as expected "timerlat_trace.txt"
rtla timerlat hist -T50 --trace -u
Works as expected "timerlat_trace.txt"
Link: https://lkml.kernel.org/r/20240515183024.59985-1-jkacur@redhat.com
Cc: Daniel Bristot de Oliveria <bristot@kernel.org>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
On short runs it is possible to get no samples on a cpu, like this:
# rtla timerlat hist -u -T50
Index IRQ-001 Thr-001 Usr-001 IRQ-002 Thr-002 Usr-002
2 1 0 0 0 0 0
33 0 1 0 0 0 0
36 0 0 1 0 0 0
49 0 0 0 1 0 0
52 0 0 0 0 1 0
over: 0 0 0 0 0 0
count: 1 1 1 1 1 0
min: 2 33 36 49 52 18446744073709551615
avg: 2 33 36 49 52 -
max: 2 33 36 49 52 0
rtla timerlat hit stop tracing
IRQ handler delay: (exit from idle) 48.21 us (91.09 %)
IRQ latency: 49.11 us
Timerlat IRQ duration: 2.17 us (4.09 %)
Blocking thread: 1.01 us (1.90 %)
swapper/2:0 1.01 us
------------------------------------------------------------------------
Thread latency: 52.93 us (100%)
Max timerlat IRQ latency from idle: 49.11 us in cpu 2
Note, the value 18446744073709551615 is the same as ~0.
Fix this by reporting no results for the min, avg and max if the count
is 0.
Link: https://lkml.kernel.org/r/20240510190318.44295-1-jkacur@redhat.com
Cc: stable@vger.kernel.org
Fixes: 1eeb6328e8 ("rtla/timerlat: Add timerlat hist mode")
Suggested-by: Daniel Bristot de Oliveria <bristot@kernel.org>
Signed-off-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Add the option allow the users to set a different buffer size for the
trace. For example, in large systems, the user might be interested on
reducing the trace buffer to avoid large tracing files.
The buffer size is specified in kB, and it is only affecting
the tracing instance.
The function trace_set_buffer_size() appears on libtracefs v1.6,
so increase the minimum required version on Makefile.config.
Link: https://lkml.kernel.org/r/e7c9ca5b3865f28e131a49ec3b984fadf2d056c6.1715860611.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
While the per-cpu values are the results to take into consideration, the
overall system values are also useful.
Add a summary at the bottom of rtla timerlat top showing the overall
results. For instance:
Timer Latency
0 00:00:10 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
0 #10003 | 113 19 150 441 | 134 35 170 459
1 #10003 | 63 8 99 462 | 84 15 119 481
2 #10003 | 3 2 89 396 | 21 8 108 414
3 #10002 | 206 11 210 394 | 223 21 228 415
---------------|----------------------------------------|---------------------------------------
ALL #40011 e0 | 2 137 462 | 8 156 481
Link: https://lkml.kernel.org/r/5eb510d6faeb4ce745e09395196752df75a2dd1a.1713968967.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Suggested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>