Getting started with SystemTap for Linux system profiling

Intro to SystemTap

SystemTap is a tracing and probing tool that allows users to study and monitor the activities of the operating system (particularly, the kernel) in fine detail. It provides information similar to the output of tools like netstat, ps, top, and iostat; however, SystemTap is designed to provide more filtering and analysis options for collected information.

SystemTap can be used by system administrators as a performance monitoring tool for Red Hat Enterprise Linux 5 or later. It is most useful when other similar tools cannot precisely pinpoint a bottleneck in the system, thus requiring a deep analysis of system activity. In the same manner, application developers can also use SystemTap to monitor, in fine detail, how their application behaves within the Linux system.

SystemTap was originally developed to provide functionality for Red Hat Enterprise Linux similar to previous Linux probing tools such as dprobes and the Linux Trace Toolkit. SystemTap aims to supplement the existing suite of Linux monitoring tools by providing users with the infrastructure to track kernel activity. In addition, SystemTap combines this capability with two attributes:

  • Flexibility: SystemTap’s framework allows users to develop simple scripts for investigating and monitoring a wide variety of kernel functions, system calls, and other events that occur in kernel space. With this, SystemTap is not so much a tool as it is a system that allows you to develop your own kernel-specific forensic and monitoring tools.
  • Ease of use: SystemTap allows users to probe kernel-space events without having to go through the lengthy process of instrumenting, recompiling, installing, and rebooting the kernel.

Understanding how SystemTap works

SystemTap allows users to write and reuse simple scripts to deeply examine the activities of a running Linux system. These scripts can be designed to extract data, filter it, and summarize it quickly (and safely), enabling the diagnosis of complex performance (or even functional) problems.

The essential idea behind a SystemTap script is to name events and to give them handlers. When SystemTap runs the script, it monitors for the event; once the event occurs, the Linux kernel runs the handler as a quick subroutine and then resumes its normal operation.

There are several kinds of events: entering or exiting a function, timer expiration, session termination, and so on. A handler is a series of script language statements that specify the work to be done whenever the event occurs. This work normally includes extracting data from the event context, storing it in internal variables, and printing results.
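The event kinds listed above map directly onto probe points. The following is a minimal sketch rather than a tuned profiling script: the file name events.stp, the 5-second interval, and the top-five limit are all illustrative assumptions. It pairs a function-entry event, a timer event, and the session-termination event with one handler each.

```shell
# Write a small SystemTap script; the name events.stp is an arbitrary choice.
cat > events.stp <<'EOF'
global reads                      # internal variable shared by all handlers

probe vfs.read {                  # event: entry into the VFS read function
    reads[execname()]++           # handler: extract the process name, count it
}

probe timer.s(5) {                # event: a 5-second timer expires
    exit()                        # handler: end the session
}

probe end {                       # event: session termination
    foreach (name in reads- limit 5)       # handler: top five by read count
        printf("%-16s %d\n", name, reads[name])
}
EOF
```

Running stap events.stp as root should collect for about five seconds and then print the busiest readers; the actual names and counts depend on what the system is doing at the time.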

Setting up SystemTap and its required kernel packages

To deploy SystemTap, install the SystemTap packages along with the matching -devel, -debuginfo, and -debuginfo-common-arch packages for the kernel. If the system has multiple kernels installed and you want to use SystemTap on more than one of them, install the -devel and -debuginfo packages for each of those kernel versions.

SystemTap needs information about the kernel in order to place instrumentation in it (probe it). This information, which allows SystemTap to generate the code for the instrumentation, is contained in the matching kernel-devel, kernel-debuginfo, and kernel-debuginfo-common-arch packages.
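Because the packages must match the running kernel exactly, it can help to derive their names from uname instead of typing them by hand. A minimal sketch follows; the helper name required_kernel_pkgs is hypothetical, and the installed/missing check assumes an RPM-based system.

```shell
# Print the kernel package names that match the running kernel.
# required_kernel_pkgs is a hypothetical helper, not part of SystemTap.
required_kernel_pkgs() {
    kver=$(uname -r)     # kernel release, e.g. 3.10.0-1160.el7.x86_64
    arch=$(uname -m)     # machine architecture, e.g. x86_64
    printf '%s\n' \
        "kernel-devel-$kver" \
        "kernel-debuginfo-$kver" \
        "kernel-debuginfo-common-$arch-$kver"
}

# Report which of them are installed (only meaningful where rpm exists).
if command -v rpm >/dev/null 2>&1; then
    for pkg in $(required_kernel_pkgs); do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "installed: $pkg"
        else
            echo "missing:   $pkg"
        fi
    done
fi
```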

To install SystemTap packages:

[root@host1 ~]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
[root@host1 ~]# uname -r
3.10.0-1160.el7.x86_64

[root@host1 ~]# yum install -y systemtap systemtap-runtime

[root@host1 ~]# rpm -qa | grep systemtap
systemtap-runtime-4.0-13.el7.x86_64
systemtap-devel-4.0-13.el7.x86_64
systemtap-4.0-13.el7.x86_64
systemtap-client-4.0-13.el7.x86_64

To install the devel and debuginfo packages on CentOS, first enable the debuginfo repository by setting enabled=1:

[root@host1 ~]# vim /etc/yum.repos.d/CentOS-Debuginfo.repo
[base-debuginfo]
name=CentOS-7 - Debuginfo
baseurl=http://debuginfo.centos.org/7/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-Debug-7
enabled=1

[root@host1 ~]# yum install -y kernel-devel-$(uname -r) \
> kernel-debuginfo-$(uname -r) \
> kernel-debuginfo-common-$(uname -m)-$(uname -r)

[root@host1 ~]# rpm -qa | grep debuginfo
kernel-debuginfo-common-x86_64-3.10.0-1160.el7.x86_64
kernel-debuginfo-3.10.0-1160.el7.x86_64

The kernel-devel package was not installed because the requested version is not available in the configured CentOS repositories. Adding a repository that carries it might fix this. SystemTap requires kernel-devel; without it, stap fails with the following error:

[root@host1 ~]# stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Checking "/lib/modules/3.10.0-1160.el7.x86_64/build/.config" failed with error: No such file or directory
Incorrect version or missing kernel-devel package, use: yum install kernel-devel-3.10.0-1160.el7.x86_64
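The error shows which path stap checks, so the same condition can be tested before running any script. This is a small sketch based only on the path in the message above; the variable name build_cfg is an illustrative choice.

```shell
# Path taken from the error message; stap expects the kernel build tree here.
build_cfg="/lib/modules/$(uname -r)/build/.config"

if [ -e "$build_cfg" ]; then
    echo "kernel build tree found: $build_cfg"
else
    echo "kernel build tree missing; install kernel-devel-$(uname -r)"
fi
```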

Here we simply download the RPM and install it directly:

[root@host1 ~]# wget https://rpmfind.net/linux/centos/7.9.2009/os/x86_64/Packages/kernel-devel-3.10.0-1160.el7.x86_64.rpm

[root@host1 ~]# rpm -ivh kernel-devel-3.10.0-1160.el7.x86_64.rpm

[root@host1 ~]# rpm -qa | grep kernel | grep  3.10.0
kernel-devel-3.10.0-1160.el7.x86_64
kernel-tools-libs-3.10.0-1160.el7.x86_64
kernel-tools-3.10.0-1160.el7.x86_64
kernel-debuginfo-common-x86_64-3.10.0-1160.el7.x86_64
kernel-debuginfo-3.10.0-1160.el7.x86_64
kernel-headers-3.10.0-1160.59.1.el7.x86_64
kernel-3.10.0-1160.el7.x86_64

To verify the SystemTap setup again:

[root@host1 ~]# stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Pass 1: parsed user script and 474 library scripts using 271960virt/69264res/3504shr/65852data kb, in 640usr/30sys/672real ms.
Pass 2: analyzed script: 1 probe, 1 function, 7 embeds, 0 globals using 439304virt/232180res/4884shr/233196data kb, in 2180usr/950sys/2977real ms.
Pass 3: translated to C into "/tmp/stap6kYO8U/stap_cc0f60b74db3020f09599659b9758c89_2771_src.c" using 439304virt/232436res/5140shr/233196data kb, in 10usr/50sys/67real ms.
Pass 4: compiled C into "stap_cc0f60b74db3020f09599659b9758c89_2771.ko" in 8040usr/1720sys/9477real ms.
Pass 5: starting run.
read performed
Pass 5: run completed in 30usr/90sys/442real ms.

SystemTap scripts

For the most part, SystemTap scripts are the foundation of each SystemTap session. SystemTap scripts instruct SystemTap on what type of information to collect, and what to do once that information is collected. SystemTap scripts are made up of two components: events and handlers. Once a SystemTap session is underway, SystemTap monitors the operating system for the specified events and executes the handlers as they occur.

SystemTap scripts allow instrumentation code to be inserted without recompiling the code being instrumented, and they allow more flexibility with regard to handlers. Events serve as the triggers for handlers to run; handlers can be written to record specified data and print it in a certain manner.

SystemTap scripts use the .stp file extension and contain probes written in the following format:

probe event {statements}

SystemTap allows you to write functions to factor out code used by several probes. Rather than repeating the same series of statements in multiple probes, you can place the instructions in a function, as in:

function function_name(arguments){statements}
probe event {function_name(arguments)}

The statements in function_name are executed when the probe for event executes. The arguments are optional values passed into the function.
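As a concrete instance of that pattern, the sketch below shares one function between two probes. The file name function_demo.stp and the function name report are hypothetical, and the one-second timer is there only to keep the output short.

```shell
# Write a script in which two probes call the same helper function.
cat > function_demo.stp <<'EOF'
# Helper shared by both probes; "what" is inferred to be a string.
function report(what) {
    printf("%s by %s (pid %d)\n", what, execname(), pid())
}

probe vfs.read  { report("read")  }
probe vfs.write { report("write") }

probe timer.s(1) { exit() }       # stop after one second to limit output
EOF
```

Running stap function_demo.stp as root should print one line per read or write seen during that second, so expect a lot of output on a busy system.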

Running SystemTap Scripts

SystemTap scripts are run through the command stap. stap can run SystemTap scripts from the standard input or from a file.

When verifying the installation in the previous section, we passed the script directly on the command line with the -e option.
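To read a script from standard input itself, stap accepts - in place of a file name. The snippet below is a minimal sketch: the one-line script and the variable name script are illustrative, and the guard is there only so that the snippet does nothing harmful on a machine without SystemTap.

```shell
# A one-line script that prints a message and ends the session.
script='probe begin { println("hello from stdin"); exit() }'

if command -v stap >/dev/null 2>&1; then
    # "-" tells stap to read the script from standard input.
    printf '%s\n' "$script" | stap -
else
    echo "stap not found; install the systemtap package first" >&2
fi
```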

We can also run it from a file as below.

[root@host1 ~]# cat runfromfile.stp
probe vfs.read {
    printf("read performed\n");
    exit()
}

[root@host1 ~]# stap runfromfile.stp
read performed

At this point, we know what SystemTap is and how to deploy it. We will explore more meaningful uses of it in future posts.
