How do I use the perf tool when Stress-ng is running in Docker?

Question

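I am using the Stress-ng Docker image from https://hub.docker.com/r/polinux/stress-ng/dockerfile to put load on my system. I want to use the perf tool to monitor metrics.

perf stat -- stress-ng --cpu 2 --timeout 10 runs stress-ng for 10 seconds and returns performance metrics. I tried to do the same with the Docker image using perf stat -- docker run -ti --rm polinux/stress-ng --cpu 2 --timeout 10. This returns metrics, but they are not the metrics of stress-ng.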

The output I get when using 'perf stat' on stress-ng:

Performance counter stats for 'stress-ng --cpu 2 --timeout 10':

  19975.863889      task-clock (msec)         #    1.992 CPUs utilized          
         2,057      context-switches          #    0.103 K/sec                  
             7      cpu-migrations            #    0.000 K/sec                  
         8,783      page-faults               #    0.440 K/sec                  
52,568,560,651      cycles                    #    2.632 GHz                    
89,424,109,426      instructions              #    1.70  insn per cycle         
17,496,929,762      branches                  #  875.904 M/sec                  
    97,910,697      branch-misses             #    0.56% of all branches        

  10.025825765 seconds time elapsed

The output I get when using the perf tool on the Docker image:

Performance counter stats for 'docker run -ti --rm polinux/stress-ng --cpu 2 --timeout 10':

    154.613610      task-clock (msec)         #    0.014 CPUs utilized          
           858      context-switches          #    0.006 M/sec                  
           113      cpu-migrations            #    0.731 K/sec                  
         4,989      page-faults               #    0.032 M/sec                  
   252,242,504      cycles                    #    1.631 GHz                    
   375,927,959      instructions              #    1.49  insn per cycle         
    84,847,109      branches                  #  548.769 M/sec                  
     1,127,634      branch-misses             #    1.33% of all branches        

  10.704752134 seconds time elapsed

Can someone help me get the stress-ng metrics when it is run with Docker?

Answer 1

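Going by the comments from @osgx:

As mentioned here, by default the perf stat command monitors not only all threads of the target process, but also its child processes and their threads.

The problem in this case is that by running perf stat on the docker run stress-ng command, you are not monitoring the actual stress-ng process. It is important to note that processes running inside a container are not started by the docker client but by the docker-containerd-shim process (which is a grandchild of the dockerd process).

This becomes evident if you run the docker command that starts stress-ng inside the container and then look at the process tree.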

docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 100

ps -elf | grep docker

0 S ubuntu    26379 114001  0  80   0 - 119787 futex_ 12:33 pts/3   00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root      26431 118477  0  80   0 -  2227 -      12:33 ?        00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/72a8c2787390669ff4eeae6f343ab4f9f60434f39aae66b1a778e78b7e5e45d8 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
0 S ubuntu    26610  26592  0  80   0 -  3236 pipe_w 12:34 pts/6    00:00:00 grep --color=auto docker
4 S root     118453      1  3  80   0 - 283916 -     May02 ?        01:01:57 /usr/bin/dockerd -H fd://
4 S root     118477 118453  4  80   0 - 457853 -     May02 ?        01:14:36 docker-containerd --config /var/run/docker/containerd/containerd.toml

----------------------------------------------------------------------

ps -elf | grep stress-ng

0 S ubuntu    26379 114001  0  80   0 - 119787 futex_ 12:33 pts/3   00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root      26455  26431  0  80   0 - 16621 -      12:33 pts/0    00:00:00 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root      26517  26455 99  80   0 - 16781 -      12:33 pts/0    00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root      26518  26455 99  80   0 - 16781 -      12:33 pts/0    00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
0 S ubuntu    26645  26592  0  80   0 -  3236 pipe_w 12:35 pts/6    00:00:00 grep --color=auto stress-ng

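The PPID of the first stress-ng process is 26431, which is not the docker run command but the docker-containerd-shim process. Monitoring the docker run command will never reflect the correct values, because the docker client is completely detached from the process that starts stress-ng.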
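The parent chain can also be checked directly instead of reading it out of the ps listing. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper name ancestors is hypothetical and not part of docker or perf. It walks from a given PID up through its parents and prints each PID with its process name.

```shell
#!/bin/sh
# Hypothetical helper: print "PID NAME" for a process and each of its
# ancestors, reading the Name and PPid fields from /proc/<pid>/status.
ancestors() {
    pid=$1
    # Stop once the parent PID is 0 or the process is no longer readable.
    while [ "$pid" -gt 0 ] 2>/dev/null && [ -r "/proc/$pid/status" ]; do
        name=$(sed -n 's/^Name:[[:space:]]*//p' "/proc/$pid/status")
        echo "$pid $name"
        pid=$(sed -n 's/^PPid:[[:space:]]*//p' "/proc/$pid/status")
    done
}
```

Called on the host as ancestors 26455 (the first stress-ng PID in the listing above), the chain should pass through the shim and the containerd and dockerd daemons rather than through the docker run client.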

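One way to solve this is to attach perf stat to the PIDs of the stress-ng processes started by the Docker runtime.

For example, in the case above, you can do this as soon as the docker run command has started: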

perf stat -p 26455,26517,26518

 Performance counter stats for process id '26455,26517,26518':

     148171.516145      task-clock (msec)         #    1.939 CPUs utilized          
                49      context-switches          #    0.000 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                67      page-faults               #    0.000 K/sec

You may want to increase --timeout a little so that the command runs longer, since you are now starting perf stat after stress-ng has already been launched.
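Copying the PIDs by hand is easy to get wrong when the run is short. Here is a minimal sketch of automating that step, assuming pgrep is available on the host and that the stress-ng processes have names starting with stress-ng. The function names join_pids and perf_stat_stress_ng are hypothetical.

```shell
#!/bin/sh
# Join newline-separated PIDs from stdin into the comma-separated
# list that `perf stat -p` takes.
join_pids() {
    paste -sd, -
}

# Hypothetical helper: attach perf stat to every running process whose
# name matches stress-ng. The container processes are owned by root,
# so this will likely need to be run as root.
perf_stat_stress_ng() {
    pids=$(pgrep stress-ng | join_pids)
    if [ -z "$pids" ]; then
        echo "no stress-ng process found" >&2
        return 1
    fi
    perf stat -p "$pids"
}
```

Run perf_stat_stress_ng in a second terminal once the container is up. Note that it picks up every stress-ng process on the host, not only those of one container.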

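Another way is to run perf stat inside the Docker container, but for that you have to give the container extra privileges, because by default the perf_event_open system call is blacklisted in Docker. You can read about this in the answer here.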
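As a rough sketch of that second approach: the polinux/stress-ng image is not known to ship perf, so this assumes a hypothetical image of your own that contains both perf and stress-ng. The --privileged flag is used here as the bluntest way to grant the needed access; a narrower set of capabilities may be enough on your Docker version. The function only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: print a docker command that runs perf stat
# around stress-ng inside the container. The image name passed in is
# an assumption: it must contain both perf and stress-ng.
perf_in_container_cmd() {
    image=$1
    echo "docker run --rm --privileged $image perf stat -- stress-ng --cpu 2 --timeout 10"
}

# Review the printed command, then pipe it to sh to run it, e.g.:
#   perf_in_container_cmd my-perf-stress-image | sh
```

Because perf stat now launches stress-ng itself inside the container, the counters cover the whole run and no PID lookup is needed.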