
perf report about system call

I have the following output of perf report (about malloc) for processes A and B:

recorded by: perf record -e cycles:u

Process A:

0.00%       1833448  Test-Recv  libc-2.17.so           [.] malloc              
0.00%       1833385  Test-Recv  [kernel.kallsyms]      [k] system_call         
0.00%        916588  Test-Recv  libc-2.17.so           [.] _int_malloc

and also the following for Process B:

24.90%   10855848444  test.exe  libc-2.17.so   [.] _int_malloc
15.78%    6881565672  test.exe  libc-2.17.so   [.] _int_free
 7.48%    3261672221  test.exe  libc-2.17.so   [.] malloc
 4.66%    2030332370  test.exe  libc-2.17.so   [.] systrim.isra.2
 2.43%    1061251259  test.exe  libc-2.17.so   [.] free
 2.12%     925588492  test.exe  test.exe       [.] main

Both of them do some malloc in their source code.

May I assume that in Process A's case malloc does trigger a system call, but that in Process B's case no system call happened, since in Process B's perf report there is no [k] system_call entry at all?

You don't get all the functions called by a program when using sampling; you get some of the functions called, the ones where the event is sampled the most. For "cycles:u" you'll get the "hottest" functions in user space (no kernel functions).

Consider using tracing instead of sampling, something like 'perf trace workload'. Consider using backtraces with it; for instance, looking at the backtraces for the 'brk' syscall that 'ls' does, we get:



# perf trace -e brk --call-graph dwarf ls
   0.933 (0.009 ms): ls brk(brk: 0x5577c4683000) = 0x5577c4683000
                     __brk (/usr/lib64/libc-2.26.so)
                     __GI___sbrk (inlined)
                     __GI___default_morecore (inlined)
                     sysmalloc (/usr/lib64/libc-2.26.so)
                     _int_malloc (/usr/lib64/libc-2.26.so)
                     tcache_init.part.5 (/usr/lib64/libc-2.26.so)
                     __GI___libc_malloc (inlined)
                     __GI___strdup (inlined)
                     [0xffff80aa65b9ae49] (/usr/lib64/libselinux.so.1)
                     [0xffff80aa65b9af0e] (/usr/lib64/libselinux.so.1)
                     call_init.part.0 (/usr/lib64/ld-2.26.so)
                     _dl_init (/usr/lib64/ld-2.26.so)
                     _dl_start_user (/usr/lib64/ld-2.26.so)

That shows that the syscall was called in this case, in response to a strdup() that called malloc(), which ended up asking the kernel for more memory via the 'brk' call.

Play with 'perf trace' some more and you'll discover statistics like the ones provided by 'strace', for instance, how many times a program called brk and other syscalls.

Yeah, seems reasonable. Probably process B got some memory from the kernel once, then was able to satisfy all its allocations from the free list, i.e. the free list never got big enough (or was too fragmented) for glibc's malloc implementation to decide to give any of the pages back to the kernel.
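
For instance, here is a minimal sketch of that "everything comes out of the free list" case (my own illustration, not code from this thread; the file name and the exact perf invocation are just assumptions). After glibc grows the heap once, the remaining iterations never have to enter the kernel, which you can check with something like perf trace -e brk ./small_alloc:

/* small_alloc.c (illustrative sketch, name is arbitrary):
 * repeatedly allocate and free a small block.  glibc grows the heap
 * (brk) once at the start; after that every malloc() should be
 * satisfied from the free list / tcache, with no further syscalls
 * needed for these allocations. */
#include <stdlib.h>

int main(void)
{
    for (int i = 0; i < 1000000; i++) {
        void *p = malloc(128);  /* small request: comes from the heap free list */
        if (!p)
            return 1;
        free(p);                /* goes back to the free list, not to the kernel */
    }
    return 0;
}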

It all comes down to allocation / deallocation patterns, and sizes of mappings. For large malloc requests, glibc uses mmap(MAP_ANONYMOUS) directly, so it can munmap it on free.
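
To make that visible, here is another hedged sketch (again my own example; the file name and the perf command are illustrative). mallopt(M_MMAP_THRESHOLD, ...) pins the threshold, since glibc otherwise raises it dynamically after freeing mmapped blocks; with the threshold pinned, each 1 MiB malloc/free pair should show up as an mmap/munmap pair under something like perf trace -e mmap,munmap ./big_alloc:

/* big_alloc.c (illustrative sketch, name is arbitrary):
 * allocations above the mmap threshold are mmapped directly and
 * handed straight back to the kernel on free(). */
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    mallopt(M_MMAP_THRESHOLD, 128 * 1024);  /* fix the threshold at 128 KiB */

    for (int i = 0; i < 100; i++) {
        char *p = malloc(1024 * 1024);      /* 1 MiB: well above the threshold */
        if (!p)
            return 1;
        memset(p, 1, 1024 * 1024);          /* touch the pages so the mapping is used */
        free(p);                            /* munmap: pages go back to the kernel */
    }
    return 0;
}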


 