
Why is time spent servicing softirqs increased on kernel 4.1 compared with kernel 3.4?

We found a performance degradation issue when uplifting Linux from kernel 3.4 to kernel 4.1. It seems to be caused by the new kernel spending much more time servicing softirqs.

We ran a test using a traffic generator (IXIA) that produces GTP-C packets (over UDP) and a simple application that sends each received packet back; it just interchanges the source and destination IP addresses. We got the following results using the mpstat utility:
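
For reference, an equivalent of such an echo application can be sketched with socat (the GTP-C port 2123 is an assumption; replying a UDP datagram to the packet's source has the same effect as swapping the source and destination IP addresses):

# Echo every received UDP datagram back to its sender, pinned to CPU 1
taskset -c 1 socat UDP4-RECVFROM:2123,fork PIPE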

  • kernel 3.4
# mpstat -P ALL 60 1
Linux 3.4.91-grsec (...)    12/22/17    _x86_64_    (20 CPU)

15:58:43     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
15:58:53     all    0.26    0.00    2.21    0.00    0.00    0.41    0.00    0.00   97.12
15:58:53       0    1.03    0.00    3.79    0.00    0.00   12.91    0.00    0.00   82.27
15:58:53       1    4.18    0.00   41.44    0.00    0.00    0.00    0.00    0.00   54.38
15:58:53       2    0.30    0.00    0.30    0.00    0.00    0.40    0.00    0.00   99.00
...

# mpstat -I ALL -P 0 60 1
Linux 3.4.91-grsec (...)    12/25/17    _x86_64_    (20 CPU)

10:53:16     CPU    intr/s
10:54:16       0      0.00

10:53:16     CPU        0/s        3/s        4/s        8/s        9/s       67/s       68/s       69/s       70/s       71/s       72/s       73/s       74/s       75/s       76/s       77/s       78/s       79/s       80/s       81/s       82/s       83/s       84/s      103/s      104/s      105/s      106/s      107/s      108/s      109/s      110/s      111/s      112/s      113/s      114/s      115/s      116/s      117/s      118/s      119/s      120/s      121/s      122/s      123/s      124/s      125/s      126/s      127/s      128/s      129/s      130/s      131/s      132/s      133/s      134/s      135/s      136/s      137/s      138/s      139/s      140/s      141/s      142/s      143/s      144/s      NMI/s      LOC/s      SPU/s      PMI/s      IWI/s      RTR/s      RES/s      CAL/s      TLB/s      TRM/s      THR/s      MCE/s      MCP/s      ERR/s      MIS/s
10:54:16       0       0.00       5.50       0.00       0.00       0.00       0.00      71.59       0.50       1.60       6.47       1.50      16.66       0.50       1.50       0.00      71.59       0.50       0.60       6.47       1.50      16.66       1.50       1.50     126.91   17445.01       9.93       1.87       1.28       1.25       1.27       4.82       0.83       1.37       1.58       7.45      14.38       3.45   18060.49       2.18       0.97       1.00       0.58       0.83       0.00      87.70   17503.72       7.62       1.22       0.75       0.67       0.98       2.98       0.53       1.87       1.58       1.50       6.62   18028.71       1.53       0.53       0.50       0.50       0.50       0.50       0.00       0.40     174.64       0.00       0.40       0.00       0.00     282.10       0.00       0.92       0.00       0.00       0.00       0.00       0.00       0.00

10:53:16     CPU       HI/s    TIMER/s   NET_TX/s   NET_RX/s    BLOCK/s BLOCK_IOPOLL/s  TASKLET/s    SCHED/s  HRTIMER/s      RCU/s
10:54:16       0       0.00     166.49       0.25   62813.75       0.00           0.00       0.00     298.20       0.87     130.00
...
  • kernel 4.1
# mpstat -P ALL 60 1
Linux 4.1.21 (...)  12/22/17    _x86_64_    (20 CPU)

16:19:12     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
16:19:22     all    0.28    0.00    2.23    0.00    0.00    2.65    0.00    0.00   94.84
16:19:22       0    1.40    0.00    2.59    0.00    0.00   52.79    0.00    0.00   43.21
16:19:22       1    3.66    0.00   42.47    0.00    0.00    0.00    0.00    0.00   53.87
16:19:22       2    0.10    0.00    0.20    0.00    0.00    0.00    0.00    0.00   99.70
...

# mpstat -I ALL -P 0 60 1
Linux 4.1.21 (...)  12/25/17    _x86_64_    (20 CPU)

10:35:45     CPU    intr/s
10:36:45       0      0.00

10:35:45     CPU        0/s        3/s        4/s        8/s        9/s       31/s       32/s       33/s       34/s       35/s       36/s       37/s       38/s       39/s       41/s       42/s       43/s       44/s       45/s       46/s       47/s       48/s       49/s       69/s       70/s       71/s       72/s       73/s       74/s       75/s       76/s       77/s       78/s       79/s       80/s       81/s       82/s       83/s       84/s       85/s       86/s       87/s       88/s       89/s       91/s       92/s       93/s       94/s       95/s       96/s       97/s       98/s       99/s      100/s      101/s      102/s      103/s      104/s      105/s      106/s      107/s      108/s      109/s      110/s      111/s      NMI/s      LOC/s      SPU/s      PMI/s      IWI/s      RTR/s      RES/s      CAL/s      TLB/s      TRM/s      THR/s      MCE/s      MCP/s      ERR/s      MIS/s
10:36:45       0       0.00       5.43       0.00       0.00       0.00       0.00      81.55       0.50       0.60       2.50       0.50      12.63       0.50       2.00       0.00      81.55       0.50       0.60       2.50       0.50      12.63       0.50       1.50     134.87   16949.32       3.25       2.62       3.55       1.32   17720.62       7.03       7.35       3.58       1.63       6.73       3.53       4.48       5.33       1.37       1.42       0.83      10.58       0.52       0.00      90.33   16160.05       3.72       2.07       0.73       0.82       2.53   17669.83       7.23       2.17       1.50       0.50       1.57       0.50       0.50       0.50       0.50       0.50      10.50       0.50       0.00       0.05    1005.90       0.00       0.05       0.00       0.00       4.05       0.00       1.85       0.00       0.00       0.00       0.02       0.00       0.00

10:35:45     CPU       HI/s    TIMER/s   NET_TX/s   NET_RX/s    BLOCK/s BLOCK_IOPOLL/s   TASKLET/s    SCHED/s  HRTIMER/s      RCU/s
10:36:45       0       0.00    1000.05       2.93   56777.58       0.00            0.00       0.57     285.15       0.00    1313.05
...

The test conditions:

  1. Traffic rate: 50000 fps
  2. The application is bound to CPU 1 using the taskset command
  3. All the interrupts related to the Ethernet interfaces are bound to CPU 0 (a sketch of this pinning follows the list).
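
A rough sketch of the pinning in condition 3 (the interface name eth0 and the IRQ number 67 are assumptions; repeat the affinity write for every IRQ listed for the interface):

# List the IRQs that belong to the Ethernet interface
grep eth0 /proc/interrupts
# smp_affinity takes a hex CPU bitmask; 1 selects CPU 0
echo 1 > /proc/irq/67/smp_affinity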

As we can see, "%soft" increased significantly on CPU 0: 12.91% -> 52.79%. However, NET_RX/s decreased: 62813.75 -> 56777.58.

We also tried to profile CPU 0 using the perf utility. Unfortunately, no clue was found.

  • kernel 3.4
# perf stat -C 0 -e irq:softirq_entry,irq:softirq_exit,irq:softirq_raise -a sleep 60

 Performance counter stats for 'sleep 60':

           3690739 irq:softirq_entry                                            [100.00%]
           3691328 irq:softirq_exit                                             [100.00%]
           4366035 irq:softirq_raise                                           

      60.020019821 seconds time elapsed
# perf stat -C 0 -d -a sleep 60

 Performance counter stats for 'sleep 60':

      59978.442551 task-clock                #    0.999 CPUs utilized           [100.00%]
            138638 context-switches          #    0.002 M/sec                   [100.00%]
              4206 CPU-migrations            #    0.070 K/sec                   [100.00%]
             91404 page-faults               #    0.002 M/sec                  
       49824470562 cycles                    #    0.831 GHz                     [49.34%]
       32279104677 stalled-cycles-frontend   #   64.79% frontend cycles idle    [48.20%]
    stalled-cycles-backend  
       41765280058 instructions              #    0.84  insns per cycle        
                                             #    0.77  stalled cycles per insn [62.35%]
        6939461584 branches                  #  115.699 M/sec                   [62.90%]
          69255081 branch-misses             #    1.00% of all branches         [61.90%]
       13778063320 L1-dcache-loads           #  229.717 M/sec                   [63.72%]
         757751494 L1-dcache-load-misses     #    5.50% of all L1-dcache hits   [63.85%]
         153774796 LLC-loads                 #    2.564 M/sec                   [50.09%]
    LLC-load-misses         

      60.065977609 seconds time elapsed
  • kernel 4.1
# perf stat -C 0 -e irq:softirq_entry,irq:softirq_exit,irq:softirq_raise -a sleep 60

 Performance counter stats for 'CPU(s) 0':

           3540101      irq:softirq_entry                                             (100.00%)
           3540380      irq:softirq_exit                                              (100.00%)
           4365512      irq:softirq_raise                                           

      60.061923392 seconds time elapsed

# perf stat -C 0 -d -a sleep 60

 Performance counter stats for 'CPU(s) 0':

      60105.618981      task-clock (msec)         #    1.001 CPUs utilized            (100.00%)
            112358      context-switches          #    0.002 M/sec                    (100.00%)
              3042      cpu-migrations            #    0.051 K/sec                    (100.00%)
             23077      page-faults               #    0.384 K/sec                  
       69596616908      cycles                    #    1.158 GHz                      (44.44%)
       47063269010      stalled-cycles-frontend   #   67.62% frontend cycles idle     (44.43%)
         stalled-cycles-backend   
       47882140206      instructions              #    0.69  insns per cycle        
                                                  #    0.98  stalled cycles per insn  (55.54%)
        8644786229      branches                  #  143.827 M/sec                    (55.54%)
          82066359      branch-misses             #    0.95% of all branches          (55.55%)
       15150062571      L1-dcache-loads           #  252.057 M/sec                    (44.45%)
         891694267      L1-dcache-load-misses     #    5.89% of all L1-dcache hits    (22.23%)
         192155955      LLC-loads                 #    3.197 M/sec                    (22.23%)
            148469      LLC-load-misses           #    0.08% of all LL-cache hits     (33.33%)

      60.062744860 seconds time elapsed
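
One further step that could narrow this down (a sketch, not something we have run) is to sample CPU 0 with call graphs on both kernels and compare where the NET_RX time actually goes:

# Sample CPU 0 for 60 seconds, recording call chains
perf record -C 0 -g -- sleep 60
# Compare the cost of net_rx_action and its callees between kernels
perf report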

Maybe someone has faced a similar issue. Any advice and suggestions will be greatly appreciated.

Had a similar issue when upgrading from the 3.16 to the 4.9 kernel (i.e. Debian Jessie to Debian Stretch).

Network performance for UDP dropped significantly.

We did not have any flash of genius that would have helped us find the actual culprit.

However, we did some tuning and came pretty close to the previous performance:

1) Using RSS. Assign the CPU cores statically to the NIC's receive queues and stop the irqbalance service. In our case RPS and RFS did not improve performance any further over what RSS did, but they might be useful for you, depending on your hardware.
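
A sketch of that static assignment (the interface name, queue count, and IRQ numbers are assumptions; ethtool -L only works on a multi-queue NIC):

# Stop irqbalance so the manual affinities are not overwritten
service irqbalance stop
# Inspect and set the number of RSS receive queues
ethtool -l eth0
ethtool -L eth0 combined 4
# Pin each queue's IRQ to its own core (bitmask: 1 = CPU 0, 2 = CPU 1, ...)
echo 1 > /proc/irq/67/smp_affinity
echo 2 > /proc/irq/68/smp_affinity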

2) Reduce the interval at which the scheduler becomes active. I simply used the parameters from this article.
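
The knobs in question are the CFS granularity sysctls; the values below are placeholders, not the article's numbers:

# Let the scheduler act at shorter intervals (placeholder values)
sysctl -w kernel.sched_min_granularity_ns=2000000
sysctl -w kernel.sched_wakeup_granularity_ns=3000000
sysctl -w kernel.sched_migration_cost_ns=250000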

Some further literature

Possibly the cause is the mitigation of the Spectre and Meltdown vulnerabilities. Try passing "nospectre_v2 nopti nopcid" as boot parameters for your kernel. Beware that these changes are a potential security hole.
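
On a Debian-style system this could be applied roughly as follows (a sketch; adjust for your bootloader):

# Append the parameters to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet nospectre_v2 nopti nopcid"
update-grub
reboot
# On kernels that expose it, verify which mitigations remain active
grep . /sys/devices/system/cpu/vulnerabilities/*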
