
Throughput falls from 4k to 9 messages per second with tuned-adm changes

I have a network client and server application. The data flow is such that the client sends a message to the server and the server responds with an acknowledgment. Only on receipt of the acknowledgment does the client send the next message.
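For reference, a minimal sketch of that request/acknowledgment ping-pong, assuming a plain TCP socket (the port, message size and single-threaded structure here are illustrative simplifications; the real client splits this work across its three threads):

// Sketch of the client's strict send/wait-for-ack loop.
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5000);                     // example port
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr); // server on same machine
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        perror("connect");
        return 1;
    }

    char msg[64] = "request";
    char ack[64];
    for (;;) {
        // Send one message, then block until the server's acknowledgment
        // arrives before sending the next one (strict ping-pong).
        if (send(fd, msg, sizeof(msg), 0) <= 0) break;
        if (recv(fd, ack, sizeof(ack), 0) <= 0) break;
    }
    close(fd);
    return 0;
}

Because each message waits for its acknowledgment, throughput is governed almost entirely by the round-trip latency between the two processes.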

The client application, written in C++, has 3 threads: a network thread (responsible for sending messages via the socket), a main thread (responsible for building request messages) and a timer thread (which fires every second).

The server application has 2 threads: a main thread and a network thread.

I run RHEL 6.3 with the 2.6.32-279 kernel.

Configuration 1

  1. tuned-adm profile latency-performance
  2. All of the client's threads on the same CPU core id
  3. All of the server's threads on the same CPU core id, but a different core id from the client's threads (see the affinity sketch after this configuration)
  4. Client and server running on the same machine

Throughput: 4500 messages per second
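Pinning each thread to a core can be done with pthread_setaffinity_np, a GNU extension available on RHEL 6. A minimal sketch (core id 0 is an arbitrary example; the actual thread creation is omitted):

// Sketch: pin the calling thread to a single CPU core.
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <cstdio>

static void pin_current_thread_to_core(int core_id) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    // Restrict the calling thread to the chosen core.
    int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (rc != 0)
        fprintf(stderr, "pthread_setaffinity_np failed: %d\n", rc);
}

int main() {
    pin_current_thread_to_core(0);  // e.g. all client threads -> core 0
    // ... create the network, main and timer threads here, pinning each one ...
    return 0;
}

Equivalently, a whole process can be confined to one core from the command line with taskset -c <core>.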

Configuration 2

  1. tuned-adm profile throughput-performance
  2. All of the client's threads on the same CPU core id
  3. All of the server's threads on the same CPU core id, but a different core id from the client's threads
  4. Client and server running on the same machine

Throughput: 9-15 messages per second

Configuration 3

  1. tuned-adm profile throughput-performance
  2. All of the client's threads on different CPU core ids
  3. All of the server's threads on different CPU core ids, and on different core ids from the client's threads
  4. Client and server running on the same machine

Throughput: 1100 messages per second

The machine has negligible load. Can someone explain the drop from 4k to 9 messages per second when the profile was switched from latency-performance to throughput-performance?

Here's a basic rundown of the differences between the RHEL tuned-adm profiles:

latency-performance shifts the I/O elevator to deadline and changes the CPU governor to the "performance" setting.

throughput-performance is optimized for network and disk performance. See the specifics below...

Your workload appears to be latency sensitive.

[image: comparison of tuned-adm profile settings]

Here's the setup for throughput-performance, with comments. latency-performance does not modify any of these.

# ktune sysctl settings for rhel6 servers, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns = 10000000

# SCHED_OTHER wake-up granularity.
# (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
#
# This option delays the preemption effects of decoupled workloads
# and reduces their over-scheduling. Synchronous workloads will still
# have immediate wakeup/sleep latencies.
kernel.sched_wakeup_granularity_ns = 15000000

# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up.  Workloads
# that mostly use file mappings may be able to use even higher values.
#
vm.dirty_ratio = 40
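To confirm which of these values are actually in effect after switching profiles, you can read them back from /proc/sys, where each sysctl key maps to a file path. A small sketch:

// Sketch: print the sysctls that the throughput-performance profile modifies.
#include <fstream>
#include <iostream>
#include <string>

static void print_sysctl(const std::string& key) {
    // e.g. "kernel.sched_min_granularity_ns" -> /proc/sys/kernel/sched_min_granularity_ns
    std::string path = "/proc/sys/" + key;
    for (std::string::size_type i = 0; i < path.size(); ++i)
        if (path[i] == '.') path[i] = '/';
    std::ifstream f(path.c_str());
    std::string value;
    std::getline(f, value);
    std::cout << key << " = " << value << "\n";
}

int main() {
    print_sysctl("kernel.sched_min_granularity_ns");
    print_sysctl("kernel.sched_wakeup_granularity_ns");
    print_sysctl("vm.dirty_ratio");
    return 0;
}

The same values can of course be checked directly with the sysctl command or by cat'ing the files above.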
