linux 2.6.43, ext3, 10K RPM SAS disk: two sequential writes (direct IO) on different files acting like random writes

I have recently been stuck on this one problem:

"2 sequential write(direct io 4KB alignemnt block) on different file acting like random write, which yield poor write performance in 10K RPM SAS disk". “在不同文件上执行2次连续写入(直接IO 4KB对齐块),就像随机写入一样,这在10K RPM SAS磁盘中产生较差的写入性能”。

The thing that confuses me most: I have a batch of servers, all equipped with the same kind of disk (RAID 1 with two 300GB 10K RPM disks), but they respond differently.

  • Several servers seem OK with this kind of write pattern; the disk happily accepts up to 50+ MB/s (same kernel version, same filesystem, with a different lib: libc 2.4).
  • Others, not so much: 100 ops/s seems to hit the limit of the underlying disk, which matches the random-write performance of the disk (same kernel version, same filesystem, with a different lib: libc 2.12).

[NOTE: I checked the "pwrite" code of the different libc versions; it is nothing but a simple syscall wrapper.]

I have managed to rule out these possibilities:

  1. a software bug in my own program: a simple daemon (compiled with no dynamic linking) doing sequential direct IO writes shows the same behavior;
  2. a disk problem: I switched between 2 different versions of the Linux system on one test machine; it performed well with my direct IO write pattern, and a couple of days after switching back to the old lib version the bad random-write behavior returned.
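For reference, here is a minimal sketch of that kind of test writer (not the exact daemon; the file names, block count, and 4KB block size are illustrative): two strictly sequential O_DIRECT write streams on two different files, interleaved block by block.

    /* Minimal sketch: two sequential O_DIRECT write streams on two
     * different files, 4KB-aligned blocks. File names and sizes are
     * illustrative, not the original test program. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define BLOCK_SIZE 4096
    #define NUM_BLOCKS 25600            /* ~100 MB per file */

    int main(void)
    {
        const char *paths[2] = { "file_a.dat", "file_b.dat" };  /* hypothetical names */
        int fds[2];
        void *buf;

        /* O_DIRECT requires a suitably aligned user buffer. */
        if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0) {
            perror("posix_memalign");
            return 1;
        }
        memset(buf, 0xAB, BLOCK_SIZE);

        for (int i = 0; i < 2; i++) {
            fds[i] = open(paths[i], O_WRONLY | O_CREAT | O_DIRECT, 0644);
            if (fds[i] < 0) {
                perror("open");
                return 1;
            }
        }

        /* Each file is written strictly sequentially, but the two
         * streams are interleaved: the pattern described above. */
        for (long blk = 0; blk < NUM_BLOCKS; blk++) {
            for (int i = 0; i < 2; i++) {
                off_t off = (off_t)blk * BLOCK_SIZE;
                if (pwrite(fds[i], buf, BLOCK_SIZE, off) != BLOCK_SIZE) {
                    perror("pwrite");
                    return 1;
                }
            }
        }

        close(fds[0]);
        close(fds[1]);
        free(buf);
        return 0;
    }

Compiling it statically (e.g. gcc -O2 -static) takes the dynamically linked libc out of the picture, as described above.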

I tried to compare:

  1. /sys/block/sda/queue/*, which might differ between the two groups of machines (see the sketch after this list for dumping them);
  2. filefrag shows nothing unusual: the two different files grow with interleaved, sequentially increasing physical block IDs;
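As a rough sketch for item 1 (the device name sda is an assumption), something like this dumps every readable entry under /sys/block/sda/queue so the two groups of machines can be diffed:

    /* Dump every readable entry under /sys/block/sda/queue for
     * comparison between machines. */
    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        const char *dir = "/sys/block/sda/queue";
        DIR *d = opendir(dir);
        if (d == NULL) {
            perror("opendir");
            return 1;
        }

        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
            char path[512], line[256];
            snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);

            FILE *f = fopen(path, "r");
            if (f == NULL)
                continue;
            if (fgets(line, sizeof(line), f))   /* skips subdirectories */
                printf("%-24s %s", e->d_name, line);
            fclose(f);
        }
        closedir(d);
        return 0;
    }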

There must be some kind of write strategy leading to this problem, but I don't know where to start:

  1. A different kernel setting? Maybe related to how ext3 allocates disk blocks?
  2. The RAID cache (write-back) or the disk cache write strategy?
  3. Or the underlying disk's strategy for mapping logical blocks onto real physical blocks?

Really appreciate any help.

THE ANSWER IS: it is because of the /sys/block/sda/queue/scheduler setting:

  1. MACHINE A: the displayed scheduler is cfq, but underneath it actually behaves as deadline;
  2. MACHINE B: the scheduler is consistently cfq. //=> Since my server is a DB server, deadline is my best option.
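For completeness, a small sketch (device name sda assumed) that reads the active scheduler and, when run as root, switches it to deadline; the same thing is normally done from the shell by reading and echoing into /sys/block/sda/queue/scheduler.

    /* Read the active I/O scheduler for sda and, if permitted,
     * switch it to deadline. The device name is an assumption. */
    #include <stdio.h>

    int main(void)
    {
        const char *path = "/sys/block/sda/queue/scheduler";
        char current[256];

        FILE *f = fopen(path, "r");
        if (f == NULL) {
            perror(path);
            return 1;
        }
        if (fgets(current, sizeof(current), f))
            printf("current: %s", current);   /* active one shown in [brackets] */
        fclose(f);

        f = fopen(path, "w");                 /* needs root */
        if (f != NULL) {
            fputs("deadline\n", f);
            fclose(f);
            printf("switched to deadline\n");
        } else {
            perror("switching scheduler");
        }
        return 0;
    }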
