Sorting the output of ls -alh by size without using the -S option

Question

I have a script that recursively goes through directories and appends the result of running ls -alh --block-size=KB | grep ^\\- to a file. I then need to sort the resulting file by decreasing file size, in the same way that the -S option would have if it had been used at the point where ls was called.

Answer 1

The many issues with trying to parse ls are covered well in Why you shouldn't parse the output of ls; see also Fixing Unix/Linux/POSIX Filenames for an idea of what others have tried before you.

Some additional reasons that your approach will not work reliably:

If you recurse over a device boundary, some versions of ls may add a column to show the new device ID and throw off your sorting and parsing;
You are using sed to remove the kB / mB / gB magnitude from the output of ls -h. That will sort a 2 byte file, a 2 kilobyte file and a 2 megabyte file all together as the same size;
The output of ls changes depending on whether you pipe it or display it at the terminal, which also changes the logic of the parsing / sorting.

The solution is to use a glob and sort based on a column added to the output of ls.

We can use dd to create a list of test files of some known sizes:

dd if=/dev/zero of=A  bs=2  count=1
dd if=/dev/zero of=B  bs=1024  count=2
dd if=/dev/zero of=C  bs=1024  count=3
dd if=/dev/zero of=D  bs=1024  count=150
dd if=/dev/zero of=E  bs=1024  count=2000

Resulting in:

$ ls -lh *
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E

If you sort the output of ls with the -S switch:

$ ls -lhS *
-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A

Your approach would remove the M, K or B in column five and then sort on what is left. That would result in A, B and E sorting together, since all three are reduced to the number 2.
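To see the collapse concretely, here is a minimal sketch that feeds just the size and name columns from the listing above through a suffix-stripping sed and a numeric sort. The sed expression is an assumption standing in for whatever the original script used, and LC_ALL=C is only there to make the ordering reproducible.

```shell
# Size and name columns copied from the ls -lh listing above.
# The sed expression is a stand-in for the original script's suffix removal.
stripped_sorted=$(
    printf '%s\n' '2B A' '2.0K B' '3.0K C' '150K D' '2.0M E' |
        sed 's/^\([0-9.]*\)[BKMG]/\1/' |
        LC_ALL=C sort -k1,1nr
)
printf '%s\n' "$stripped_sorted"
```

D and C come out first, and A, B and E end up next to each other with numerically equal keys (2, 2.0 and 2.0), even though their real sizes differ by a factor of about a million.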

(It is possible to crudely sort the output of ls like so:

$ ls -al | grep ^\- | sort -nrk 5
-rw-r--r--   1 andrew  wheel  2048000 Jan  8 20:52 E
-rw-r--r--   1 andrew  wheel   153600 Jan  8 20:52 D
-rw-r--r--   1 andrew  wheel     3072 Jan  8 20:52 C
-rw-r--r--   1 andrew  wheel     2048 Jan  8 20:52 B
-rw-r--r--   1 andrew  wheel        2 Jan  8 20:52 A

but that does not produce the -h output that you have...)

The correct way to do this is to use a Decorate / Sort / Undecorate pattern with a glob: prefix each line with the exact byte count, sort on that prefix, then cut it off again.

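for fn in *; do
    [ -f "$fn" ] || continue           # skip anything that is not a regular file
    c1=$(($(wc -c < "$fn")))           # $(( )) strips the padding some wc versions add
    c2=$(ls -alh "$fn")
    printf "%s\t%s\n" "$c1" "$c2"      # decorate: byte count, tab, ls line
done | sort -nrk 1 | cut -f 2-         # sort on the byte count, then undecorate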

Result:

-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A

Which is the same as using ls -lhS.

If you are recursing a file tree and writing to a file, the general methodology is the same.
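As a rough sketch of the recursive case, the same Decorate / Sort / Undecorate steps can be driven by find instead of a glob. The function name list_by_size is hypothetical, and the sketch assumes no file name contains a newline (a tab is fine, since cut -f 2- keeps everything after the first tab).

```shell
# list_by_size DIR: print an ls -lh line for every regular file under DIR,
# largest first. Hypothetical helper, not part of the original script.
list_by_size() {
    find "$1" -type f -exec sh -c '
        for fn in "$@"; do
            size=$(($(wc -c < "$fn")))                  # exact size in bytes
            printf "%s\t%s\n" "$size" "$(ls -lh "$fn")" # decorate
        done' sh {} + |
        sort -k1,1nr |   # sort on the byte count, descending
        cut -f 2-        # undecorate
}
```

Calling list_by_size . > files.tmp would then write the whole tree to a file already sorted, so no second pass over the human-readable sizes is needed.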

Answer 2

My solution, which is good enough for my purposes, though the accepted answer is much better:

sed 's/kB//' files.tmp > files1.tmp #remove first instance of "kB" from each line
sed 's/ \+/ /g' files1.tmp > files2.tmp #replace all multiple spaces with single space
sort -k 5n,5 files2.tmp | tac > files3.tmp #sort by numeric file size and reverse

This only works because ls was given the --block-size=KB option, so every size is reported in the same unit (kB) and the numbers stay comparable once the suffix is removed.
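The three steps can also be chained into one pipeline without the intermediate files. This is only a sketch: sort_listing is a hypothetical helper name, tr -s stands in for the second sed, and sort -k5,5nr replaces the sort | tac pair, so lines with equal sizes may come out in a different order.

```shell
# sort_listing FILE: sort saved "ls -alh --block-size=KB" lines by size,
# largest first. Hypothetical helper; assumes column five looks like "150kB".
sort_listing() {
    sed 's/kB//' "$1" |   # remove first instance of "kB" from each line
        tr -s ' ' |       # squeeze runs of spaces into a single space
        sort -k5,5nr      # numeric sort on column five, descending
}
```

For example, sort_listing files.tmp > files3.tmp would produce the same kind of file as the last step above.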

Answer 1
1 Accepted 2017-01-09 05:08:40

Answer 2
0 2017-01-09 11:23:02
