
Sort values of output of ls -alh without using -S option

I have a script that recursively goes through directories and appends the result of running ls -alh --block-size=KB | grep ^\\- to a file. I then need to sort the resulting file by decreasing file size, in the same way that the -S option would if it were used at the point where ls was called.

The many issues with trying to parse ls are covered well in "Why you shouldn't parse the output of ls". See also "Fixing Unix/Linux/POSIX Filenames" for an idea of what others have tried before you.

Some additional reasons that your approach will not work reliably:

  1. If you recurse over a device boundary, some versions of ls may add a column to show the new device ID and throw off your sorting and parsing;
  2. You are using sed to remove the kB / MB / GB magnitude suffix from the output of ls -h. That sorts a 2-byte file, a 2-kilobyte file and a 2-megabyte file together as if they were the same size.
  3. The output of ls changes depending on whether it is piped or displayed at the terminal, which also changes the logic of the parsing and sorting.

The solution is to use a glob and sort on a column added to the output of ls.

We can use dd to create a set of test files of known sizes:

dd if=/dev/zero of=A  bs=2  count=1
dd if=/dev/zero of=B  bs=1024  count=2
dd if=/dev/zero of=C  bs=1024  count=3
dd if=/dev/zero of=D  bs=1024  count=150
dd if=/dev/zero of=E  bs=1024  count=2000

Resulting in:

$ ls -lh *
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E

If you sort the output of ls with the -S switch:

$ ls -lhS *
-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A

Your approach would remove the M, K or B in column five and then sort on what is left. That would result in A, B and E sorting together, since each of them is reduced to a size of 2.
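To make the collision concrete, here is a small sketch that feeds the five sizes shown by ls -lh above through a unit-stripping sed and a numeric sort. The function name strip_and_sort and the "size name" input format are made up for this illustration; they are not part of the original script.

```shell
# strip_and_sort is a hypothetical helper: it reads "SIZE NAME" lines,
# drops the B/K/M/G suffix from SIZE, and sorts numerically, largest first.
strip_and_sort() {
    sed 's/^\([0-9.]*\)[BKMG] /\1 /' | sort -nrk 1,1
}

# Sizes as printed by ls -lh for the test files A..E.
printf '%s\n' '2B A' '2.0K B' '3.0K C' '150K D' '2.0M E' | strip_and_sort
```

The 150K file D comes out on top, ahead of the 2.0M file E, and A, B and E all compare as the number 2, so their relative order says nothing about their real sizes.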


(It is possible to crudely sort the output of ls like so:

$ ls -al | grep ^\- | sort -nrk 5
-rw-r--r--   1 andrew  wheel  2048000 Jan  8 20:52 E
-rw-r--r--   1 andrew  wheel   153600 Jan  8 20:52 D
-rw-r--r--   1 andrew  wheel     3072 Jan  8 20:52 C
-rw-r--r--   1 andrew  wheel     2048 Jan  8 20:52 B
-rw-r--r--   1 andrew  wheel        2 Jan  8 20:52 A

but that does not produce the -h style output that you have...)


The correct way to do this is to use a Decorate / Sort / Undecorate pattern with a glob.

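for fn in *; do
    [ -f "$fn" ] || continue            # regular files only
    c1=$(($(wc -c < "$fn")))            # decorate: exact size in bytes
    c2=$(ls -alh "$fn")                 # the human-readable line to keep
    printf "%s\t%s\n" "$c1" "$c2"
done | sort -nrk 1,1 | cut -f 2-        # sort on the byte count, then undecorate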

Result:

-rw-r--r--  1 andrew  wheel   2.0M Jan  8 20:52 E
-rw-r--r--  1 andrew  wheel   150K Jan  8 20:52 D
-rw-r--r--  1 andrew  wheel   3.0K Jan  8 20:52 C
-rw-r--r--  1 andrew  wheel   2.0K Jan  8 20:52 B
-rw-r--r--  1 andrew  wheel     2B Jan  8 20:52 A

Which is the same as using ls -lhS.

If you are recursing a file tree and writing to a file, the general methodology is the same.
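As a rough sketch of what that could look like, the same decorate, sort and undecorate steps can be driven by find instead of a glob. The function name list_by_size is hypothetical, and this assumes no file name in the tree contains a newline, since the sort is line based.

```shell
# list_by_size is a hypothetical helper name; $1 is the directory to walk.
list_by_size() {
    find "$1" -type f -exec sh -c '
        for fn do
            size=$(($(wc -c < "$fn")))      # decorate: exact size in bytes
            printf "%s\t%s\n" "$size" "$(ls -alh "$fn")"
        done
    ' sh {} + | sort -nrk 1,1 | cut -f 2-   # sort on bytes, then undecorate
}
```

Calling list_by_size . > /tmp/files.sorted (the output path is just a placeholder) writes one ls -alh line per regular file, largest first, across the whole tree rather than per directory. Writing the report outside the tree being walked keeps it out of its own listing.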

My solution, which is good enough for my purposes, though the accepted answer is much better:

sed 's/kB//' files.tmp > files1.tmp #remove first instance of "kB" from each line
sed 's/ \+/ /g' files1.tmp > files2.tmp #replace all multiple spaces with single space
sort -k 5n,5 files2.tmp | tac > files3.tmp #sort by numeric file size and reverse

This only works due to giving the --block-size=KB option to ls.
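For what it is worth, the three steps can also be chained into a single pipeline without the intermediate .tmp files. This is only a sketch: the function name sort_kb_listing is made up, tr -s stands in for the second sed, and sort's reverse flag stands in for tac, so lines with equal sizes may come out in a different order than with the commands above.

```shell
# sort_kb_listing is a hypothetical helper: it reads ls -l style lines whose
# fifth column ends in "kB" and prints them largest first.
sort_kb_listing() {
    sed 's/kB//' |        # remove the first "kB" on each line
    tr -s ' ' |           # squeeze runs of spaces into one
    sort -k 5,5nr         # numeric sort on column five, descending
}
```

It would be used as sort_kb_listing < files.tmp > files3.tmp. Like the original, it relies on every size being expressed in the same unit.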
