Sort values of output of ls -alh without using -S option
I have a script that recursively goes through directories and appends the result of running
ls -alh --block-size=KB | grep ^\\-
to a file. I then need to sort the resulting file by decreasing file size, in the same way that the -S option would have done if it had been passed to ls at the point where ls was called.
The many issues with trying to parse ls are covered well in Why you shouldn't parse the output of ls and Fixing Unix/Linux/POSIX Filenames, which give an idea of what others have tried before you.
Some additional reasons that your approach will not work reliably:
- If you recurse across a device boundary, some versions of ls may add a column to show the new device ID and throw off your sorting and parsing.
- You are using sed to remove the kB / mB / gB magnitude from the output of ls -h. That will sort a 2 byte file, a 2 kilobyte file and a 2 megabyte file all together as the same size.
- The output of ls changes depending on whether you pipe it or display it at the terminal, which also changes the logic of the parsing / sorting.
The solution is to use a glob and sort based on a column added to the output of ls.
We can use dd to create a set of test files of known sizes:
dd if=/dev/zero of=A bs=2 count=1
dd if=/dev/zero of=B bs=1024 count=2
dd if=/dev/zero of=C bs=1024 count=3
dd if=/dev/zero of=D bs=1024 count=150
dd if=/dev/zero of=E bs=1024 count=2000
Resulting in:
$ ls -lh *
-rw-r--r-- 1 andrew wheel 2B Jan 8 20:52 A
-rw-r--r-- 1 andrew wheel 2.0K Jan 8 20:52 B
-rw-r--r-- 1 andrew wheel 3.0K Jan 8 20:52 C
-rw-r--r-- 1 andrew wheel 150K Jan 8 20:52 D
-rw-r--r-- 1 andrew wheel 2.0M Jan 8 20:52 E
If you sort the output of ls with the -S switch:
$ ls -lhS *
-rw-r--r-- 1 andrew wheel 2.0M Jan 8 20:52 E
-rw-r--r-- 1 andrew wheel 150K Jan 8 20:52 D
-rw-r--r-- 1 andrew wheel 3.0K Jan 8 20:52 C
-rw-r--r-- 1 andrew wheel 2.0K Jan 8 20:52 B
-rw-r--r-- 1 andrew wheel 2B Jan 8 20:52 A
Your approach would remove the M, K or B in column five and then sort on what is left. That would result in A, B and E sorting together, because each of them is left with the value 2.
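The collision can be reproduced without touching the file system. The following is only a minimal sketch: strip_suffix is a hypothetical helper standing in for the sed step, and the three input lines are the size and name columns for A, B and E taken from the listing above.

```shell
#!/bin/sh
# Hypothetical stand-in for the sed step: drop the unit letter that
# follows the leading number (2B -> 2, 2.0K -> 2.0, 2.0M -> 2.0).
strip_suffix() {
    sed 's/^\([0-9.]*\)[BKMG]/\1/'
}

# Size and name of the test files A, B and E as printed by ls -lh.
# After stripping, all three keys compare equal under a numeric sort,
# so the sort can no longer tell 2 bytes from 2 megabytes.
printf '%s\n' '2B A' '2.0K B' '2.0M E' | strip_suffix | sort -k1,1n
```

Once the unit is gone, nothing in the remaining key records the magnitude, so no choice of sort options can recover the real order.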
(It is possible to crudely sort the output of ls like so:
$ ls -al | grep ^\- | sort -nrk 5
-rw-r--r-- 1 andrew wheel 2048000 Jan 8 20:52 E
-rw-r--r-- 1 andrew wheel 153600 Jan 8 20:52 D
-rw-r--r-- 1 andrew wheel 3072 Jan 8 20:52 C
-rw-r--r-- 1 andrew wheel 2048 Jan 8 20:52 B
-rw-r--r-- 1 andrew wheel 2 Jan 8 20:52 A
but that does not produce the -h style output that you have...)
The correct way to do this is to use a Decorate / Sort / Undecorate pattern with a glob.
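for fn in *; do
    [ -f "$fn" ] || continue        # regular files only
    c1=$(($(wc -c < "$fn")))        # decorate: exact size in bytes
    c2=$(ls -alh "$fn")             # the human-readable line to keep
    printf "%s\t%s\n" "$c1" "$c2"
done | sort -nrk 1 | cut -f 2-      # sort on the byte count, then undecorate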
Result:
-rw-r--r-- 1 andrew wheel 2.0M Jan 8 20:52 E
-rw-r--r-- 1 andrew wheel 150K Jan 8 20:52 D
-rw-r--r-- 1 andrew wheel 3.0K Jan 8 20:52 C
-rw-r--r-- 1 andrew wheel 2.0K Jan 8 20:52 B
-rw-r--r-- 1 andrew wheel 2B Jan 8 20:52 A
Which is the same as using ls -lhS.
If you are recursing a file tree and writing to a file, the general methodology is the same.
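As a rough sketch of the recursive case, the same decorate / sort / undecorate steps can be driven by find instead of a glob. The function name list_by_size is hypothetical, and the sketch assumes an ls that accepts -h and file names that contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print the ls -lh line of every regular file
# under a directory tree, largest file first.
list_by_size() {
    dir=${1:-.}
    # Decorate: prefix each ls line with the exact byte count and a tab.
    find "$dir" -type f -exec sh -c '
        for fn in "$@"; do
            size=$(($(wc -c < "$fn")))
            printf "%s\t%s\n" "$size" "$(ls -lh "$fn")"
        done' sh {} + |
    # Sort: numeric, descending, on the byte count only.
    sort -t "$(printf '\t')" -k1,1nr |
    # Undecorate: drop the byte count and keep the ls line.
    cut -f 2-
}
```

Calling list_by_size /some/dir > files.sorted would write the sorted listing to a file. Because the sort key is the byte count rather than the human-readable column, the order does not depend on how ls formats sizes.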
My solution, which is good enough for my purposes, though the accepted answer is much better:
sed 's/kB//' files.tmp > files1.tmp #remove first instance of "kB" from each line
sed 's/ \+/ /g' files1.tmp > files2.tmp #replace all multiple spaces with single space
sort -k 5n,5 files2.tmp | tac > files3.tmp #sort by numeric file size and reverse
This only works because the --block-size=KB option was given to ls, so every size is printed in the same unit (kB).