
Sorting pos/neg numbers with fractional parts using Unix sort

Using sort (coreutils) 5.2.1

I have the following file, which I'd like to sort by the numeric part of field 4 (the value after tag=). This can be a negative or positive number, and might also have the value INF.

field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=INF field5 field6

I would like this to be sorted as

field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6

Given that the number part of the field is at character position 4 (assuming the indexing starts at 0, and I'm not sure of this), I have tried sort with the following options:

  • sort -g -k4.4 inputfile
  • sort -g -k4.5 inputfile
  • sort -n -k4.4 inputfile
  • sort -n -k4.5 inputfile
  • sort -g inputfile

These all yield the following, which is close, but not quite right. The magnitudes are sorted correctly, but I'd like the most negative value on top.

field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6

How can I make sort behave?

FWIW, here's more information:

LANG = en_US.UTF-8
Red Hat Enterprise Linux WS release 4 (Nahant Update 6)

You could add a pre-processing awk step that appends a new field containing the numeric value from field 4, sort by that field, and then add a post-processing step to strip it. Note that in the example below, INF is replaced with an arbitrary high value of 10**10; you can set it higher if a number in the input naturally exceeds this value.

awk '{x=$4; sub("tag=", "", x); sub("INF", 10**10, x); print $0, x}' file.txt |
sort -k7,7g | 
cut -f-6 -d' '
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
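
As a variation on the same decorate/sort/undecorate idea, here is a minimal sketch that puts the sort key in front of each line, separated by a tab, so the final cut does not depend on the line having exactly six fields. The function name sort_by_tag and the stand-in value 1e10 for INF are illustrative assumptions, not part of the answer above, and the sketch assumes the input lines contain no tabs of their own.

```shell
# Hypothetical helper: sort the lines of file "$1" by the value after "tag=" in field 4.
sort_by_tag() {
  # Decorate: emit "<key><TAB><original line>".
  # INF is mapped to 1e10, an arbitrary large stand-in (an assumption; raise it if needed).
  awk '{
         key = $4
         sub(/^tag=/, "", key)
         if (key == "INF") key = "1e10"
         printf "%s\t%s\n", key, $0
       }' "$1" |
    # Sort on the first tab-separated field as a general numeric value.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1g |
    # Undecorate: drop the key and keep the original line.
    cut -f2-
}
```

It would be called as sort_by_tag file.txt. LC_ALL=C is set here only so that the decimal point is always read as ".".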

I am on a Mac, so the sort implementation may be slightly different, but I found that this works:

sort -gb -k 4.5,4 inputfile

In English: "sort, in a -g(eneral) numeric fashion, ignoring -b(lanks), the file inputfile using the 4th -k (column)'s data, from the 5th element in that column to the end of the data in the 4th column"

field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=5.77 field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=INF field5 field6
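
The role of -b can be checked with a small self-contained sketch. By default sort counts the blank in front of a field as part of that field, so without -b character 5 of field 4 is the "=" rather than the first digit; with -b, character 5 is the first character after "tag=". The temporary file and the variable names below are assumptions made for the demonstration, and it relies on a sort whose -g option reads INF as infinity (GNU sort does).

```shell
# Build a shortened copy of the sample input in a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
field1 field2 field3 tag=INF field5 field6
field1 field2 field3 tag=0.123 field5 field6
field1 field2 field3 tag=4.22 field5 field6
field1 field2 field3 tag=-1.92 field5 field6
field1 field2 field3 tag=-1.91 field5 field6
EOF

# -b skips the leading blank of field 4, so the key starts right after "tag=";
# the ",4" stops the key at the end of field 4.
# LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
sorted=$(LC_ALL=C sort -g -b -k 4.5,4 "$input")
printf '%s\n' "$sorted"

rm -f "$input"
```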
