Sorting a space delimited list with uneven spaces
I have a space-delimited list in which the first column (the name) contains a varying number of spaces. I want to reverse sort this by the first number that appears after the name string. I need to do this using bash commands.
Example:
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Would turn into:
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
I've tried running sort -nr to see what happens, and it reverse sorts the list, but only with respect to its alphabetical order. I want to sort based on all values.
The trick is that I must keep it space delimited. What's the best way to do this using bash?
"I must keep it space delimited"
You mean the result has to be space delimited again, right? During processing, you can transform the input however you like.
Assuming you know a character that never otherwise appears in your file, use sed to delimit the value you want to sort by with that character, then sort by that value, then remove the additional delimiters again.
Here we use a tab to delimit the sort key.
sed -E 's/ ([0-9]+\.[0-9]+) / \t\1\t /' | sort -t $'\t' -k2,2n | tr -d \\t
This is basically a Schwartzian transform.
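The three stages of that pipeline (decorate, sort, undecorate) can be sketched as a small bash function. The function name sort_by_first_decimal is made up for illustration, and the sketch assumes that the input contains no tabs and that the first token of the form digits.digits surrounded by spaces is the sort key. It passes a literal tab to sed instead of relying on \t in the replacement, and pins LC_ALL=C for sort so that the dot is treated as the decimal point; both are assumptions added here, not part of the one-liner above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads lines on stdin, writes them sorted to stdout.
sort_by_first_decimal() {
  local tab=$'\t'
  # 1. decorate:   wrap the first " N.N " token in tabs so it becomes field 2
  # 2. sort:       numeric sort on that tab-delimited field only
  # 3. undecorate: delete the tabs, restoring the original single spaces
  sed -E "s/ ([0-9]+\.[0-9]+) / ${tab}\1${tab} /" |
    LC_ALL=C sort -t "$tab" -k2,2n |
    tr -d '\t'
}
```

It could be called as sort_by_first_decimal < file. Changing -k2,2n to -k2,2nr should give descending order instead.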
Here's a short Ruby program:
ruby -e '
  puts IO.readlines(ARGV.shift, chomp: true)
    .map {|line|
      fields = line.split
      # the last 8 fields are data; everything before them is the name
      [fields[0..(fields.size - 9)].join(" ")] + fields[-8 .. -1]
    }
    .sort_by {|row| row[1].to_f}   # compare numerically, not as strings
    .map {|row| row.join(" ")}
    .join("\n")
' file
I would use GNU AWK for this as follows. Let file.txt content be
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
then
awk 'BEGIN{FPAT="[0-9]*[.][0-9]*";PROCINFO["sorted_in"]="@ind_num_asc"}{arr[$1]=$0}END{for(i in arr){print arr[i]}}' file.txt
output
Oldsmobile Omega 11.0 8 350.0 180.0 3664. 11.0 73 US
Oldsmobile Delta 88 Royale 12.0 8 350.0 160.0 4456. 13.5 72 US
Pontiac Firebird 19.0 6 250.0 100.0 3282. 15.0 71 US
AMC Gremlin 20.0 6 232.0 100.0 2914. 16.0 75 US
AMC Gremlin 21.0 6 199.0 90.00 2648. 15.0 70 US
Pontiac Lemans V6 21.5 6 231.0 115.0 3245. 15.4 79 US
Pontiac J2000 SE Hatchback 31.0 4 112.0 85.00 2575. 16.2 82 US
Explanation: I inform GNU AWK that a field is 0 or more digits followed by a literal dot ( [.] ) followed by 0 or more digits (note: I assume that there will always be a dot in the first number and never a dot in the name column), and that array traversal should treat indices as numbers in ascending order ( @ind_num_asc ), which is one of the Predefined Array Scanning Orders. For each line I add a pair to the array, with the key being the first number ( $1 ) and the value being the whole line ( $0 ). After going through all lines, I print the values from array arr in the order given by the selected array traversal.
(tested in gawk 4.2.1)
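One thing to watch with an array keyed on the first number is that two lines sharing the same number would occupy the same index, so only the later one is kept. A possible variant, sketched here under the same assumption about where the dot appears, extracts the key with match() instead of FPAT and leaves the ordering to sort, which keeps such lines and should also work in awks other than gawk. The function name sort_by_first_decimal_awk is hypothetical, and the input is assumed to contain no tabs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads lines on stdin, writes them sorted to stdout.
sort_by_first_decimal_awk() {
  # Prefix every line with "<key><TAB>", where key is the first run of
  # digits followed by a dot and optional digits (empty if none is found).
  awk '{
         key = ""
         if (match($0, /[0-9]+[.][0-9]*/)) key = substr($0, RSTART, RLENGTH)
         printf "%s\t%s\n", key, $0
       }' |
    # Sort numerically on the key, then drop the key column again.
    LC_ALL=C sort -t "$(printf '\t')" -k1,1n |
    cut -f2-
}
```

It could be called as sort_by_first_decimal_awk < file.txt. Lines with equal keys are all kept; their relative order falls back to sort's whole-line comparison.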