Editor's note: The original title of the question mentioned tabs as the field separators.
In a text such as
500 east 23rd avenue Toronto 2 890 400000 1
900 west yellovillage blvd Mississauga 3 800 600090 3
how would you sort in ascending order of the second to last column?
Editor's note: The OP later provided another sample input line, 500 Jackson Blvd Toronto 3 700 40000 2, which contains only 8 whitespace-separated input fields (compared to the 9 above), revealing the need to deal with a variable number of fields in the input.
Note: There are several, potentially separate questions:
Update: Question C was the relevant one.
Question A: As implied by the question's title only: how can you use the tab character (\t) as the field separator?
Question B: How can you sort input by the second-to-last field, without knowing that field's specific index up front, given a fixed number of fields?
Question C: How can you sort input by the second-to-last field, without knowing that field's respective index up front, given a variable number of fields?
Answer to question A:
sort's -t option allows you to specify a field separator. By default, sort treats the transition from a non-blank to a blank character as the separator, so any run of line-interior whitespace separates fields (the leading blanks become part of the field that follows).
Assuming Bash, Ksh, or Zsh, you can use an ANSI C-quoted string ($'...') to specify a single tab as the field separator ($'\t'):
sort -t $'\t' -n -k8,8 file # -n sorts numerically; omit for lexical sorting
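As a rough illustration of the tab separator in action, here is a small sketch using made-up sample data (the two input lines and the variable names tab and sorted are assumptions, not taken from the question). It also shows one way to obtain a literal tab via printf for shells that may not support $'...':

```shell
# Portable way to get a literal tab character without $'...'.
tab=$(printf '\t')

# Hypothetical tab-separated input; the first field contains a space
# on purpose, to show that only the tab splits fields here.
sorted=$(printf 'b c\t20\na b\t3\n' | sort -t "$tab" -n -k2,2)

printf '%s\n' "$sorted"
```

Because -t restricts the separator to the tab, the spaces inside the first field do not split it, and the line whose second field is 3 sorts before the one whose second field is 20.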
Answer to question B:
Note: This assumes that all input lines have the same number of fields, and that input comes from a file named file:
# Determine the index of the next-to-last column, based on the first
# line, using Awk:
nextToLastColNdx=$(head -n 1 file | awk -F '\t' '{ print NF - 1 }')
# Sort numerically by the next-to-last column (omit -n to sort lexically):
sort -t $'\t' -n -k$nextToLastColNdx,$nextToLastColNdx file
Note: To sort by a single field, always specify it as the end field too (e.g., -k8,8), as above, because sort, given only a start field index (e.g., -k8), sorts from the specified field through the remainder of the line.
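A minimal sketch of that difference, using two made-up, space-separated lines (the sample data and variable names are hypothetical; LC_ALL=C is set only to make the byte-wise comparison predictable):

```shell
# Hypothetical input: both lines have the same second field ("1").
input=$(printf 'a 1 z\nb 1 y\n')

# -k2: the key runs from field 2 to the end of the line, so the third
# field takes part in the comparison.
open_ended=$(printf '%s\n' "$input" | LC_ALL=C sort -k2)

# -k2,2: the key is field 2 only; since the keys are equal, sort falls
# back to comparing the whole line.
single_field=$(printf '%s\n' "$input" | LC_ALL=C sort -k2,2)

printf '%s\n---\n%s\n' "$open_ended" "$single_field"
```

With -k2 the line ending in y comes first, because the third field is part of the key; with -k2,2 the keys tie and the line starting with a comes first.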
Answer to question C:
Note: This assumes that input lines may have a variable number of fields, and that on each line it is that line's second-to-last field that should act as the sort field; input comes from a file named file:
awk '{ printf "%s\t%s\n", $(NF-1), $0 }' file |
sort -n -k1,1 | # omit -n to perform lexical sorting
cut -f2-
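To see the pipeline at work, here is a sketch that feeds it the three sample lines from the question via a variable instead of a file (the variable names input and sorted are just for illustration):

```shell
# The three sample lines from the question (9, 9, and 8 fields).
input='500 east 23rd avenue Toronto 2 890 400000 1
900 west yellovillage blvd Mississauga 3 800 600090 3
500 Jackson Blvd Toronto 3 700 40000 2'

sorted=$(
  printf '%s\n' "$input" |
    awk '{ printf "%s\t%s\n", $(NF-1), $0 }' |  # prepend the sort key plus a tab
    sort -n -k1,1 |                              # sort numerically by that key
    cut -f2-                                     # drop the key again
)

printf '%s\n' "$sorted"
```

The line with 40000 should come out first, followed by the lines with 400000 and 600090, even though the sort field sits at a different index on the shorter line.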
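The awk command extracts each line's second-to-last field and prepends it to the input line on output, separated by a tab. sort then orders the lines by that prepended first field, and cut -f2- removes it again, so that only the original lines are output.
I suggest looking at "man sort".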
You will see how to specify a field separator and how to specify the field index that should be used as a key for sorting.
You can use sort -k 2. For example:
echo -e '000 west \n500 east\n500 east\n900 west' | sort -k 2
The result is:
500 east
500 east
900 west
000 west
You can find more information in the man page of sort. Take a look at the end of the man page: just before the AUTHOR section there is some interesting information :)
Bye