
Using sort -k on Linux

Our instructor gave us this command:

sort -k 3.7nbr -k 3.1nbr -k 3.4nbr distros.txt > distros-by-date.txt

which is supposed to sort this distros.txt file by date:

Fedora:10:11/25/2008
SUSE:11.0:06/19/2008
Ubuntu:8.04:04/24/2008
Fedora:8:11/08/2007
SUSE:10.3:10/04/2007
Ubuntu:6.10:10/26/2006
Fedora:7:05/31/2007
Ubuntu:7.10:10/18/2007
Ubuntu:7.04:04/19/2007
SUSE:10.1:05/11/2006
Fedora:6:10/24/2006
Fedora:9:05/13/2008
Ubuntu:6.06:06/01/2006
Ubuntu:8.10:10/30/2008
Fedora:5:03/20/2006

Assuming the command works, this is supposed to be a simplified version of the output:

Fedora 10
Ubuntu 8.10
SUSE 11.0
Fedora 9
Ubuntu 8.04
Fedora 8
Ubuntu 7.10
SUSE 10.3
Fedora 7
Ubuntu 7.04

The thing is, it doesn't work, and I have trouble pinpointing what's wrong. I've read about it, but the examples only use n. What about b and r? Sometimes there is a space between -k and the key, sometimes not. Lastly, sometimes there's a dot in the key (3.7) as opposed to a comma (3,7). I tried reading the man page but I just can't wrap my head around it. Can someone please explain?

In case it matters, he sometimes uses a Mac, and that causes problems with the code. Could it be the OS?

You have no field separator specification to tell sort it should be using a colon:

sort -t: -k 3.7nbr -k 3.1nbr -k 3.4nbr

And, to get the simplified output, you need only columns one and two, as per the following transcript:

$ sort -t: -k 3.7nbr -k 3.1nbr -k 3.4nbr inputfile | awk -F: '{print $1" "$2}'
Fedora 10
Ubuntu 8.10
SUSE 11.0
Fedora 9
Ubuntu 8.04
Fedora 8
Ubuntu 7.10
SUSE 10.3
Fedora 7
Ubuntu 7.04
Ubuntu 6.10
Fedora 6
Ubuntu 6.06
SUSE 10.1
Fedora 5
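To see why the missing -t: matters, here is a small sketch on a hypothetical three-line sample (the scratch file sample.txt and the variable names are my own, not from the question). Without -t, sort splits fields on blanks; these lines contain none, so field 3 is empty, every key compares equal, and sort falls back to comparing whole lines. I am assuming GNU sort behaviour here, and LC_ALL=C is only there to make the fallback ordering predictable.

```shell
# Hypothetical scratch data: three lines taken from distros.txt.
tmp=$(mktemp -d)
printf '%s\n' 'Ubuntu:8.04:04/24/2008' 'SUSE:10.1:05/11/2006' 'Fedora:10:11/25/2008' > "$tmp/sample.txt"

# No -t: each line is one blank-delimited field, so field 3 is empty and all
# three keys tie. The result is just the whole-line (alphabetical) order.
without_t=$(LC_ALL=C sort -k 3.7nbr -k 3.1nbr -k 3.4nbr "$tmp/sample.txt")

# With -t: field 3 is the date, so the keys see year, month, and day.
with_t=$(LC_ALL=C sort -t: -k 3.7nbr -k 3.1nbr -k 3.4nbr "$tmp/sample.txt")

printf 'without -t:\n%s\n\nwith -t:\n%s\n' "$without_t" "$with_t"
rm -r "$tmp"
```

On my reading, the first result comes out in plain alphabetical order (Fedora, SUSE, Ubuntu) and only the second one is ordered by date (Fedora, Ubuntu, SUSE).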

In terms of the flags, n means numeric comparison, b means ignore leading blanks (presumably to cover cases like 12/ 4/2022 ), and r means reverse order (latest to earliest).
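A quick way to get a feel for the three flags is to try each one on tiny made-up input. This is only a sketch: the values and variable names below are illustrative, and LC_ALL=C is set so that plain comparison is byte-wise and the decimal point is a dot.

```shell
# n: compare as numbers instead of character by character.
lexical=$(printf '%s\n' 9 10 8.04 | LC_ALL=C sort)       # character order: "1" < "8" < "9"
numeric=$(printf '%s\n' 9 10 8.04 | LC_ALL=C sort -n)    # numeric order, smallest first

# r: reverse the comparison (here, largest number first).
reversed=$(printf '%s\n' 9 10 8.04 | LC_ALL=C sort -nr)

# b: skip leading blanks when locating the start of the key.
# Field 2 of the first line is " b" (with a leading space), of the second "a".
with_blank=$(printf '%s\n' 'x: b' 'y:a' | LC_ALL=C sort -t: -k 2)    # the space counts
skip_blank=$(printf '%s\n' 'x: b' 'y:a' | LC_ALL=C sort -t: -k 2b)   # the space is skipped

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$lexical" "$numeric" "$reversed" "$with_blank" "$skip_blank"
```

Note that n already skips leading whitespace when it parses a number, so in the date command b mostly matters for where the character offsets start counting.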

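You are missing the option -t: which sets the field separator to : . Note that -k 3.4nbr is not redundant: the first two keys only compare year and month, so the day key is what orders releases from the same month, such as 10/26/2006 and 10/24/2006.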

What the man page says about -k :

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end.

What that means:

A key specification (the thing that follows -k ) consists of a field number ( F ), optionally ( [...] ) followed by a period and a character offset ( .C ) and optional option characters ( [OPTS] ), which can be followed by a comma, a second field number, and an optional character offset.

If the character offset is missing, the key starts with the first character in the field.

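The first field number/character offset defines the start of the key. If there is a second F.C , then it defines the end of the key; otherwise, the key extends to the end of the line. That is why 3.7 and 3,7 are different: the dot selects character 7 within field 3, while the comma would make the key run from field 3 through field 7.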
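As a rough illustration of the start and stop positions, the sketch below sorts a hypothetical four-line sample twice: once with open-ended keys as in the question, and once with explicit stops and no space after -k. The scratch file and variable names are assumptions for the example. The open-ended version works because numeric comparison stops reading at the first non-digit (the slash), so the stops only make the intent explicit.

```shell
# Hypothetical scratch data: four lines taken from distros.txt.
tmp=$(mktemp -d)
printf '%s\n' 'Ubuntu:6.10:10/26/2006' 'Fedora:6:10/24/2006' \
              'Fedora:10:11/25/2008' 'SUSE:10.3:10/04/2007' > "$tmp/sample.txt"

# Open-ended keys: each key runs from the given character to the end of the line.
open_ended=$(LC_ALL=C sort -t: -k 3.7nr -k 3.1nr -k 3.4nr "$tmp/sample.txt")

# Explicit stops: year = chars 7-10, month = chars 1-2, day = chars 4-5 of field 3.
# "-k3.7" and "-k 3.7" are the same thing; the space after -k is optional.
bounded=$(LC_ALL=C sort -t: -k3.7nr,3.10 -k3.1nr,3.2 -k3.4nr,3.5 "$tmp/sample.txt")

printf 'open-ended:\n%s\n\nbounded:\n%s\n' "$open_ended" "$bounded"
rm -r "$tmp"
```

Both forms should give the same newest-first order, with 10/26/2006 ahead of 10/24/2006.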
