
Bash sort -n option explanation requested

I have a number of IP addresses that are extracted and then sorted for further display. The relevant code is:

| cut -w -f11 | sort -t. -k1,4  -u -n
69.156.151.245
99.226.129.44
108.170.136.226
142.126.92.197

However, this excludes a known address which shows up if the -n option is dropped:

| cut -w -f11 | sort -t. -k1,4 -u
108.170.136.226
142.126.92.197
69.156.151.245
69.156.7.43
99.226.129.44
99.255.53.67

And it reappears if the -u option is dropped instead:

| cut -w -f11 | sort -t. -k1,4 -n   
69.156.151.245
69.156.7.43
69.156.7.43
69.156.7.43
99.226.129.44
99.255.53.67
99.255.53.67
108.170.136.226
142.126.92.197

My question is: why does -n, when combined with -u, remove 69.156.7.43 from the output? I can guess that it has something to do with 69.156.151.245, but what?

The answer given below produces this:

cut -w -f11 | sort -t. -k1,4  -u -V
69.156.7.43
69.156.151.245
99.226.129.44
99.255.53.67
108.170.136.226
142.126.92.197
216.185.71.41

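This is an interesting edge case. The -n option treats each key as a single number, and a number has at most one decimal point, so parsing stops at the second dot. 69.156.151.245 and 69.156.7.43 are therefore both read as 69.156 and compare equal, and -u keeps only one line from each run of equal keys. A workaround is to use version sort instead: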

... | sort -V -u 
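A minimal reproduction of the collapse, as a sketch: the two addresses come from the question, while the variable names and the use of LC_ALL=C (to pin '.' as the decimal point regardless of locale) are assumptions added for illustration.

```shell
# Two distinct addresses that share the same first two octets.
ips='69.156.151.245
69.156.7.43'

# LC_ALL=C pins '.' as the decimal point so the result does not depend on locale.
# With -n, each line is read as the number 69.156; the rest of the line is ignored.
numeric_unique=$(printf '%s\n' "$ips" | LC_ALL=C sort -t. -k1,4 -n -u)

# Without -n the lines are compared as text, so both survive -u.
text_unique=$(printf '%s\n' "$ips" | LC_ALL=C sort -t. -k1,4 -u)

printf 'with -n -u: %s line(s)\n' "$(printf '%s\n' "$numeric_unique" | wc -l | tr -d ' ')"
printf 'with -u:    %s line(s)\n' "$(printf '%s\n' "$text_unique" | wc -l | tr -d ' ')"
```

The first count should be 1 and the second 2, which matches the behaviour described in the question.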

Yeah, that's a bit of a doozy. Assuming the input is the last snippet from the question (with AA, BB, CC, DD, etc. appended to each line to show what's going on), we see this output:

| sort --debug -t. -k1,4 -n -u
Memory to be used for sorting: 4294967296
Number of CPUs: 4
Using collate rules of C locale
Byte sort is used
Positive sign: <+>
Negative sign: <->
sort_method=mergesort
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 FF>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 FF>; cmp1=0
; k1=<99.255.53.67 FF>, k2=<99.255.53.67 GG>; s1=<99.255.53.67 FF>, s2=<99.255.53.67 GG>; cmp1=0
; k1=<99.255.53.67 GG>, k2=<108.170.136.226 HH>; s1=<99.255.53.67 GG>, s2=<108.170.136.226 HH>; cmp1=-1
; k1=<108.170.136.226 HH>, k2=<142.126.92.197 II>; s1=<108.170.136.226 HH>, s2=<142.126.92.197 II>; cmp1=-1
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 BB>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.7.43 CC>, k2=<69.156.7.43 DD>; s1=<69.156.7.43 CC>, s2=<69.156.7.43 DD>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 CC>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 CC>; cmp1=0
; k1=<69.156.7.43 CC>, k2=<69.156.7.43 BB>; s1=<69.156.7.43 CC>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<99.226.129.44 EE>; s1=<69.156.151.245 AAA>, s2=<99.226.129.44 EE>; cmp1=-1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 BB>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 BB>; cmp1=1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 CC>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 CC>; cmp1=1
; k1=<99.226.129.44 EE>, k2=<69.156.7.43 DD>; s1=<99.226.129.44 EE>, s2=<69.156.7.43 DD>; cmp1=1
69.156.151.245 AAA
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 BB>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 BB>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 CC>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 CC>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<69.156.7.43 DD>; s1=<69.156.151.245 AAA>, s2=<69.156.7.43 DD>; cmp1=0
; k1=<69.156.151.245 AAA>, k2=<99.226.129.44 EE>; s1=<69.156.151.245 AAA>, s2=<99.226.129.44 EE>; cmp1=-1
99.226.129.44 EE
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 FF>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 FF>; cmp1=0
; k1=<99.226.129.44 EE>, k2=<99.255.53.67 GG>; s1=<99.226.129.44 EE>, s2=<99.255.53.67 GG>; cmp1=0
; k1=<99.226.129.44 EE>, k2=<108.170.136.226 HH>; s1=<99.226.129.44 EE>, s2=<108.170.136.226 HH>; cmp1=-1
108.170.136.226 HH
; k1=<108.170.136.226 HH>, k2=<142.126.92.197 II>; s1=<108.170.136.226 HH>, s2=<142.126.92.197 II>; cmp1=-1
142.126.92.197 II

Even if I drop the key definition, it still sorts oddly.

You can, however, fix this (assuming you are after the minimal set of IPs ordered numerically) by splitting the work into two passes: deduplicate first, then sort with -n:

| sort -u  | sort -n
69.156.151.245
69.156.7.43
99.226.129.44
99.255.53.67
108.170.136.226
142.126.92.197
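Note that in the two-pass output 69.156.151.245 still precedes 69.156.7.43: -n sees both as 69.156, and the tie is broken by a plain byte comparison of the whole line. If strict per-octet ordering matters, one alternative (not from the answers above, so treat it as a sketch) is to give sort a separate numeric key for each octet. The variable names are illustrative.

```shell
# Sample input taken from the question, duplicates included.
ips='69.156.151.245
69.156.7.43
69.156.7.43
69.156.7.43
99.226.129.44
99.255.53.67
99.255.53.67
108.170.136.226
142.126.92.197'

# One key per octet, each compared numerically (the trailing n).
# -u then drops a line only when all four octets match.
sorted=$(printf '%s\n' "$ips" | LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n -k4,4n -u)
printf '%s\n' "$sorted"
```

Because every key covers a single field, each comparison sees a plain integer, so 7 sorts before 151 and -u only removes true duplicates.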
