
UNIX shell: sort a string by word length and by ASCII order ignoring case

I would like to sort the words of a string by length and then by ASCII order, ignoring case (upper and lower case compare equal), using a Unix command.

Each word consists only of characters in [a-z], [A-Z] and [0-9].

For example:

"A a b B cc ca cd"
=> A a b B
=> ca cc cd

"Hello stackoverflow how are you today"
=> are how you
=> Hello today
=> stackoverflow

I wrote a (maybe ugly) awk | sort | awk pipeline to do the job. It could be done in a single awk process too, but I was a bit lazy and went for the quick and dirty way.

echo yourStr|awk '{
split($0,o); for(x in o) print length(o[x]),o[x]}'|sort -n|awk '!p{printf $2;p=$1;next}$1==p{printf " "$2}$1!=p{printf "\n"$2;p=$1}' 

Let's take an example:

"Hello stackoverflow how are you today foo bar xoo yoo ooo"

Trying it with the line above:

kent$  echo "Hello stackoverflow how are you today foo bar xoo yoo ooo"|awk '{
split($0,o); for(x in o) print length(o[x]),o[x]}'|sort -n|awk '!p{printf $2;p=$1;next}$1==p{printf " "$2}$1!=p{printf "\n"$2;p=$1}'
are bar foo how ooo xoo yoo you
Hello today
stackoverflow

Testing with your first example:

kent$  echo "A a b B cc ca cd" |awk '{
split($0,o); for(x in o) print length(o[x]),o[x]}'|sort -n|awk '!p{printf $2;p=$1;next}$1==p{printf " "$2}$1!=p{printf "\n"$2;p=$1}' 
a A b B
ca cc cd
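
A note on the pipeline above: `sort -n` only names the length as a key, so the order of words with the same length is left to sort's last-resort whole-line comparison, which follows the current locale (that is probably why `a A b B` comes out here, and it may differ elsewhere). The following is only a sketch of the same awk | sort | awk idea with both keys spelled out; the function name `sort_words` is made up for illustration, and `LC_ALL=C` is my assumption to get the same result on every machine.

```shell
#!/bin/sh
# sort_words: hypothetical helper name, not part of the original answer.
# Prints the words of "$1" grouped by length, one length per output line.
sort_words() {
    printf '%s\n' "$1" |
    tr -s ' ' '\n' |                             # one word per line
    awk 'NF { print length($0), $0 }' |          # prefix each word with its length
    LC_ALL=C sort -t ' ' -k1,1n -k2,2f |         # key 1: length (numeric), key 2: word (case folded)
    awk '
        NR > 1 { printf "%s", ($1 == prev ? " " : "\n") }  # same length: stay on the line
        { printf "%s", $2; prev = $1 }
        END { if (NR) print "" }
    '
}

sort_words "Hello stackoverflow how are you today"
# are how you
# Hello today
# stackoverflow
```

Words that differ only in case compare equal on the second key, so sort falls back to byte order and puts the upper-case one first (`A a B b`). The words are printed with a "%s" format, so a word is never interpreted as a printf format string.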

Here's one way using GNU awk. Run it like this:

awk -f script.awk file

Contents of script.awk:

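BEGIN {
    # Make asorti() compare indices case-insensitively (gawk only).
    IGNORECASE = 1
}

{
    # Count each word, grouped by its length (arrays of arrays need gawk 4.0+).
    for (i = 1; i <= NF; i++) {
        a[length($i)][$i]++
    }
}

END {
    # Collect the distinct lengths as numbers and sort them.
    for (i in a) {
        b[x++] = i + 0
    }
    n = asort(b)

    for (j = 1; j <= n; j++) {
        # Sort the words of this length; c holds them in order.
        m = asorti(a[b[j]], c)

        for (k = 1; k <= m; k++) {
            # Repeat each word as many times as it occurred.
            for (l = 1; l <= a[b[j]][c[k]]; l++) {
                r = (r ? r FS : "") c[k]
            }
            s = (s ? s FS : "") r
            r = ""
        }

        print s
        s = ""
    }
}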

Results using both of your example inputs, concatenated:

A a B b
ca cc cd
are how you
Hello today
stackoverflow

Alternatively, here's the one-liner:

awk '{ for(i=1;i<=NF;i++) a[length($i)][$i]++ } END { for (i in a) b[x++] = i + 0; n = asort(b); for (j=1;j<=n;j++) { m = asorti(a[b[j]],c); for (k=1;k<=m;k++) { for (l=1;l<=a[b[j]][c[k]];l++) r = (r ? r FS : "") c[k]; s = (s ? s FS : "") r; r = "" } print s; s="" } }' IGNORECASE=1 file
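
If GNU awk is not available (arrays of arrays, asort(), asorti() and IGNORECASE are gawk extensions), the same grouping can be sketched in a single POSIX awk process. This is only an illustration under my own assumptions: the function names `sort_words_awk` and `before` are made up, a plain insertion sort stands in for asort()/asorti(), and `LC_ALL=C` is set so that string comparison is by byte value.

```shell
#!/bin/sh
# sort_words_awk: hypothetical helper name; uses only POSIX awk features.
sort_words_awk() {
    printf '%s\n' "$1" | LC_ALL=C awk '
    # True when word a should be printed before word b:
    # shorter first, then case-insensitive order, then byte order.
    function before(a, b,    la, lb) {
        if (length(a) != length(b)) return length(a) < length(b)
        la = tolower(a); lb = tolower(b)
        if (la != lb) return la < lb
        return (a "") < (b "")      # force a string comparison
    }
    {
        # Collect every word of every input line.
        for (i = 1; i <= NF; i++) w[++n] = $i
    }
    END {
        # Insertion sort; good enough for short inputs.
        for (i = 2; i <= n; i++) {
            v = w[i]
            for (j = i - 1; j >= 1 && before(v, w[j]); j--) w[j + 1] = w[j]
            w[j + 1] = v
        }
        # Words of equal length share a line, separated by one space.
        for (i = 1; i <= n; i++) {
            if (i > 1) printf "%s", (length(w[i]) == length(w[i - 1]) ? " " : "\n")
            printf "%s", w[i]
        }
        if (n) print ""
    }'
}

sort_words_awk "A a b B cc ca cd"
# A a B b
# ca cc cd
```

Duplicates are kept because every occurrence is stored separately, which matches what the counting loop in the gawk script does.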
