
Bash Sorting a List by Columns

I want to sort a list in reverse (descending) order by column 2. When several lines share the same $2 value, those lines should also be sorted in reverse order by column 1. The list is currently stored in a variable in a bash script. Is there a sed or awk function I can use for this?

My output right now, for example, is:

123, 3
124, 3
12345, 2
898, 1
1010, 1

What I want is:

124, 3
123, 3
12345, 2
1010, 1
898, 1

It's not a trivial awk script, but it's not hard either. You use an array a[] to store the first-field values of the lines that share the same second-field value. If last is set (i.e., this is not the first record) and the second field changes, you output the current array and reset it (that is Rule 1). Note that this relies on the input already being grouped by the second field in the order you want, as it is in your sample.

In Rule 2, you scan through the existing array and insert the current first field so that the array stays in descending order. You keep the last value of the second field so you know when it changes. You use the END rule to output the final set of values, e.g.

awk -F, '
    last && $2 != last {
        for (i=1; i<=n; i++)
            print a[i]", "last;
        delete a
        n = 0
    }
    {
        swapped=0
        for (i=1; i<=n; i++)
            if ($1 > a[i]) {
                swapped=1
                for (j=n+1; j>i; j--)
                    a[j]=a[j-1]
                a[i]=$1
                break    # insert once, then stop scanning
            }
        if (!swapped)
            a[++n]=$1
        else
            n++
        last=$2
    }
END {
    for (i=1; i<=n; i++)
        print a[i]", "last
    }
' file

The swapped flag tells you whether the current first field was inserted into the array before an existing element (swapped == 1) or was simply appended at the end (swapped == 0).
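To see the insertion step on its own, here is a minimal sketch that keeps an array in descending order as values arrive. The function name insert_desc and the variables v and pos are made up for illustration and are not part of the script above; it tracks an insert position instead of a swapped flag, which is just one way to write the same idea.

```shell
# Hypothetical helper: read one number per line, keep them in a
# descending array by inserting each value at its position.
insert_desc() {
    awk '
    {
        v = $1 + 0                  # force a numeric comparison
        pos = n + 1                 # default: append at the end
        for (i = 1; i <= n; i++)
            if (v > a[i]) {         # first element smaller than v
                pos = i
                break               # stop at the first match
            }
        for (j = n + 1; j > pos; j--)
            a[j] = a[j-1]           # shift the tail right by one slot
        a[pos] = v
        n++
    }
    END {
        for (i = 1; i <= n; i++)
            print a[i]
    }'
}

result=$(printf '%s\n' 123 125 124 | insert_desc)
printf '%s\n' "$result"
```

With three values in the same group the scan has to stop after the first insertion, otherwise the new value would be inserted again further down the array.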

Example Use/Output

With your sample data in a file named file, you can change to the directory that contains it, select the script above with the mouse (changing the filename to yours), and middle-mouse-paste the script into the terminal, e.g.

$ awk -F, '
>     last && $2 != last {
>         for (i=1; i<=n; i++)
>             print a[i]", "last;
>         delete a
>         n = 0
>     }
>     {
>         swapped=0
>         for (i=1; i<=n; i++)
>             if ($1 > a[i]) {
>                 swapped=1
>                 for (j=n+1; j>i; j--)
>                     a[j]=a[j-1]
>                 a[i]=$1
>                 break    # insert once, then stop scanning
>             }
>         if (!swapped)
>             a[++n]=$1
>         else
>             n++
>         last=$2
>     }
> END {
>     for (i=1; i<=n; i++)
>         print a[i]", "last
>     }
> ' file
124,  3
123,  3
12345,  2
1010,  1
898,  1

Look things over and let me know if you have questions.

Use a combination of Perl one-liners and sort. The one-liners convert the ", " delimiter into a tab (and back). sort uses the r option for reverse order and the g option for general numeric sort. The -kN,N option restricts a sort key to field N: here the 2nd field first, then the 1st field.

perl -pe 's/, /\t/' in_file | sort -k2,2gr -k1,1gr | perl -pe 's/\t/, /' > out_file

For example:

Create example input file:

cat > foo <<EOF
123, 3
124, 3
12345, 2
898, 1
1010, 1
EOF

Run the command:

cat foo | perl -pe 's/, /\t/' | sort -k2,2gr -k1,1gr | perl -pe 's/\t/, /' 

Output:

124, 3
123, 3
12345, 2
1010, 1
898, 1

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.

SEE ALSO:
perldoc perlrun : how to execute the Perl interpreter: command line switches
perldoc perlrequick : Perl regular expressions quick start
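The Perl round trip is optional if sort is told to split on the comma itself. The sketch below assumes the list lives in a shell variable, here called list (the name is an assumption, since the question does not give one), and uses the n key option, which skips the leading blank in the second field.

```shell
# "list" stands in for the variable mentioned in the question (name assumed).
list='123, 3
124, 3
12345, 2
898, 1
1010, 1'

# -t,      : split fields on the comma
# -k2,2nr  : primary key is field 2, numeric, reversed
# -k1,1nr  : ties are broken on field 1, numeric, reversed
sorted=$(printf '%s\n' "$list" | sort -t, -k2,2nr -k1,1nr)
printf '%s\n' "$sorted"
```

Because the data is piped in with printf, nothing has to be written to a temporary file, and the original ", " delimiter is left untouched in the output.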

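Also with awk, you can try this. Note that it simply swaps lines 1 and 2 and lines 4 and 5 in place, so it only works for input laid out exactly like the sample: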

awk 'BEGIN{RS=""; OFS=FS="\n"} {tmp2 = $2; $2 = $1; $1 = tmp2; tmp5=$5; $5=$4; $4=tmp5}1' file
124, 3
123, 3
12345, 2
1010, 1
898, 1

