How to sort the contents of a text file using a shell script

Question

I am new to shell scripting. I would like to know how to sort the contents of a file using a shell script.

Here is an example:

fap0089-josh.baker
fap00233-adrian.edwards
fap00293-bob.boyle
fap00293-bob.jones
fap002-brian.lopez
fap00293-colby.morris
fap00293-cole.mitchell
psf0354-SKOWALSKI
psf0354-SLEE
psf0382-SLOWE
psf0391-SNOMURA
psf0354-SPATEL
psf0364-SRICHARDS
psf0354-SSEIBERT
psf0354-SSIRAH
bsi0004-STRAN
bsi0894-STURBIC
unit054-SUNDERWOOD

Considering the data above (this is a small sample; I have more than 5.5 records), I would like to sort it like this:

Number of entries starting with fap, psf, bsi, unit, etc.
The total number of environments for each type, i.e. each number after the prefix (0004, 0382, 054, etc.) is an environment. E.g. psf has 4 unique environments.
The sum total

Answer 1

Here's a Schwartzian transform to sort by 1) leading letters, then 2) digits:

sed -r 's/^([[:alpha:]]+)([[:digit:]]+)/\1 \2 /' filename | 
sort -t ' ' -k 1,1 -k 2,2n | 
sed 's/ //; s/ //'

Output:

bsi0004-STRAN
bsi0894-STURBIC
fap002-brian.lopez
fap0089-josh.baker
fap00233-adrian.edwards
fap00293-bob.boyle
fap00293-bob.jones
fap00293-colby.morris
fap00293-cole.mitchell
psf0354-SKOWALSKI
psf0354-SLEE
psf0354-SPATEL
psf0354-SSEIBERT
psf0354-SSIRAH
psf0364-SRICHARDS
psf0382-SLOWE
psf0391-SNOMURA
unit054-SUNDERWOOD
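
To make the three stages of the pipeline above easier to follow, here is a small sketch that runs the decorate step on its own and then the full decorate, sort, undecorate sequence. The three input lines come from the question and are fed through a here-document instead of `filename`; the `decorate` function name is made up for illustration. It uses `sed -E`, which current GNU and BSD sed accept in place of `-r`.

```shell
# Decorate: split the leading letters and the digits into their own
# space-separated fields so that sort can address them as keys 1 and 2.
decorate() {
    sed -E 's/^([[:alpha:]]+)([[:digit:]]+)/\1 \2 /'
}

decorated=$(decorate <<'EOF'
fap0089-josh.baker
fap002-brian.lopez
bsi0004-STRAN
EOF
)
# Lines now look like "fap 0089 -josh.baker"
printf '%s\n' "$decorated"

# Sort on field 1 as text, then field 2 as a number,
# then undecorate by removing the two spaces that were added.
sorted=$(printf '%s\n' "$decorated" |
    sort -t ' ' -k 1,1 -k 2,2n |
    sed 's/ //; s/ //')
printf '%s\n' "$sorted"
```

The numeric key is what puts fap002 ahead of fap0089: compared as text, "0089" would sort before "002".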

To generate the metrics you mention, I'd use perl:

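perl -nE '
    # capture the leading letters ($1) and the digits that follow ($2)
    /^([[:alpha:]]+)(\d+)/ or next;
    $count{$1}++;        # entries per prefix
    $nenv{$1}{$2} = 1;   # distinct environments per prefix
    $total += $2;        # sum of the environment numbers
    END {
        say "Counts:";
        say "$_ => $count{$_}" for sort keys %count;
        say "Number of environments";
        say "$_ => ", scalar keys %{$nenv{$_}} for sort keys %nenv;
        say "Total = $total";
    }
' filename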

Counts:
bsi => 2
fap => 7
psf => 8
unit => 1
Number of environments
bsi => 2
fap => 4
psf => 4
unit => 1
Total = 5355

Without perl, it's less efficient because you have to read the file multiple times.

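echo Counts:
# strip everything from the first digit on, leaving only the prefix
sed 's/[0-9].*//' filename | sort | uniq -c
echo Number of environments:
# reduce each line to "prefix number", de-duplicate, then count per prefix
sed -r 's/^([a-z]+)([0-9]*).*/\1 \2/' filename | sort -u | cut -d" " -f1 | uniq -c
echo Total:
# drop the prefix and leading zeros (printf would read them as octal), then sum with bc
{ printf "%d+" $(sed -r 's/^[a-z0]+([0-9]*).*/\1/' filename); echo 0; } | bc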

Counts:
      2 bsi
      7 fap
      8 psf
      1 unit
Number of environments:
      2 bsi
      4 fap
      4 psf
      1 unit
Total:
5355
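
If perl is not available but reading the file only once still matters, awk is one possible middle ground. The following is a sketch and not part of the original answer: the `summarize` function name and the output format are made up for illustration, and the sample data from the question is supplied through a here-document so the block runs on its own.

```shell
# Single pass over the input: count entries and distinct environments
# per prefix, and sum the environment numbers.
summarize() {
    awk '
        match($0, /^[A-Za-z]+/) {
            type = substr($0, 1, RLENGTH)
            rest = substr($0, RLENGTH + 1)
            if (!match(rest, /^[0-9]+/)) next
            env = substr(rest, 1, RLENGTH)
            count[type]++
            if (!((type, env) in seen)) { seen[type, env] = 1; nenv[type]++ }
            total += env    # numeric context: "0089" is added as 89
        }
        END {
            # for-in order is unspecified in awk, so sort the keys by hand.
            n = 0
            for (t in count) keys[++n] = t
            for (i = 2; i <= n; i++) {
                k = keys[i]
                for (j = i - 1; j >= 1 && keys[j] > k; j--) keys[j + 1] = keys[j]
                keys[j + 1] = k
            }
            for (i = 1; i <= n; i++)
                printf "%s entries=%d environments=%d\n", keys[i], count[keys[i]], nenv[keys[i]]
            printf "Total = %d\n", total
        }
    ' "$@"
}

summary=$(summarize <<'EOF'
fap0089-josh.baker
fap00233-adrian.edwards
fap00293-bob.boyle
fap00293-bob.jones
fap002-brian.lopez
fap00293-colby.morris
fap00293-cole.mitchell
psf0354-SKOWALSKI
psf0354-SLEE
psf0382-SLOWE
psf0391-SNOMURA
psf0354-SPATEL
psf0364-SRICHARDS
psf0354-SSEIBERT
psf0354-SSIRAH
bsi0004-STRAN
bsi0894-STURBIC
unit054-SUNDERWOOD
EOF
)
printf '%s\n' "$summary"
```

To run it against a real file, call `summarize filename` instead of using the here-document. On the sample data it should report the same counts and total as the perl version.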

Answer 1
3 2014-12-02 15:39:32
