
Linux: sorting file names by different columns within the name

Can you please help me sort file names by a couple of conditions?

ls -tr | grep ${DATE}* | sort -k1

        dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CDS.sql
        dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CTL.sql
        dboption_02beforetablesize_20200710-092914_A_IS_CRB_CDS.sql
        dboption_02beforetablesize_20200710-092914_A_IS_CRB_CTL.sql
        dboption_03create_table_20200710-092914_A_IS_CRB_CDS.sql
        dboption_03create_table_20200710-092914_A_IS_CRB_CTL.sql
        dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CDS.sh
        dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CTL.sh
        dboption_05drop_table_20200710-092914_A_IS_CRB_CDS.sql
        dboption_05drop_table_20200710-092914_A_IS_CRB_CTL.sql
        dboption_06aftertablesize_20200710-092914_A_IS_CRB_CDS.sql
        dboption_06aftertablesize_20200710-092914_A_IS_CRB_CTL.sql
        dboption_07afterschemasize_20200710-092914_A_IS_CRB_CDS.sql
        dboption_07afterschemasize_20200710-092914_A_IS_CRB_CTL.sql

I want the output sorted by database, then schema, and then file number.

A_IS_CRB is the database, and CTL and CDS are schemas. (Other database names are possible too.)

I want to process all 7 files for one database and one schema, and then proceed with the 7 files of the same database and a different schema, or of a different database with some schema.

I tried a couple of things:

    ls -tr | grep ${DATE}* | sort -k1
    ls -tr | grep ${DATE}* | sort -t $'_'  -k4 -k5 -k2,2
    ls -tr | grep ${DATE}* | grep "  awk -F'[0-9]_' '{print $NF}' |awk -F_ '{print $NF}' |sed 's/.sql//' |sed 's/.sh//' | sed 's/\_$//'| uniq"  (to grep schema)
    

No luck, any help much appreciated. The desired output is:

    dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CDS.sql
    dboption_02beforetablesize_20200710-092914_A_IS_CRB_CDS.sql
    dboption_03create_table_20200710-092914_A_IS_CRB_CDS.sql
    dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CDS.sh
    dboption_05drop_table_20200710-092914_A_IS_CRB_CDS.sql
    dboption_06aftertablesize_20200710-092914_A_IS_CRB_CDS.sql
    dboption_07afterschemasize_20200710-092914_A_IS_CRB_CDS.sql
    dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CTL.sql
    dboption_02beforetablesize_20200710-092914_A_IS_CRB_CTL.sql
    dboption_03create_table_20200710-092914_A_IS_CRB_CTL.sql
    dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CTL.sh
    dboption_05drop_table_20200710-092914_A_IS_CRB_CTL.sql
    dboption_06aftertablesize_20200710-092914_A_IS_CRB_CTL.sql
    dboption_07afterschemasize_20200710-092914_A_IS_CRB_CTL.sql

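I isolated the file names into two arrays, then printed them at the end of processing. The arrays are indexed by a counter so that the input order is kept, because for (i in array) visits elements in an unspecified order.

awk '/_CDS/{cds[++n]=$0} /_CTL/{ctl[++m]=$0} END{for(i=1;i<=n;i++) print cds[i]; for(i=1;i<=m;i++) print ctl[i]}' your_file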

Tell me if this solution is right for you.

If I understand what you want, you want to sort by db-schema so that you can process files 1-7 with the CDS schema and then files 1-7 with the CTL schema. You can do that by using awk's split to isolate the db-schema, printing the entire record followed by the db-schema so that sort can use it as a key, and then using awk again to drop the added sort column, e.g.

awk -F'-' '{split($2,a,"_"); print $0" "substr(a[5],1,3)}' listing | 
sort -k2 | 
awk '{print $1}'

Example Output

With your input in the listing file, you would receive:

dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CDS.sql
dboption_02beforetablesize_20200710-092914_A_IS_CRB_CDS.sql
dboption_03create_table_20200710-092914_A_IS_CRB_CDS.sql
dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CDS.sh
dboption_05drop_table_20200710-092914_A_IS_CRB_CDS.sql
dboption_06aftertablesize_20200710-092914_A_IS_CRB_CDS.sql
dboption_07afterschemasize_20200710-092914_A_IS_CRB_CDS.sql
dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CTL.sql
dboption_02beforetablesize_20200710-092914_A_IS_CRB_CTL.sql
dboption_03create_table_20200710-092914_A_IS_CRB_CTL.sql
dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CTL.sh
dboption_05drop_table_20200710-092914_A_IS_CRB_CTL.sql
dboption_06aftertablesize_20200710-092914_A_IS_CRB_CTL.sql
dboption_07afterschemasize_20200710-092914_A_IS_CRB_CTL.sql

Let me know if you need changes to this output.
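
Since the question mentions that other database names are possible, here is a minimal sketch of the same decorate, sort, undecorate idea that does not hard-code the position or length of the schema. It assumes every name contains a _YYYYMMDD-HHMMSS_ timestamp, that everything between the timestamp and the extension is "database plus schema", and that file names contain no spaces. The function name sort_by_db_schema is made up for illustration.

```shell
# Hypothetical helper: reads file names on stdin, writes them grouped by
# the text between the timestamp and the extension, then by file name.
sort_by_db_schema() {
    awk '{
        key = $0
        if (match($0, /_[0-9]+-[0-9]+_/))        # locate "_YYYYMMDD-HHMMSS_"
            key = substr($0, RSTART + RLENGTH)   # keep the text after it
        sub(/\.[^.]*$/, "", key)                 # drop ".sql" or ".sh"
        print key " " $0                         # decorate: key, then name
    }' |
    LC_ALL=C sort -k1,1 -k2,2 |                  # key first, then file name
    cut -d" " -f2-                               # undecorate: name only
}

printf '%s\n' \
    'dboption_02beforetablesize_20200710-092914_A_IS_CRB_CTL.sql' \
    'dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CTL.sql' \
    'dboption_02beforetablesize_20200710-092914_A_IS_CRB_CDS.sql' \
    'dboption_01beforeschemasize_20200710-092914_A_IS_CRB_CDS.sql' |
    sort_by_db_schema
```

The key is compared as one string, so this groups by database and schema together rather than parsing them apart. Because the extension is stripped from the key, a .sh file stays in file-number order within its group.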


Edit - Update Per-Format Change Delimiting dbname and schema with '-'

In the comments you advised that the db-name and db-schema are not fixed as a db-name with two '_' delimiters and a db-schema with none. That makes it ambiguous what constitutes the db-name and what constitutes the db-schema: there is no way to know whether you have a 3-part name (two '_') and a 2-part schema (one '_'), a 4-part name (three '_') and a 1-part schema (no '_'), or any of the other 6 or 7 combinations between 3-5 part names and 1-3 part schemas.

Adding '-' as the delimiter between the db-name and db-schema now provides an unambiguous way to isolate the db-schema from the filename, regardless of the number of parts separated by '_' in the db-name and db-schema. You can use '-' as the field separator for awk, and then $NF is the last field (db-schema plus extension). Then, using substr($NF, 1, match($NF, /[.]/) - 1), you can isolate the db-schema alone.

awk -F'-' '{ print $0" "substr($NF,1,match($NF,/[.]/)-1) }' listing | 
sort -k2 | 
awk '{print $1}'

Short Example Input

$ cat listing
dboption_01beforeschemasize_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_01beforeschemasize_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql
dboption_02beforetablesize_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_02beforetablesize_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql
dboption_03create_table_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_03create_table_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql

Example Use/Output

$ awk -F'-' '{ print $0" "substr($NF,1,match($NF,/[.]/)-1) }' listing |
> sort -k2 |
> awk '{print $1}'
dboption_01beforeschemasize_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_02beforetablesize_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_03create_table_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql
dboption_01beforeschemasize_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql
dboption_02beforetablesize_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql
dboption_03create_table_20200710-092914_A_FOO_IS_CRB-PDO_CTS.sql
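
If it helps to see what the sort key looks like before it is appended, the extraction can be run on its own. This is only a sketch of the substr/match step from the first awk stage; the function name schema_key is made up for illustration.

```shell
# Hypothetical helper: print only the key that the first awk stage appends,
# i.e. the last '-'-separated field up to (not including) its first '.'.
# If the last field has no '.', match() returns 0 and the key is empty.
schema_key() {
    awk -F'-' '{ print substr($NF, 1, match($NF, /[.]/) - 1) }'
}

printf '%s\n' \
    'dboption_01beforeschemasize_20200710-092914_A_FOO_IS_CRB-PDO_CDS.sql' |
    schema_key
# prints: PDO_CDS
```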

If you want to maintain the extension as part of the db-schema (I see you have both .sql and .sh extensions), then simply use the following as the first awk command:

awk -F'-' '{ print $0" "$NF }' listing

Give it a go with your updated names and let me know if there are any hiccups.


Additional Sort By DBSchema, then DBName, then FileNo.

In order to sort by all the additional parameters you list, you will need to change the primary field delimiter to something that allows each of the fields to be separated and the information needed to sort extracted from the fields. A good choice would simply be to use '-' to separate the fields as:

Option  fileno_stuff  date  time  dbname  dbschema

That would correspond to an example record of, eg:

dboption-03create_table-20200710-092914-FOO_PDA-BAR_CDS.sql

If you make those changes to your listing, you can then append three columns to each record (e.g. fileno, dbname, dbschema), which lets you sort with sort -k4 -k3 -k2n. To append the fields and sort the new data, you could do:

awk -F'-' '{print $0" "substr($2,1,match($2,/[^0-9]+/)-1)+0" "$(NF-1)" "substr($NF,1,match($NF,/[.]/)-1)}' listing | 
sort -k4 -k3 -k2n | 
awk '{print $1}'

Example Input Listing

dboption-01beforeschemasize-20200710-092914_A_IS_CRB-CDS.sql
dboption-01beforeschemasize-20200710-092914_A_IS_CRB-CDT.sql
dboption-01beforeschemasize-20200710-092914_PDA-CDS.sql
dboption-02beforetablesize-20200710-092914_A_IS_CRB-CDS.sql
dboption-02beforetablesize-20200710-092914_A_IS_CRB-CDT.sql
dboption-02beforetablesize-20200710-092914_PDA-CDS.sql
dboption-03create_table-20200710-092914_A_IS_CRB-CDS.sql
dboption-03create_table-20200710-092914_A_IS_CRB-CDT.sql
dboption-03create_table-20200710-092914_PDA-CDS.sql

Sorted Output

dboption-01beforeschemasize-20200710-092914_A_IS_CRB-CDS.sql
dboption-02beforetablesize-20200710-092914_A_IS_CRB-CDS.sql
dboption-03create_table-20200710-092914_A_IS_CRB-CDS.sql
dboption-01beforeschemasize-20200710-092914_PDA-CDS.sql
dboption-02beforetablesize-20200710-092914_PDA-CDS.sql
dboption-03create_table-20200710-092914_PDA-CDS.sql
dboption-01beforeschemasize-20200710-092914_A_IS_CRB-CDT.sql
dboption-02beforetablesize-20200710-092914_A_IS_CRB-CDT.sql
dboption-03create_table-20200710-092914_A_IS_CRB-CDT.sql
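
For reference, the same three-key sort can be written with each key limited to a single column (-k4,4 -k3,3 -k2,2n), which makes it easier to see which column drives which level of the ordering. This is a sketch under the assumption that names follow the fully '-'-separated layout described above (option, fileno_stuff, date, time, dbname, dbschema) and contain no spaces. The function name and the sample names are made up for illustration.

```shell
# Hypothetical helper: sort by schema, then database, then file number.
sort_schema_db_fileno() {
    awk -F'-' '{
        # leading digits of field 2, converted to a number ("03..." -> 3)
        fileno = substr($2, 1, match($2, /[^0-9]/) - 1) + 0
        schema = $NF
        sub(/[.][^.]*$/, "", schema)     # strip the extension
        print $0, fileno, $(NF-1), schema
    }' |
    LC_ALL=C sort -k4,4 -k3,3 -k2,2n |   # schema, database, numeric fileno
    awk '{ print $1 }'                   # keep only the original name
}

printf '%s\n' \
    'dboption-02beforetablesize-20200710-092914-A_IS_CRB-CDT.sql' \
    'dboption-01beforeschemasize-20200710-092914-PDA-CDS.sql' \
    'dboption-02beforetablesize-20200710-092914-A_IS_CRB-CDS.sql' \
    'dboption-01beforeschemasize-20200710-092914-A_IS_CRB-CDS.sql' |
    sort_schema_db_fileno
```

With this sample the two A_IS_CRB CDS files come first (01, then 02), followed by the PDA CDS file and then the CDT file.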

When you have a reformatted listing, give it a try and let me know of any issues.

Look at this:

 sort -t '-' -k2  list

It may be a good solution for you. Tell me if it's what you want.
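
One thing worth checking before relying on this one-liner with the original underscore-separated names: the key that starts at the second '-'-separated field runs to the end of the line, so it includes the extension. A quick sketch of that check follows; the function name is made up for illustration, and LC_ALL=C is set only to make the comparison byte-wise and reproducible.

```shell
# Hypothetical wrapper around the one-liner above, reading names on stdin.
sort_by_dash_field2() {
    LC_ALL=C sort -t '-' -k2
}

printf '%s\n' \
    'dboption_03create_table_20200710-092914_A_IS_CRB_CDS.sql' \
    'dboption_04Export_DDL_AFTER_CHANGE_20200710-092914_A_IS_CRB_CDS.sh' \
    'dboption_05drop_table_20200710-092914_A_IS_CRB_CDS.sql' |
    sort_by_dash_field2
```

Here the 04 .sh file is listed before the 03 and 05 .sql files, because "CDS.sh" compares lower than "CDS.sql". The schemas are still grouped together, but the file-number order inside a group is only kept among files that share an extension.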

