
Get a sorted file list on a directory by using Linux bash script language

I need to list the files in a specified directory and sort them by the date and time embedded in each file name, not by the file creation or modification date/time. My files have the following format.

Before sort:

RATMS-RNSCA_npp_d20131208_t0408392_e0417432_b00001_c20131223024038606000_all-_dev.h5
RATMS-RNSCA_npp_d20131208_t0547506_e0557586_b00001_c20131223021256522000_all-_dev.h5
RCRIS-RNSCA_npp_d20131208_t0408392_e0417432_b00001_c20131223024038506000_all-_dev.h5
RCRIS-RNSCA_npp_d20131208_t0548226_e0557586_b00001_c20131223021256270000_all-_dev.h5
RNSCA-ROLPS_npp_d20131208_t0408334_e0417550_b00001_c20131223024038619000_all-_dev.h5
RNSCA-ROLPS_npp_d20131208_t0548233_e0558223_b00001_c20131223021256591000_all-_dev.h5
RNSCA-RONPS_npp_d20131208_t0408543_e0417005_b00001_c20131223024038636000_all-_dev.h5
RNSCA-RONPS_npp_d20131208_t0548391_e0558002_b00001_c20131223021256616000_all-_dev.h5
RNSCA-ROTCS_npp_d20131208_t0408168_e0417380_b00001_c20131223024038627000_all-_dev.h5
RNSCA-ROTCS_npp_d20131208_t0548017_e0558002_b00001_c20131223021256603000_all-_dev.h5
RNSCA-RVIRS_npp_d20131208_t0407405_e0417380_b00001_c20131223024038167000_all-_dev.h5
RNSCA-RVIRS_npp_d20131208_t0547150_e0558377_b00001_c20131223021256099000_all-_dev.h5

After sort (my expected results):

RATMS-RNSCA_npp_d20131208_t0408392_e0417432_b00001_c20131223024038606000_all-_dev.h5
RCRIS-RNSCA_npp_d20131208_t0408392_e0417432_b00001_c20131223024038506000_all-_dev.h5
RNSCA-ROLPS_npp_d20131208_t0408334_e0417550_b00001_c20131223024038619000_all-_dev.h5
RNSCA-RONPS_npp_d20131208_t0408543_e0417005_b00001_c20131223024038636000_all-_dev.h5
RNSCA-ROTCS_npp_d20131208_t0408168_e0417380_b00001_c20131223024038627000_all-_dev.h5
RNSCA-RVIRS_npp_d20131208_t0407405_e0417380_b00001_c20131223024038167000_all-_dev.h5

RATMS-RNSCA_npp_d20131208_t0547506_e0557586_b00001_c20131223021256522000_all-_dev.h5
RCRIS-RNSCA_npp_d20131208_t0548226_e0557586_b00001_c20131223021256270000_all-_dev.h5
RNSCA-ROLPS_npp_d20131208_t0548233_e0558223_b00001_c20131223021256591000_all-_dev.h5
RNSCA-RONPS_npp_d20131208_t0548391_e0558002_b00001_c20131223021256616000_all-_dev.h5
RNSCA-ROTCS_npp_d20131208_t0548017_e0558002_b00001_c20131223021256603000_all-_dev.h5
RNSCA-RVIRS_npp_d20131208_t0547150_e0558377_b00001_c20131223021256099000_all-_dev.h5

Please pay attention to the 3rd (dYYYYMMDD) and 4th (thhmmssS) fields in each file name above. The prefix letter 'd' means date, and the prefix 't' means time.

NOTE: 'YYYYMMDD' represents the date of the start of the swath (YYYY: 4-digit year; MM: month; DD: day of month). The first and the second 'hhmmssS' represent the start and end of the swath, respectively (hh: hour; mm: minutes; ss: seconds; S: tenths of a second).

I think my needs can be met by sorting the file list on the "YYYYMMDD_thh" combination. How can I do that with a Linux bash script?

Thank you.

GoldenLee

You will love bash. With the sort command alone, you can pass a "field delimiter" and select the fields on which to sort.

From the man page:

-t, --field-separator=SEP
    use SEP instead of non-blank to blank transition

-k, --key=KEYDEF
    sort via a key; KEYDEF gives location and type

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end. If neither -t nor -b is in effect, characters in a field are counted from the beginning of the preceding whitespace. OPTS is one or more single-letter ordering options [bdfgiMhnRrV], which override global ordering options for that key. If no key is given, use the entire line as the key.
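As a small illustration of the F.C part of KEYDEF, the sketch below sorts three made-up lines (they are not from the question) on characters 2 to 3 of the second underscore-separated field. The sample records and the `LC_ALL=C` setting are assumptions, chosen only to keep the comparison byte-wise and independent of the locale.

```bash
# Three hypothetical records; field 2 is a letter followed by two digits.
# -t '_' splits each line on underscores.
# -k 2.2,2.3 compares only characters 2-3 of field 2 (the digits),
# so the leading letter of that field does not affect the order.
sorted=$(printf '%s\n' a_z10_end b_y30_end c_x20_end |
    LC_ALL=C sort -t '_' -k 2.2,2.3)
printf '%s\n' "$sorted"
```

Sorting on the whole second field (`-k 2,2`) would order these lines by the letter instead.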

The needed parameters would become:

sort -t '_' -k 3,4 your_data_file

So we've divided your data into fields on underscores and sorted on a key that runs from the third field (the date) through the fourth (the time). Because your date and time format runs from the most significant part to the least significant, plain alphabetic sorting works out.
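One caveat: the key `-k 3,4` compares the full start time, so a file starting at t0407405 would sort ahead of one starting at t0408392, whereas the expected listing in the question groups files by date and hour only. A minimal sketch of that variant follows, assuming the files end in `.h5`, follow the underscore-separated layout shown in the question, and have no newlines in their names. The function name `sort_by_date_hour` is hypothetical, and `LC_ALL=C` is an assumption to get plain byte-wise ordering.

```bash
# Hypothetical helper: print the .h5 file names in a directory, ordered by
# the date (field 3) and the hour (characters 2-3 of field 4) in each name.
sort_by_date_hour() {
    local dir=${1:-.} path
    for path in "$dir"/*.h5; do
        [ -e "$path" ] || continue     # skip the literal pattern if nothing matched
        printf '%s\n' "${path##*/}"    # strip the directory prefix
    done | LC_ALL=C sort -t '_' -k 3,3 -k 4.2,4.3
    # Names with the same date and hour fall back to sort's last-resort
    # comparison of the whole line, which orders them by name.
}
```

It could be called as `sort_by_date_hour /path/to/data`, where the path is a placeholder. The output is a flat list; the blank line between hour groups in the expected listing is not produced.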

