
Automatically find files that start with similar strings (and find these strings) using Python

I have a directory with a number of files in a format similar to this:

"ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat", "DEF_13.dat", "DEF_14.dat", "DEF_16.dat", "GHI_09.dat", "GHI_12-14.dat"

and so on. Essentially, I want to merge all files whose names start with the same prefix. At the moment, I set a variable manually, names = ["ABC", "DEF", "GHI"] , iterate over it ( for name in names ) and collect the matching filenames with glob.glob(name + "*.dat") . The merging step is done later using pandas . I need the prefixes for more than finding the files: they are also used later in my script to name the output files.

Is there a way I can automatically generate the variable names if I know that the files are all in the format name_*.dat ?

Consider this:

from glob import glob
names = {name.rpartition('_')[0] for name in glob('*_*.dat')}

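This collects the unique prefixes, i.e. everything before the last '_' in each file name. If the files are not in the current working directory, include the directory in the pattern passed to glob(). The returned names then contain that directory as well, so strip it (for example with os.path.basename) before taking the prefix.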
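As a minimal sketch of the path handling mentioned above, the one-liner can be wrapped in a small helper. The function name find_prefixes and its directory parameter are made up for illustration and are not part of the original answer; the files are assumed to follow the name_*.dat pattern from the question.

```python
import glob
import os


def find_prefixes(directory="."):
    """Return the sorted, unique prefixes of files named <prefix>_*.dat.

    `directory` is a hypothetical parameter: the folder to search.
    """
    pattern = os.path.join(directory, "*_*.dat")
    # glob returns paths that include `directory`, so reduce each one to its
    # bare file name before cutting at the last underscore.
    return sorted(
        {os.path.basename(path).rpartition("_")[0] for path in glob.glob(pattern)}
    )


# Usage: replaces the hand-written list names = ["ABC", "DEF", "GHI"].
names = find_prefixes(".")
```

The result is sorted only so that the order is reproducible between runs; a plain set works just as well if order does not matter.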

You can do this:

import glob
file_names = sorted(glob.glob("*_*.dat"))
result = [[f for f in file_names if f.startswith(p + "_")] for p in sorted({f.split("_")[0] for f in file_names})]
print(result)

output:

[['ABC_01.dat', 'ABC_02.dat', 'ABC_03-08.dat'], ['DEF_13.dat', 'DEF_14.dat', 'DEF_16.dat'], ['GHI_09.dat', 'GHI_12-14.dat']]

Now all files in result[0] are to be merged, and likewise for result[1] and so on.
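Since the question also needs each prefix later to name the output file, a dictionary keyed by prefix may be more convenient than a list of lists. The following is only a sketch of that idea; group_by_prefix is a hypothetical helper name, and the prefix is assumed to be the text before the first '_'.

```python
from collections import defaultdict


def group_by_prefix(file_names):
    """Map each prefix (text before the first '_') to its sorted file names."""
    groups = defaultdict(list)
    for name in file_names:
        prefix, sep, _rest = name.partition("_")
        if sep:  # ignore names that contain no underscore at all
            groups[prefix].append(name)
    # Return a plain dict with prefixes and file names in sorted order.
    return {prefix: sorted(names) for prefix, names in sorted(groups.items())}


files = [
    "ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat",
    "DEF_13.dat", "DEF_14.dat", "DEF_16.dat",
    "GHI_09.dat", "GHI_12-14.dat",
]
for prefix, names in group_by_prefix(files).items():
    print(prefix, names)
```

Each (prefix, names) pair can then be handed to the existing pandas merging step, with prefix reused for the output file name. This makes a single pass over the file list instead of rescanning it once per prefix.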
