Automatically find files that start with similar strings (and find these strings) using Python

Question

I have a directory with a number of files in a format similar to this:

"ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat", "DEF_13.dat", "DEF_14.dat", "DEF_16.dat", "GHI_09.dat", "GHI_12-14.dat"

etc.; you get the idea. I want to merge all files whose names start with the same prefix. At the moment, I do this by manually setting a variable names = ["ABC", "DEF", "GHI"], iterating over it (for name in names), and getting the respective filenames with glob.glob(name + "*.dat"). The merging step is done later using pandas. I need the prefixes not only to find the files; they are also used later in my script to set the output files' names.

Is there a way to generate the list names automatically, given that the files are all in the format name_*.dat?

Answer 1

Consider this:

from glob import glob
names = {name.rpartition('_')[0] for name in glob('*_*.dat')}  # text before the last '_'

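This collects the unique prefixes, i.e. everything before the last '_' in each filename. If the files are not in the current working directory, include the directory in the pattern passed to glob(); the matched names will then contain that directory as well.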
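To illustrate the point about paths, here is a minimal sketch, assuming the files live in some directory other than the current one. The helper names prefixes_from_paths and unique_prefixes and the data_dir parameter are hypothetical, not part of the original answer. Since glob returns paths that include the directory, the directory part is stripped before splitting at the last '_'.

```python
import os
from glob import glob


def prefixes_from_paths(paths):
    """Return the sorted unique prefixes (text before the last '_') of the given paths."""
    # Drop any directory component so only the bare filename is split.
    names = (os.path.basename(p) for p in paths)
    return sorted({n.rpartition('_')[0] for n in names if '_' in n})


def unique_prefixes(data_dir="."):
    """Collect prefixes of all name_*.dat files in data_dir (hypothetical helper)."""
    return prefixes_from_paths(glob(os.path.join(data_dir, "*_*.dat")))
```

Calling unique_prefixes("some/dir") would then give a sorted list that can replace the hand-written names variable, assuming no prefix needs to be kept separate from another that differs only after the last underscore.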

Answer 2

You can do this:

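import glob
file_names = sorted(glob.glob("*_*.dat"))  # the original snippet left fileNames undefined
result = [[f for f in file_names if f.startswith(sn + "_")] for sn in set(f.split("_")[0] for f in file_names)]
print(result)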

Output (the order of the groups may vary, because the prefixes come from a set):

[['ABC_01.dat', 'ABC_02.dat', 'ABC_03-08.dat'], ['GHI_09.dat', 'GHI_12-14.dat'], ['DEF_13.dat', 'DEF_14.dat', 'DEF_16.dat']]

Each inner list holds the files that belong together: merge all files in result[0], then all files in result[1], and so on.
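A nested list does not keep the prefixes themselves, which the question says are needed later to name the output files. One possible alternative, sketched here under the assumption that the prefix is the text before the first '_' (as in this answer), is to build a dictionary from prefix to files. The function name group_by_prefix is hypothetical.

```python
import os
from collections import defaultdict


def group_by_prefix(paths):
    """Map each prefix (text before the first '_') to its sorted list of paths."""
    groups = defaultdict(list)
    for path in sorted(paths):
        prefix, sep, _ = os.path.basename(path).partition('_')
        if sep:  # skip names that contain no underscore
            groups[prefix].append(path)
    return dict(groups)


# File names taken from the question; a real script would use glob.glob("*_*.dat").
files = ["ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat", "DEF_13.dat",
         "DEF_14.dat", "DEF_16.dat", "GHI_09.dat", "GHI_12-14.dat"]
for prefix, members in group_by_prefix(files).items():
    print(prefix, members)
```

Each key can then be used both to select the files to merge and to build the output file name.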

2 answers

solution1
1 ACCEPTED 2014-11-06 10:19:05

solution2
1 2014-11-06 10:22:10
