I have a directory with a number of files in a format similar to this:
"ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat", "DEF_13.dat", "DEF_14.dat", "DEF_16.dat", "GHI_09.dat", "GHI_12-14.dat"
etc., you get the idea. Essentially, what I want to do is merge all files whose names start with a similar string. At the moment, I do this by manually setting a variable names = ["ABC", "DEF", "GHI"], iterating over it (for name in names) and getting the respective filenames using glob.glob(name + "*.dat"). The merging step is later done using pandas. I don't just need the names/prefixes for finding the files; they are used later in my script to set the output files' names.
Is there a way I can automatically generate the variable names if I know that the files are all in the format name_*.dat?
Consider this:
from glob import glob
names = {name.rpartition('_')[0] for name in glob('*_*.dat')}
This collects the unique prefixes, i.e. everything before the last '_' in each filename. If the files are not in the current working directory, include the path in the pattern passed to glob().
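For files that live in another directory, the same idea could be wrapped in a small helper. This is only a sketch: the function name find_prefixes and its directory parameter are made up for illustration, and it assumes every file of interest matches <prefix>_*.dat.

```python
import glob
import os


def find_prefixes(directory="."):
    """Return the sorted unique prefixes of files named <prefix>_*.dat.

    `find_prefixes` and `directory` are hypothetical names used for
    illustration; they do not come from the original question.
    """
    pattern = os.path.join(directory, "*_*.dat")
    prefixes = set()
    for path in glob.glob(pattern):
        # Strip the directory part so underscores in the path cannot interfere.
        filename = os.path.basename(path)
        # rpartition splits at the LAST underscore in the filename.
        prefixes.add(filename.rpartition("_")[0])
    return sorted(prefixes)
```

The returned list can then stand in for the hand-written names variable, both for the glob.glob(name + "*.dat") lookup and for naming the output files.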
You can do this:
import glob
fileNames = glob.glob("*_*.dat")
result = [[f for f in fileNames if f.startswith(sn + "_")] for sn in set(f.split("_")[0] for f in fileNames)]
print(result)
Output (the order of the groups, and of the files within each group, may vary, since neither sets nor glob results are ordered):
[['ABC_01.dat', 'ABC_02.dat', 'ABC_03-08.dat'], ['GHI_09.dat', 'GHI_12-14.dat'], ['DEF_13.dat', 'DEF_14.dat', 'DEF_16.dat']]
Now all files in result[0] are to be merged, and likewise for result[1], and so on.
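Since the question also needs each prefix later for naming the output file, a dictionary keyed by prefix may be more convenient than the list of lists above. The following is a minimal sketch in a single pass; group_by_prefix is a hypothetical helper name, and it assumes the prefix is everything before the first underscore.

```python
from collections import defaultdict


def group_by_prefix(filenames):
    """Map each prefix to the list of filenames that start with it.

    `group_by_prefix` is a hypothetical helper, not part of any library.
    """
    groups = defaultdict(list)
    for filename in filenames:
        # partition splits at the FIRST underscore, like split('_')[0].
        prefix, sep, _rest = filename.partition("_")
        if sep:  # skip names that contain no underscore at all
            groups[prefix].append(filename)
    return dict(groups)


files = ["ABC_01.dat", "ABC_02.dat", "ABC_03-08.dat",
         "DEF_13.dat", "DEF_14.dat", "DEF_16.dat",
         "GHI_09.dat", "GHI_12-14.dat"]

# Sorting the items gives a stable order for processing and output naming.
for prefix, members in sorted(group_by_prefix(files).items()):
    print(prefix, members)
```

Each (prefix, members) pair then carries everything one merge step needs: the files to combine and the name to use for the output.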