Python Pandas: Group files in a directory by similar filenames and concatenate dataframes in a specific order

My goal is to group .csv files in a directory by shared characteristics in the file name. My directory contains files with names:

  • After_Source1_Receiver1.csv
  • After_Source1_Receiver2.csv
  • Before_Source1_Receiver1.csv
  • Before_Source1_Receiver2.csv
  • During1_Source1_Receiver1.csv
  • During1_Source1_Receiver2.csv
  • During2_Source1_Receiver1.csv
  • During2_Source1_Receiver2.csv

I would like to sort these files into groups based on the numbers that follow the "Source" and "Receiver" parts of the file name (as shown below), so that I can later concatenate them.

Group 1

  • Before_Source1_Receiver1.csv
  • During1_Source1_Receiver1.csv
  • During2_Source1_Receiver1.csv
  • After_Source1_Receiver1.csv

Group 2

  • Before_Source1_Receiver2.csv
  • During1_Source1_Receiver2.csv
  • During2_Source1_Receiver2.csv
  • After_Source1_Receiver2.csv

Any ideas?

The title says you want to do this in pandas, so here is a pandas solution.

import pandas as pd

fnames = ['After_Source1_Receiver1.csv',
          'After_Source1_Receiver2.csv',
          'Before_Source1_Receiver1.csv',
          'Before_Source1_Receiver2.csv',
          'During1_Source1_Receiver1.csv',
          'During1_Source1_Receiver2.csv',
          'During2_Source1_Receiver1.csv',
          'During2_Source1_Receiver2.csv']

df = pd.DataFrame(fnames, columns=['names'])
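The list above is hard-coded for illustration. A minimal sketch of building the same one-column frame from a real directory might look like the following; `list_csv_names` and `data_dir` are hypothetical names that do not appear in the question.

```python
from pathlib import Path

import pandas as pd


def list_csv_names(data_dir):
    """Return a one-column DataFrame holding the .csv file names in data_dir.

    `data_dir` is an assumed parameter: the folder that contains the files.
    """
    # sorted() makes the row order deterministic; glob order is not guaranteed.
    names = sorted(p.name for p in Path(data_dir).glob('*.csv'))
    return pd.DataFrame(names, columns=['names'])
```

Calling `df = list_csv_names('path/to/your/folder')` would then stand in for the hard-coded `fnames` list.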

I don't know what you want to do with the end result, but this is how you can group the names.

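# Capture the digits after "Source" and "Receiver" as two groups.
pattern = r'Source(\d+)_Receiver(\d+)'
# str.extract returns the captures as columns 0 and 1; group on both.
for _, g in pd.concat([df, df['names'].str.extract(pattern)], axis=1).groupby([0, 1]):
    print(g.names)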

0      After_Source1_Receiver1.csv
2     Before_Source1_Receiver1.csv
4    During1_Source1_Receiver1.csv
6    During2_Source1_Receiver1.csv
Name: names, dtype: object
1      After_Source1_Receiver2.csv
3     Before_Source1_Receiver2.csv
5    During1_Source1_Receiver2.csv
7    During2_Source1_Receiver2.csv
Name: names, dtype: object
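Within each group the names come out in their original alphabetical order (After, Before, During1, During2), not the Before, During, After order shown in the question. One possible way to get that order is to extract the leading label as well and sort it as an ordered categorical before grouping. This is only a sketch: `PHASE_ORDER`, `group_in_order` and `concat_group` are hypothetical names, and the label order is assumed from the question's example.

```python
import pandas as pd

# Assumed order of the leading label, taken from the example in the question.
PHASE_ORDER = ['Before', 'During1', 'During2', 'After']
PATTERN = r'^(?P<phase>[^_]+)_Source(?P<source>\d+)_Receiver(?P<receiver>\d+)'


def group_in_order(fnames):
    """Map (source, receiver) to its file names, sorted by PHASE_ORDER."""
    df = pd.DataFrame(fnames, columns=['names'])
    # Named groups become the column names of the extracted frame.
    parts = df['names'].str.extract(PATTERN)
    # An ordered categorical sorts by PHASE_ORDER instead of alphabetically;
    # labels missing from PHASE_ORDER become NaN and sort last.
    parts['phase'] = pd.Categorical(parts['phase'],
                                    categories=PHASE_ORDER, ordered=True)
    df = pd.concat([df, parts], axis=1).sort_values('phase', kind='stable')
    # groupby keeps the sorted row order inside each group.
    return {key: g['names'].tolist()
            for key, g in df.groupby(['source', 'receiver'])}


def concat_group(paths):
    """Read each CSV in the given order and stack the rows into one frame."""
    return pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)
```

The keys of the returned dictionary are tuples of strings such as `('1', '2')`, because `str.extract` returns text. Passing one of the lists (joined with the directory path) to `concat_group` yields a single DataFrame whose rows follow the Before, During1, During2, After sequence.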
