
Python re.search anomaly

I have a routine that searches through a directory of files and extracts a customer number from the filename:

import os
import re

suffix= '.csv'

# For each file in input folder, extract customer number 
input_list = os.listdir(path_in)
for input_file in input_list:
        fileInput = os.path.join(path_in,input_file)
        customer_ID = re.search('custID_(.+?)'+suffix,fileInput).group(1)
        print(customer_ID)

With suffix='.csv' and a folder full of csv files:

avg_hrly_custID_8147611.csv, avg_hrly_custID_8147612.csv, avg_hrly_custID_8147613.csv ...

I get the expected output:

8147611, 8147612, 8147613...

But with suffix = '.png' and a folder of .png image files:

yearly_average_plot_custID_8147611.png, yearly_average_plot_custID_8147612.png, yearly_average_plot_custID_8147613.png ...

I get this error:

AttributeError: 'NoneType' object has no attribute 'group'

Why won't it work for image files?

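@BrenBarn spotted the cause of the problem. The regex failed because the directory contained a subdirectory whose name didn't match the pattern, so re.search returned None and calling .group(1) on it raised the AttributeError. I've solved it by wrapping the search in try...except: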

import os
import re

suffix = '.png'

# For each entry in the input folder, extract the customer number
input_list = os.listdir(path_in)
for input_file in input_list:
    fileInput = os.path.join(path_in, input_file)
    try:
        customer_ID = re.search('custID_(.+?)' + suffix, fileInput).group(1)
        print(customer_ID)
    except AttributeError:
        # re.search returned None: this entry (e.g. the subdirectory) doesn't match
        pass
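
As an alternative to try...except, here is a minimal sketch that tests the match for None explicitly and skips anything that is not a regular file. The function name extract_customer_ids is hypothetical, and path_in is assumed to be an existing directory. The sketch also passes the suffix through re.escape so that the '.' is matched literally, and it searches the bare file name rather than the joined path. Neither of those two changes is needed to fix the original error.

```python
import os
import re


def extract_customer_ids(path_in, suffix):
    """Return customer IDs from file names like '..._custID_<id><suffix>'.

    Hypothetical helper; path_in is assumed to be an existing directory.
    """
    # re.escape makes the '.' in the suffix literal; '$' anchors it at the end.
    pattern = re.compile(r'custID_(.+?)' + re.escape(suffix) + r'$')
    customer_ids = []
    for name in sorted(os.listdir(path_in)):
        # Skip subdirectories and anything else that is not a regular file.
        if not os.path.isfile(os.path.join(path_in, name)):
            continue
        match = pattern.search(name)
        if match is None:
            # The file name does not follow the custID_ pattern; ignore it.
            continue
        customer_ids.append(match.group(1))
    return customer_ids
```

With this approach a stray subdirectory or an unrelated file is skipped, and any other exception still surfaces instead of being silently swallowed.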
