Python re.search anomaly

Question

I have a routine that searches through a directory of files and extracts a customer number from the filename:

import os
import re

suffix = '.csv'

# For each file in input folder, extract customer number
input_list = os.listdir(path_in)
for input_file in input_list:
    fileInput = os.path.join(path_in, input_file)
    customer_ID = re.search('custID_(.+?)' + suffix, fileInput).group(1)
    print(customer_ID)

With suffix='.csv' and a folder full of csv files:

avg_hrly_custID_8147611.csv, avg_hrly_custID_8147612.csv, avg_hrly_custID_8147613.csv ...

I get the expected output:

8147611, 8147612, 8147613...

But with suffix = '.png' and a folder of .png image files:

yearly_average_plot_custID_8147611.png, yearly_average_plot_custID_8147612.png, yearly_average_plot_custID_8147613.png ...

I get this error:

AttributeError: 'NoneType' object has no attribute 'group'

Why won't it work for image files?

Answer 1

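@BrenBarn spotted the cause of the problem. The regex failed because the directory contained a subdirectory whose name didn't match the pattern. For that entry re.search returns None, so calling .group(1) on it raises the AttributeError. I've solved it by wrapping the search in try...except: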

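import os
import re

suffix = '.png'

# For each entry in the input folder, extract the customer number
input_list = os.listdir(path_in)
for input_file in input_list:
    fileInput = os.path.join(path_in, input_file)
    try:
        # re.escape makes the '.' in the suffix match a literal dot
        customer_ID = re.search('custID_(.+?)' + re.escape(suffix), fileInput).group(1)
        print(customer_ID)
    except AttributeError:
        # re.search returned None: this entry (e.g. a subdirectory) doesn't match
        pass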
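
For what it's worth, the same result can be had without try...except by testing the match object before calling .group(). The following is only a minimal sketch, not the original poster's code: the function name extract_customer_ids is made up for illustration, and it assumes you only want regular files directly inside path_in (no recursion into subdirectories).

```python
import os
import re


def extract_customer_ids(path_in, suffix):
    """Return customer IDs found in file names directly under path_in.

    Hypothetical helper for illustration; not part of the original post.
    """
    # re.escape makes the dot in the suffix literal; "$" anchors it at the end.
    pattern = re.compile(r'custID_(.+?)' + re.escape(suffix) + r'$')
    customer_ids = []
    for name in sorted(os.listdir(path_in)):
        # Skip subdirectories and anything else that is not a regular file.
        if not os.path.isfile(os.path.join(path_in, name)):
            continue
        match = pattern.search(name)
        if match is None:
            # No customer number in this name, so there is nothing to .group().
            continue
        customer_ids.append(match.group(1))
    return customer_ids
```

Calling extract_customer_ids(path_in, '.png') should then give the IDs as a list of strings and silently skip the stray subdirectory. Checking for None explicitly also means an unrelated AttributeError elsewhere in the loop would not be swallowed.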

Answer 1: 0, ACCEPTED 2016-08-09 00:56:12
