I need help with a requirement to extract the date from a CSV file name and load it into a column.
Input files: ABC_XYZ_EXPORT-20170101.csv, ABC_XYZ_EXPORT-20170102.csv
I am able to read both files in a loop, but the date is extracted just once and is static for all records from the two files. I am not sure, but this could well be because of an incorrect loop. Please help. Thanks in advance.
for input_file in allFiles:
    exc_date = input_file
    exc_date = re.sub('ABC_XYZ_EXPORT-+([0-9]+)[.]csv$', r'\1', exc_date)
    #print(exc_date)
#PD pandas dataframe
for d in exc_date:
    csv_input = pd.concat((pd.read_csv(f) for f in allFiles))
    csv_input['Load_date'] = exc_date
    csv_input.to_csv('outputpd.csv')
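IIUC, you need to read data from multiple files and give each file's rows a Load_date column holding the date from that file's name. The date is static in your code because pd.concat reads every file at once and a single exc_date value is then assigned to the whole combined frame. Read the files one at a time instead, and set the date on each frame before combining them.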
import re
import pandas as pd

allFiles = ['ABC_XYZ_EXPORT-20170101.csv', 'ABC_XYZ_EXPORT-20170102.csv']
frames = []
for input_file in allFiles:
    # Extract the date from this file's name
    exc_date = re.sub('ABC_XYZ_EXPORT-+([0-9]+)[.]csv$', r'\1', input_file)
    df = pd.read_csv(input_file)
    df['Load_date'] = exc_date  # add the date for this file alone
    # Collect the frame in a list. DataFrame.append returned a new frame
    # instead of modifying csv_input in place, and was removed in pandas 2.0.
    frames.append(df)
# Combine everything and create a single output file with contents from all files
csv_input = pd.concat(frames, ignore_index=True)
csv_input.to_csv('outputpd.csv')
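One thing to watch with re.sub: when the pattern does not match, it returns the input string unchanged, so an unexpected file name would end up in Load_date as-is, and a directory prefix (for example from glob) would be kept in front of the date. A minimal sketch of a stricter variant follows. The names DATE_PATTERN, extract_load_date and load_with_dates are made up for illustration, and the pattern assumes an 8-digit date right before the .csv extension, as in the two example file names.

```python
import re
from pathlib import Path

import pandas as pd

# Assumption: the date is 8 digits after a hyphen, right before ".csv".
DATE_PATTERN = re.compile(r'-(\d{8})\.csv$')


def extract_load_date(path):
    """Return the YYYYMMDD string from the file name, or raise if it is missing."""
    # Look only at the file name so a directory prefix cannot leak into the result.
    match = DATE_PATTERN.search(Path(path).name)
    if match is None:
        raise ValueError(f'no date found in file name: {path}')
    return match.group(1)


def load_with_dates(paths):
    """Read each CSV, tag its rows with the date from its own name, and combine."""
    frames = [pd.read_csv(p).assign(Load_date=extract_load_date(p)) for p in paths]
    return pd.concat(frames, ignore_index=True)
```

With this in place, the loop above could probably be reduced to load_with_dates(allFiles).to_csv('outputpd.csv'). If a real date type is needed rather than a string, pd.to_datetime(df['Load_date'], format='%Y%m%d') should convert the column.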