I need to extract the event date from the filename into a new column called event_date. I assume I can use regex, but I have not been able to work out the exact pattern.
The filename is shown below:
file_name = X-Y Cable Installment Monitoring (10-7-20).xlsx
The (10-7-20) is in mm-dd-yy format.
I expect the result to be df['event_date'] = 2020-10-07
How should I write my script to get the correct date from the filename?
Thanks in advance.
Use str.rsplit() with the datetime module:
from datetime import datetime
file_name = 'X-Y Cable Installment Monitoring (10-7-20).xlsx'
date = file_name.rsplit('(')[1].rsplit(')')[0] # '10-7-20'
date = datetime.strptime(date, "%m-%d-%y").strftime('%Y-%m-%d') # '2020-10-07'
Or via regex:
import re
regex = re.compile(r"(\d{1,2}-\d{1,2}-\d{2})") # pattern to capture date
matchArray = regex.findall(file_name)
date = matchArray[0]
date = datetime.strptime(date, "%m-%d-%y").strftime('%Y-%m-%d')
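Both snippets above assume the filename always contains a date. As a rough sketch only, the regex approach could be wrapped in a small helper that also handles filenames without one. The names extract_event_date and DATE_IN_PARENS are made up for illustration, and the pattern assumes the date is always enclosed in parentheses, as in the example filename.

```python
import re
from datetime import datetime

# Assumption: the date always appears inside parentheses, e.g. "(10-7-20)"
DATE_IN_PARENS = re.compile(r"\((\d{1,2}-\d{1,2}-\d{2})\)")

def extract_event_date(file_name):
    """Return the date as 'YYYY-MM-DD', or None if no parenthesised date is found."""
    match = DATE_IN_PARENS.search(file_name)
    if match is None:
        return None
    # group(1) is the text between the parentheses, e.g. '10-7-20'
    return datetime.strptime(match.group(1), "%m-%d-%y").strftime("%Y-%m-%d")

file_name = "X-Y Cable Installment Monitoring (10-7-20).xlsx"
event_date = extract_event_date(file_name)
print(event_date)  # 2020-10-07
```

If every row of df comes from the same file, df['event_date'] = event_date should fill the whole column with that value.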