Extract ID and Date from file name in Python
I have this file name as the source of data for my DataFrame:
file_name = 2900-ABC Project-20210525-Data 1
and I want to get the first 4 digits as a new column called ID, and the date in the file name as a new column called event_date.
The expected result would be:
id event_date
2900 2021-05-25
How can I get this in Python?
Thanks in advance.
Barring regular expressions, this can be done with str.split():
import datetime as dt
import pandas as pd
file_name = '2900-ABC Project-20210525-Data 1'
file_split = file_name.split('-')
id_value = int(file_split[0])
date = dt.datetime.strptime(file_split[2], '%Y%m%d').date()
df = pd.DataFrame(data={'id': [id_value], 'event_date': [date]})
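If there are several file names, the split-based approach above could be wrapped in a small helper. This is only a sketch: parse_file_name is a hypothetical name, and it assumes every name follows the same ID-project-date-suffix layout with no hyphen inside the project part.

```python
import datetime as dt


def parse_file_name(file_name):
    """Return (id, event_date) from a name like '2900-ABC Project-20210525-Data 1'."""
    parts = file_name.split('-')
    # Assumption: at least three hyphen-separated parts, date in the third one.
    if len(parts) < 3:
        raise ValueError(f"unexpected file name format: {file_name!r}")
    id_value = int(parts[0])
    event_date = dt.datetime.strptime(parts[2], '%Y%m%d').date()
    return id_value, event_date


file_names = ['2900-ABC Project-20210525-Data 1']
rows = [parse_file_name(name) for name in file_names]
```

The resulting list of tuples can be passed to pd.DataFrame(rows, columns=['id', 'event_date']). A project name that itself contains a hyphen would shift the date out of position 2, in which case a regular expression is probably the safer choice.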
Using str.extract and str.replace:
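# expand=False returns a Series instead of a one-column DataFrame
df["id"] = df["file_name"].str.extract(r'^(\d+)', expand=False)
# regex=True is required in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df["event_date"] = df["file_name"].str.replace(r'^.*-(\d{4})(\d{2})(\d{2})-.*$', r'\1-\2-\3', regex=True)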
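The lines above leave both columns as strings. If typed values are wanted, one possible variation is a single str.extract call with named groups followed by a conversion step. This is a sketch, assuming a DataFrame with a file_name column (built here only for illustration) and that every name matches the pattern:

```python
import pandas as pd

# Illustrative input; the file_name column is an assumption about the real data.
df = pd.DataFrame({"file_name": ["2900-ABC Project-20210525-Data 1"]})

# Named groups become column names in the DataFrame returned by str.extract.
parts = df["file_name"].str.extract(r'^(?P<id>\d+)-.*-(?P<event_date>\d{8})-')

# astype(int) raises if any row failed to match (NaN), so unmatched names surface early.
df["id"] = parts["id"].astype(int)
df["event_date"] = pd.to_datetime(parts["event_date"], format="%Y%m%d").dt.date
```

Dropping the trailing .dt.date keeps event_date as a datetime64 column, which is usually more convenient for filtering and resampling.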