How to convert year, month, day, hour/minute columns into a single datetime column?
I have the following data format with separate columns for year, month, day, and hour_minute (the first two digits are the hour and the last two digits are the minutes). How do I create a new column in datetime format by combining all of these existing columns?
YEAR | MONTH | DAY | HOUR_MINUTE |
---|---|---|---|
2015 | 1 | 15 | 0010 |
2015 | 1 | 2 | 0020 |
2015 | 1 | 15 | 0045 |
2015 | 1 | 15 | 2110 |
2015 | 10 | 21 | 2359 |
I have tried the following, but with no luck. Thank you for your advice.
df["new_column"]= pd.to_datetime(df[["YEAR", "MONTH", "DAY","HOUR_MINUTE"]])
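You need to split the HOUR_MINUTE column into separate HOUR and MINUTE columns, because pd.to_datetime assembles a datetime from a DataFrame only when each component has its own column:
df["HOUR"] = df["HOUR_MINUTE"].str[0:2]
df["MINUTE"] = df.pop("HOUR_MINUTE").str[2:4]
# No format= is needed here: to_datetime builds the datetime from the named component columns
df["new_column"] = pd.to_datetime(df[["YEAR", "MONTH", "DAY", "HOUR", "MINUTE"]])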
print(df)
Output:
YEAR MONTH DAY HOUR MINUTE new_column
0 2015 1 15 00 10 2015-01-15 00:10:00
1 2015 1 2 00 20 2015-01-02 00:20:00
2 2015 1 15 00 45 2015-01-15 00:45:00
3 2015 1 15 21 10 2015-01-15 21:10:00
4 2015 10 21 23 59 2015-10-21 23:59:00
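The slicing above relies on HOUR_MINUTE being a zero-padded string. If the column was read as integers (for example from a CSV, where 0010 becomes 10), a minimal sketch under that assumption is to pad the values back to four digits first. The sample frame and the helper names hm and parts are illustrative only.

```python
import pandas as pd

# Assumed situation: leading zeros were lost on import, so 0010 arrived as 10.
df = pd.DataFrame({
    "YEAR": [2015, 2015, 2015],
    "MONTH": [1, 1, 10],
    "DAY": [15, 2, 21],
    "HOUR_MINUTE": [10, 45, 2359],
})

# Restore the four-digit HHMM form, then slice hour and minute out of it.
hm = df["HOUR_MINUTE"].astype(str).str.zfill(4)
parts = df[["YEAR", "MONTH", "DAY"]].assign(
    HOUR=hm.str[:2].astype(int),
    MINUTE=hm.str[2:].astype(int),
)

# to_datetime assembles the datetime from the named component columns.
df["new_column"] = pd.to_datetime(parts)
print(df)
```

Building the components in a separate frame leaves the original HOUR_MINUTE column untouched.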
If df contains only the year, month, day, and hour_minute columns, you can apply over the entire df. Join the fields with a separator so that unpadded months and days (1 rather than 01) parse unambiguously:
df.apply(lambda row: pd.to_datetime('-'.join(row.values.astype(str)), format="%Y-%m-%d-%H%M"), axis=1)
Out[198]:
0 2015-01-15 00:10:00
1 2015-01-02 00:20:00
2 2015-01-15 00:45:00
3 2015-01-15 21:10:00
4 2015-10-21 23:59:00
dtype: datetime64[ns]
If there are other columns as well, select just the required columns and then apply:
df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime('-'.join(row.values.astype(str)), format="%Y-%m-%d-%H%M"), axis=1)
Out[201]:
0 2015-01-15 00:10:00
1 2015-01-02 00:20:00
2 2015-01-15 00:45:00
3 2015-01-15 21:10:00
4 2015-10-21 23:59:00
dtype: datetime64[ns]
If you want the result assigned to df as new_column:
df['new_column'] = df[['YEAR', 'MONTH', 'DAY', 'HOUR_MINUTE']].apply(lambda row: pd.to_datetime('-'.join(row.values.astype(str)), format="%Y-%m-%d-%H%M"), axis=1)
df
Out[205]:
YEAR MONTH DAY HOUR_MINUTE new_column
0 2015 1 15 0010 2015-01-15 00:10:00
1 2015 1 2 0020 2015-01-02 00:20:00
2 2015 1 15 0045 2015-01-15 00:45:00
3 2015 1 15 2110 2015-01-15 21:10:00
4 2015 10 21 2359 2015-10-21 23:59:00
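Row-wise apply parses one row at a time, and concatenating fields is only safe when every field has a fixed width. As a hedged alternative, the sketch below pads each field with str.zfill and parses the whole column in one call. The sample frame and the name stamp are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "YEAR": ["2015", "2015", "2015"],
    "MONTH": ["1", "1", "10"],
    "DAY": ["15", "2", "21"],
    "HOUR_MINUTE": ["0010", "0020", "2359"],
})

# Pad every field to a fixed width so the concatenation is unambiguous:
# 4-digit year + 2-digit month + 2-digit day + 4-digit HHMM.
stamp = (
    df["YEAR"].astype(str)
    + df["MONTH"].astype(str).str.zfill(2)
    + df["DAY"].astype(str).str.zfill(2)
    + df["HOUR_MINUTE"].astype(str).str.zfill(4)
)

# One vectorized parse over the whole column.
df["new_column"] = pd.to_datetime(stamp, format="%Y%m%d%H%M")
print(df)
```

Without the padding, a string such as 20151150010 could be read as either January 15 or November 5, which is why the fixed widths matter.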
Suggested script
import pandas as pd
df1 = pd.DataFrame({'YEAR': ['2015', '2015', '2015', '2015', '2015'],
'MONTH': ['1', '1', '1', '1', '10'],
'DAY': ['15', '2', '15', '15', '21'],
'HOUR_MINUTE': ['0010', '0020', '0045', '2110', '2359']
})
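# Build the template '{0[YEAR]}-{0[MONTH]}-{0[DAY]}-{0[HOUR_MINUTE]}' and format each row with it
df1['FMT'] = df1.agg('-'.join(['{0[%s]}'%c for c in df1.columns]).format, axis=1)
# Parse the dash-separated string, e.g. '2015-1-15-0010'
df1['FMT'] = pd.to_datetime(df1['FMT'], format='%Y-%m-%d-%H%M')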
print(df1)
Output:
YEAR MONTH DAY HOUR_MINUTE FMT
0 2015 1 15 0010 2015-01-15 00:10:00
1 2015 1 2 0020 2015-01-02 00:20:00
2 2015 1 15 0045 2015-01-15 00:45:00
3 2015 1 15 2110 2015-01-15 21:10:00
4 2015 10 21 2359 2015-10-21 23:59:00
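String formatting can also be avoided altogether. The sketch below, which is an alternative and not part of the script above, builds the date from the YEAR, MONTH, and DAY columns and then adds the hour and minute as time offsets. The sample frame and the name hm are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "YEAR": ["2015", "2015", "2015"],
    "MONTH": ["1", "1", "10"],
    "DAY": ["15", "2", "21"],
    "HOUR_MINUTE": ["0010", "2110", "2359"],
})

# HHMM as a number: integer division by 100 gives the hour, the remainder the minute.
hm = pd.to_numeric(df["HOUR_MINUTE"])

# Assemble the date part from the component columns, then add the time offsets.
date = pd.to_datetime(df[["YEAR", "MONTH", "DAY"]])
df["new_column"] = (
    date
    + pd.to_timedelta(hm // 100, unit="h")
    + pd.to_timedelta(hm % 100, unit="min")
)
print(df)
```

Because the hour and minute come from arithmetic, this version works the same whether HOUR_MINUTE is stored as '0010' or as the integer 10.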