Pandas joining dataframes with different index levels/datetime?
Hi, I have two DataFrames which look like this:
-------------------------------------------------
| | dineType | menuName | unique | columns |
-------------------------------------------------
| date | | | | |
-------------------------------------------------
|%y%m%d| | | | |
-------------------------------------------------
...
-------------------------------------------------
| | dineDate | dineType | menuName | |
-------------------------------------------------
| 0 | %Y%m%d | | | |
-------------------------------------------------
| 1 | | | | |
-------------------------------------------------
...
I want to join the two DataFrames into one output. As you can see, the main problem is that the two tables are indexed differently: the first is indexed by date (formatted %y%m%d), while the second has a plain integer index and keeps its date in the dineDate column (formatted %Y%m%d). I want the output to follow the second table's format. The two tables also start from different dates. How would I join these two DataFrames?
If you look at the documentation, it says you can use the left_on, right_on and left_index, right_index parameters to join on columns and indexes of the DataFrames.
pd.merge(df1, df2, left_index=True, right_on='dineDate')
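As a rough illustration of that call, here is a minimal sketch. The sample values are invented, only the column names come from the question, and both date keys are assumed to have already been converted to datetimes so that they can match. The how="right" argument is an assumption added here (the call above uses the default inner join); it keeps every row of the second table, which seems to be what "follow the second table's format" asks for.

```python
import pandas as pd

# Hypothetical sample data: column names follow the question, values are invented.
df1 = pd.DataFrame(
    {"unique": [10, 20, 30]},
    index=pd.to_datetime(["200101", "200102", "200103"], format="%y%m%d"),
)
df1.index.name = "date"

df2 = pd.DataFrame(
    {
        "dineDate": pd.to_datetime(
            ["20200102", "20200103", "20200104"], format="%Y%m%d"
        ),
        "dineType": ["lunch", "dinner", "lunch"],
        "menuName": ["soup", "rice", "salad"],
    }
)

# Match df1's date index against df2's dineDate column.
# how="right" keeps all rows of df2; dates missing from df1 get NaN in "unique".
merged = pd.merge(df1, df2, left_index=True, right_on="dineDate", how="right")
print(merged)
```

If both frames carry columns with the same name (dineType and menuName in the question), pandas appends the suffixes _x and _y by default; the suffixes argument of pd.merge changes them.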
Instead of storing the dates as strings in a specific format, you can use a real datetime type (pandas' datetime64, for example via pd.to_datetime; the old pd.datetime alias is not a dtype and has been removed from recent pandas versions). After converting the date index and the dineDate column to datetimes, the join works without any extra handling, even though one side was written as %y%m%d and the other as %Y%m%d. Assuming the data comes from a CSV file, you can do the conversion with the parse_dates option of pd.read_csv, or pass an explicit format to pd.to_datetime if the format is not inferred correctly. To format the output, you can set the option date_format='%Y%m%d' of pd.DataFrame.to_csv.
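The conversion described above could look roughly like the following sketch. The CSV contents are invented stand-ins for the real files. Instead of relying on parse_dates to infer the two different formats, this version reads the keys as strings and converts each with an explicit format through pd.to_datetime; that substitution is a choice made here, not something the answer prescribes.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for the two files; values are invented.
csv1 = io.StringIO("date,unique\n200101,10\n200102,20\n")
csv2 = io.StringIO(
    "dineDate,dineType,menuName\n20200102,lunch,soup\n20200103,dinner,rice\n"
)

# Read the date fields as strings so no digits are lost or misinterpreted.
df1 = pd.read_csv(csv1, dtype={"date": str})
df2 = pd.read_csv(csv2, dtype={"dineDate": str})

# Convert each key with its own explicit format, then index df1 by date.
df1["date"] = pd.to_datetime(df1["date"], format="%y%m%d")
df2["dineDate"] = pd.to_datetime(df2["dineDate"], format="%Y%m%d")
df1 = df1.set_index("date")

# Join on real datetimes; how="right" (an assumption) keeps every row of df2.
merged = pd.merge(df1, df2, left_index=True, right_on="dineDate", how="right")

# Write the dates back out in the second table's YYYYMMDD form.
out = merged.to_csv(index=False, date_format="%Y%m%d")
print(out)
```

With path omitted, to_csv returns the CSV text as a string, which makes it easy to check the date formatting before writing to disk.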
Please provide sample code if you need more details.