Merging Pandas Dataframe to Multiindex with Date
I have some dataframes with a date index from multiple sources which I want to merge into a single multiindex dataframe. I'm struggling to figure out how to do this.
Starting with two dataframes:
Source 1
+---------------------+------+------+-----+-------+
| date | open | high | low | close |
+---------------------+------+------+-----+-------+
| 2018-04-04 20:00:00 | xxx | xxx | xxx | xxx |
| 2018-04-04 21:00:00 | xxx | xxx | xxx | xxx |
| 2018-04-04 22:00:00 | xxx | xxx | xxx | xxx |
+---------------------+------+------+-----+-------+
Source 2
+---------------------+------+------+-----+-------+
| date | open | high | low | close |
+---------------------+------+------+-----+-------+
| 2018-04-04 20:00:00 | xxx | xxx | xxx | xxx |
| 2018-04-04 21:00:00 | xxx | xxx | xxx | xxx |
| 2018-04-04 22:00:00 | xxx | xxx | xxx | xxx |
+---------------------+------+------+-----+-------+
I'd like to merge them so that they are multiindexed on the date and on the source (source1 or source2).
Something like:
+---------------------+---------+------+-----+-------+
| | | | | |
+---------------------+---------+------+-----+-------+
| 2018-04-04 20:00:00 | source1 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
| | source2 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
| 2018-04-04 21:00:00 | source1 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
| | source2 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
| 2018-04-04 22:00:00 | source1 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
| | source2 | | | |
| | open | high | low | close |
| | xxx | xxx | xxx | xxx |
+---------------------+---------+------+-----+-------+
Can anyone help?
Thanks!
You can use concat and specify the keys, i.e.:
df3 = pd.concat([df1,df2],keys=['source1','source2']).reset_index(level=0)
df3 = df3.set_index(['date','level_0']).sort_index(level='date')
                             open  high  low  close
date                level_0
2018-04-04 20:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
2018-04-04 21:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
2018-04-04 22:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
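For anyone who wants to try this end to end, here is a minimal sketch of the same approach. The numeric values are made up (the question only shows "xxx"), and renaming the auto-generated level_0 column to "source" is my own addition, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data: the question elides the real values as "xxx".
dates = pd.to_datetime(
    ["2018-04-04 20:00:00", "2018-04-04 21:00:00", "2018-04-04 22:00:00"]
)
df1 = pd.DataFrame(
    {"date": dates, "open": [1.0, 2.0, 3.0], "high": [1.5, 2.5, 3.5],
     "low": [0.5, 1.5, 2.5], "close": [1.2, 2.2, 3.2]}
)
df2 = pd.DataFrame(
    {"date": dates, "open": [10.0, 20.0, 30.0], "high": [10.5, 20.5, 30.5],
     "low": [9.5, 19.5, 29.5], "close": [10.2, 20.2, 30.2]}
)

# keys= adds an outer index level holding the source label. reset_index(level=0)
# turns that unnamed level into a column, which pandas calls "level_0".
df3 = pd.concat([df1, df2], keys=["source1", "source2"]).reset_index(level=0)

# Give the auto-generated column a readable name ("source" is an assumed name).
df3 = df3.rename(columns={"level_0": "source"})

# Build the (date, source) MultiIndex and sort so each date groups its sources.
df3 = df3.set_index(["date", "source"]).sort_index(level="date")
print(df3)
```

If you skip the rename step, the second index level keeps the name level_0, as in the output shown above.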
Use concat with keys and set_index for a DatetimeIndex, then swaplevel with sort_index:
df = (pd.concat([df1.set_index('date'), df2.set_index('date')], keys=['source1', 'source2'])
        .swaplevel(0, 1)
        .sort_index())
print(df)
                             open  high  low  close
date
2018-04-04 20:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
2018-04-04 21:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
2018-04-04 22:00:00 source1   xxx   xxx  xxx    xxx
                    source2   xxx   xxx  xxx    xxx
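A self-contained sketch of this variant, in case it helps. The sample numbers are invented, and passing names= to concat (so the source level gets a label instead of staying unnamed) is my own assumption about what is convenient, not something the answer requires. The two lookups at the end show one way the resulting index can be queried.

```python
import pandas as pd

# Hypothetical sample data: the question elides the real values as "xxx".
dates = pd.to_datetime(
    ["2018-04-04 20:00:00", "2018-04-04 21:00:00", "2018-04-04 22:00:00"]
)
df1 = pd.DataFrame(
    {"date": dates, "open": [1.0, 2.0, 3.0], "high": [1.5, 2.5, 3.5],
     "low": [0.5, 1.5, 2.5], "close": [1.2, 2.2, 3.2]}
)
df2 = pd.DataFrame(
    {"date": dates, "open": [10.0, 20.0, 30.0], "high": [10.5, 20.5, 30.5],
     "low": [9.5, 19.5, 29.5], "close": [10.2, 20.2, 30.2]}
)

df = (
    pd.concat(
        [df1.set_index("date"), df2.set_index("date")],
        keys=["source1", "source2"],
        names=["source", "date"],  # assumed level names, outer level first
    )
    .swaplevel(0, 1)  # put date first, source second
    .sort_index()     # group the sources under each date
)

# All rows from one source, indexed by date only.
source1_only = df.xs("source1", level="source")

# Both sources at a single timestamp, indexed by source only.
at_21 = df.loc[pd.Timestamp("2018-04-04 21:00:00")]

print(df)
print(source1_only)
print(at_21)
```

Naming the levels up front means later selections can refer to level="source" rather than a positional level number.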