Merging two pandas dataframes with a dictionary

I have a dict that currently looks like this:

import pandas as pd
raw_data = {'Series_Date':['2017-03-10','2017-03-11','2017-03-12','2017-03-13','2017-03-14','2017-03-15'],'Value':[1,1,1,1,1,1]}
df1 = pd.DataFrame(raw_data, columns=['Series_Date','Value'])
raw_data_ = {'Series_Date':['2017-03-16','2017-03-17','2017-03-18','2017-03-19','2017-03-20','2017-03-21'],'Value':[1,1,1,1,1,1]}
df2 = pd.DataFrame(raw_data_, columns=['Series_Date','Value'])
# NOTE: the name `dict` shadows the built-in type; a name such as `frames` would be safer
dict = {'Check': df1, 'Check2': df2}
print(dict)

I want to merge the two dataframes in my dict so that the result is keyed by the first dataframe's key and its value is the merged dataframe. My resulting dict should then look like:

import pandas as pd
raw_data = {'Series_Date':['2017-03-10','2017-03-11','2017-03-12','2017-03-13','2017-03-14','2017-03-15','2017-03-16','2017-03-17','2017-03-18','2017-03-19','2017-03-20','2017-03-21'],'Value':[1,1,1,1,1,1,1,1,1,1,1,1]}
df = pd.DataFrame(raw_data, columns=['Series_Date','Value'])
# A single entry: the first key mapped to the combined dataframe
dict = {'Check': df}
print(dict)

Is there any simple way of doing this?

You can concatenate the two frames and overwrite your dictionary.

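# Stack all frames, sort by date, and renumber the index from 0
df_merged = pd.concat(list(dict.values())).sort_values(by='Series_Date').reset_index(drop=True)
# On Python 3 the keys view is not subscriptable, so convert it to a list first
dict = {list(dict.keys())[0]: df_merged}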

However, in Python versions before 3.7 a plain dictionary does not guarantee any key order, so the 'first' information is lost: taking the first key will not necessarily give you the first data frame's key. You can use an OrderedDict to cope with this issue. From Python 3.7 on, a plain dict preserves insertion order as well.

Then it would go like this:

import pandas as pd
from collections import OrderedDict
raw_data = {'Series_Date':['2017-03-10','2017-03-11','2017-03-12','2017-03-13','2017-03-14','2017-03-15'],'Value':[1,1,1,1,1,1]}
df1 = pd.DataFrame(raw_data, columns=['Series_Date','Value'])
raw_data_ = {'Series_Date':['2017-03-16','2017-03-17','2017-03-18','2017-03-19','2017-03-20','2017-03-21'],'Value':[1,1,1,1,1,1]}
df2 = pd.DataFrame(raw_data_, columns=['Series_Date','Value'])
# OrderedDict remembers insertion order, so 'Check' is reliably the first key
dict = OrderedDict([('Check', df1), ('Check2', df2)])

# Stack all frames, sort by date, and renumber the index from 0
df_merged = pd.concat(list(dict.values())).sort_values(by='Series_Date').reset_index(drop=True)
# On Python 3 the keys view is not subscriptable, so convert it to a list first
dict = {list(dict.keys())[0]: df_merged}
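
The same idea can be wrapped in a small helper. This is only a minimal sketch, assuming Python 3.7 or later (where a plain dict preserves insertion order) or an OrderedDict as input. The function name `merge_under_first_key`, its `sort_by` parameter, and the variable name `frames` are hypothetical and introduced here for illustration; the sample data is shortened to two rows per frame.

```python
import pandas as pd


def merge_under_first_key(frames, sort_by="Series_Date"):
    """Concatenate every DataFrame in `frames` and key the result by the first key.

    Assumes `frames` is non-empty and preserves insertion order
    (plain dict on Python 3.7+, or collections.OrderedDict).
    """
    first_key = next(iter(frames))  # first inserted key
    merged = (
        pd.concat(list(frames.values()))  # stack the frames row-wise
        .sort_values(by=sort_by)          # order rows by the date column
        .reset_index(drop=True)           # renumber the index from 0
    )
    return {first_key: merged}


df1 = pd.DataFrame({"Series_Date": ["2017-03-10", "2017-03-11"], "Value": [1, 1]})
df2 = pd.DataFrame({"Series_Date": ["2017-03-16", "2017-03-17"], "Value": [1, 1]})
frames = {"Check": df1, "Check2": df2}

result = merge_under_first_key(frames)
print(list(result))          # ['Check']
print(len(result["Check"]))  # 4
```

Using `next(iter(frames))` avoids building a list of all keys just to read the first one, and naming the variable `frames` rather than `dict` leaves the built-in type usable.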
