How Do I Merge These Two Datasets?
I have two datasets. I want to merge them using the index.
First dataset:
index A B C
01/01/2010 15 20 30
15/01/2010 12 15 25
17/02/2010 14 13 35
19/02/2010 11 10 22
Second dataset:
index year month price
0 2010 january 70
1 2010 february 80
I want them joined like this:
index A B C price
01/01/2010 15 20 30 70
15/01/2010 12 15 25 70
17/02/2010 14 13 35 80
19/02/2010 11 10 22 80
The problem is how to use two columns (year and month of the second dataset) to create a temporary datetime index.
Try this: extract the month name (.dt.month_name()) and the year (.dt.year) from the dates in df1, then merge with df2 on those two values.
>>> import pandas as pd
>>> df1
index A B C
0 01/01/2010 15 20 30
1 15/01/2010 12 15 25
2 17/02/2010 14 13 35
3 19/02/2010 11 10 22
>>> df2
index year month price
0 0 2010 january 70
1 1 2010 february 80
# Merge df1 and df2 on year and month.
# dayfirst=True because the dates are written as DD/MM/YYYY.
>>> dates = pd.to_datetime(df1['index'], dayfirst=True)
>>> df1.merge(df2,
...           left_on=[dates.dt.year, dates.dt.month_name().str.lower()],
...           right_on=['year', 'month'])
Output:
index_x A B C index_y year month price
0 01/01/2010 15 20 30 0 2010 january 70
1 15/01/2010 12 15 25 0 2010 january 70
2 17/02/2010 14 13 35 1 2010 february 80
3 19/02/2010 11 10 22 1 2010 february 80
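The merge above keeps the helper columns (index_y, year, month). If only the price column is wanted, as in the desired output from the question, one possible variation is to build a temporary monthly key on both sides and merge on that. This is a minimal sketch, not the answerer's code: the names df1_key, df2_key and period are made up for illustration, and it assumes the month names are full English names and the dates are DD/MM/YYYY.

```python
import pandas as pd

df1 = pd.DataFrame({
    "index": ["01/01/2010", "15/01/2010", "17/02/2010", "19/02/2010"],
    "A": [15, 12, 14, 11],
    "B": [20, 15, 13, 10],
    "C": [30, 25, 35, 22],
})
df2 = pd.DataFrame({
    "year": [2010, 2010],
    "month": ["january", "february"],
    "price": [70, 80],
})

# Temporary monthly key for df2, built from its year and month columns.
# Assumption: month holds full English month names (matched by %B).
df2_key = pd.to_datetime(
    df2["year"].astype(str) + " " + df2["month"].str.title(),
    format="%Y %B",
).dt.to_period("M")

# The same kind of key for df1. Assumption: dates are DD/MM/YYYY.
df1_key = pd.to_datetime(df1["index"], format="%d/%m/%Y").dt.to_period("M")

# Left-merge on the temporary key, then drop it again.
left = df1.assign(period=df1_key)
right = df2.assign(period=df2_key)[["period", "price"]]
result = left.merge(right, on="period", how="left").drop(columns="period")

print(result)
```

With how="left", every row of df1 is kept in its original order, and a month that is missing from df2 would show up as NaN in price rather than being dropped.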
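This is a dumb answer (I'm sure you can do something smarter than this ;) but it works. It assumes your tables are lists of dictionaries (you can easily convert SQL tables to this format). I know it is not a clean solution, but you asked for a simple one, and this is probably the easiest to understand :)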
months = {'january': "01",
          'february': "02",
          'march': "03",
          'april': "04",
          'may': "05",
          'june': "06",
          'july': "07",
          'august': "08",
          'september': "09",
          'october': "10",
          'november': "11",
          'december': "12"}
table1 = [{'index': '01/01/2010', 'A': 15, 'B': 20, 'C': 30},
          {'index': '15/01/2010', 'A': 12, 'B': 15, 'C': 25},
          {'index': '17/02/2010', 'A': 14, 'B': 13, 'C': 35},
          {'index': '19/02/2010', 'A': 11, 'B': 10, 'C': 22}]
table2 = [{'index': 0, 'year': 2010, 'month': 'january', 'price': 70},
          {'index': 1, 'year': 2010, 'month': 'february', 'price': 80}]
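def joiner(table1, table2):
    # Tag every row of both tables with a temporary 'MM/YYYY' key.
    # Note: this adds a 'tempDate' field to the input rows in place.
    for row in table2:
        row['tempDate'] = "{0}/{1}".format(months[row['month']], str(row['year']))
    for row in table1:
        row['tempDate'] = row['index'][3:]  # 'DD/MM/YYYY' -> 'MM/YYYY'
    table3 = []
    for row1 in table1:
        row3 = row1.copy()
        # Copy the price from the first table2 row with the same key.
        for row2 in table2:
            if row2['tempDate'] == row1['tempDate']:
                row3['price'] = row2['price']
                break
        table3.append(row3)
    return table3

table3 = joiner(table1, table2)
print(table3)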
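For larger tables, the nested loop can be replaced by a dictionary lookup, which also avoids writing a tempDate field into the input rows. The following is only a sketch of that idea under the same assumptions (tables are lists of dictionaries, dates are DD/MM/YYYY strings, month names are lowercase English); MONTH_NUMBERS and join_prices are hypothetical names, and calendar.month_name is assumed to return English names, which depends on the active locale.

```python
import calendar

# Lowercase month name -> two-digit month number, e.g. 'january' -> '01'.
# Assumption: the active locale yields English month names.
MONTH_NUMBERS = {
    name.lower(): f"{number:02d}"
    for number, name in enumerate(calendar.month_name)
    if name  # entry 0 is an empty string
}

table1 = [{'index': '01/01/2010', 'A': 15, 'B': 20, 'C': 30},
          {'index': '15/01/2010', 'A': 12, 'B': 15, 'C': 25},
          {'index': '17/02/2010', 'A': 14, 'B': 13, 'C': 35},
          {'index': '19/02/2010', 'A': 11, 'B': 10, 'C': 22}]
table2 = [{'index': 0, 'year': 2010, 'month': 'january', 'price': 70},
          {'index': 1, 'year': 2010, 'month': 'february', 'price': 80}]


def join_prices(table1, table2):
    # Build the 'MM/YYYY' -> price lookup once.
    price_by_month = {
        f"{MONTH_NUMBERS[row['month']]}/{row['year']}": row['price']
        for row in table2
    }
    joined = []
    for row in table1:
        new_row = dict(row)       # copy, so the input rows stay untouched
        key = row['index'][3:]    # 'DD/MM/YYYY' -> 'MM/YYYY'
        if key in price_by_month:
            new_row['price'] = price_by_month[key]
        joined.append(new_row)
    return joined


joined = join_prices(table1, table2)
print(joined)
```

Rows whose month has no entry in table2 are kept without a price key, which mirrors what the loop-based version does.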