
Append Columns to Dataframe 1 Based on Matching Column Values in Dataframe 2

I have two dataframes, df1 and df2. The first of these dataframes tracks the locations (i.e., ZIP codes) of specific individuals at different time points:

ID  ZIP 1  ZIP 2  ZIP 3
1   55333  N/A    55316
2   55114  27265  27265
3   55744  55744  N/A

The second dataframe contains several columns of data pertaining to every ZIP code in the country (many of which do not appear in df1):

ZIP    State  Tier
01001  MA     1
...    ...    ...
27265  NC     2
55114  MN     4
55316  MN     7
55333  MN     5
55744  MN     3
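
For anyone who wants to reproduce the problem, the two frames above can be built roughly as follows. This is only a sketch under a few assumptions that the question does not state: ZIP codes are stored as strings (so the leading zero in 01001 survives), N/A is represented as NaN, and the rows hidden behind "..." in df2 are left out.

```python
import numpy as np
import pandas as pd

# Assumption: ZIP codes are strings, and "N/A" in the tables is a missing value.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "ZIP 1": ["55333", "55114", "55744"],
    "ZIP 2": [np.nan, "27265", "55744"],
    "ZIP 3": ["55316", "27265", np.nan],
})

# Only the rows shown in the question; the elided "..." rows are not reconstructed.
df2 = pd.DataFrame({
    "ZIP": ["01001", "27265", "55114", "55316", "55333", "55744"],
    "State": ["MA", "NC", "MN", "MN", "MN", "MN"],
    "Tier": [1, 2, 4, 7, 5, 3],
})

print(df1)
print(df2)
```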

I would like to merge these dataframes and append the variable names from df2 to the ends of the corresponding ZIP/time point variables in df1 to end up with something like this (Note: I removed the ZIP 3 variable for simplicity; I'd still like to append the relevant State and Tier data, as shown for ZIP 1 and ZIP 2):

ID  ZIP 1  ZIP 2  ZIP 1 State  ZIP 2 State  ZIP 1 Tier  ZIP 2 Tier
1   55333  N/A    MN           N/A          5           N/A
2   55114  27265  MN           NC           4           2
3   55744  55744  MN           MN           3           3

The closest solution I have come up with is to create multiple "merged" dataframes by merging on each individual ZIP code variable in df1. This is obviously less than ideal, and does not resolve the variable naming issue either.

merged = pd.merge(df1, df2, left_on = 'ZIP 1', right_on = 'ZIP', how = 'left')
merged2 = pd.merge(df1, df2, left_on = 'ZIP 2', right_on = 'ZIP', how = 'left')
merged3 = pd.merge(df1, df2, left_on = 'ZIP 3', right_on = 'ZIP', how = 'left')

Any guidance would be much appreciated! :-)

Try something like this:

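# Long form: one row per (ID, ZIP); stack() drops the missing ZIPs here.
dfs = df1.set_index('ID').stack().rename('ZIP').reset_index().drop('level_1', axis=1)
# Attach State and Tier by joining on the shared 'ZIP' column.
dfm = dfs.merge(df2)
# Number each ID's ZIPs 1, 2, ... in order of appearance, then pivot back to wide form.
df_out = dfm.set_index(['ID', dfm.groupby('ID').cumcount() + 1]).unstack()
# Flatten the (name, number) column MultiIndex into 'ZIP 1', 'State 1', ...
df_out.columns = [f'{i} {j}' for i, j in df_out.columns]
print(df_out)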

Output:

    ZIP 1  ZIP 2  ZIP 3 State 1 State 2 State 3 Tier 1 Tier 2 Tier 3
ID                                                                  
1   55333  55316    NaN      MN      MN     NaN      5      7    NaN
2   55114  27265  27265      MN      NC      NC      4      2      2
3   55744  55744    NaN      MN      MN     NaN      3      3    NaN
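
One thing to watch in the output above: ID 1's ZIP 55316 was recorded at time point 3 in df1, but it lands under ZIP 2 because the positions are renumbered once the missing value is dropped. If the time point labels need to stay attached, and the columns should be named "ZIP 1 State", "ZIP 1 Tier" and so on as in the question, a per-column lookup is one alternative. The sketch below swaps the merge for Series.map; the helper name append_zip_info is made up for illustration, and it assumes ZIPs are strings, N/A is NaN, and each ZIP appears only once in df2.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data (ZIPs as strings, N/A as NaN).
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "ZIP 1": ["55333", "55114", "55744"],
    "ZIP 2": [np.nan, "27265", "55744"],
    "ZIP 3": ["55316", "27265", np.nan],
})
df2 = pd.DataFrame({
    "ZIP": ["01001", "27265", "55114", "55316", "55333", "55744"],
    "State": ["MA", "NC", "MN", "MN", "MN", "MN"],
    "Tier": [1, 2, 4, 7, 5, 3],
})


def append_zip_info(people, zip_info, key="ZIP"):
    """Hypothetical helper: add '<ZIP col> <value col>' columns to a copy of `people`."""
    out = people.copy()
    # Index the lookup table by ZIP; this assumes ZIP values are unique in zip_info.
    lookup = zip_info.set_index(key)
    zip_cols = [c for c in people.columns if c.startswith("ZIP")]
    for value_col in lookup.columns:      # e.g. State, then Tier
        for zip_col in zip_cols:          # e.g. ZIP 1, ZIP 2, ZIP 3
            # Missing or unknown ZIPs simply map to NaN.
            out[f"{zip_col} {value_col}"] = out[zip_col].map(lookup[value_col])
    return out


result = append_zip_info(df1, df2)
print(result)
```

Because each ZIP column is looked up in place, a missing ZIP 2 for ID 1 stays a missing "ZIP 2 State" and "ZIP 2 Tier" instead of shifting the ZIP 3 values left. Tier columns that contain a missing value will come back as floats.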
