How to get a value from a column in another dataframe (df2), based on ID and on a value that points to the name of the column in df2. Python/Pandas

So I have this df1:

ID    State
1      AA
2      AA
3      ZF
3      CJ

and df2:

ID    AA    ZF  CJ  etc
1     9      8  77
2     7      6   5
3     8     88   6

I have to create a new column in df1 that brings in the values from df2, like this:

ID    State    Value
1      AA       9
2      AA       7
3      ZF       88
3      CJ       6

I've been trying for 2 hours now and I can't seem to find a way to refer to the column names of df2 based on the values of df1['State']. Even if I could think of a way to do that, the value also has to be filtered by ID, which makes it tricky. Any help?

Thank you in advance.

You can melt() df2 on ID and merge() the result with df1:

df1 = df1.merge(df2.melt('ID', var_name='State', value_name='Value'))

#    ID State  Value
# 0   1    AA      9
# 1   2    AA      7
# 2   3    ZF     88
# 3   3    CJ      6
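One thing worth knowing about the merge above: with no arguments it is an inner join on the shared columns (ID and State), so any df1 row without a match in df2 is dropped. Here is a minimal sketch of a left merge that keeps such rows. The extra row with ID 4 is made up for illustration and is not part of the question's data.

```python
import pandas as pd

# df1 from the question, plus one hypothetical row (ID 4) that has no match in df2
df1 = pd.DataFrame({'ID': [1, 2, 3, 3, 4],
                    'State': ['AA', 'AA', 'ZF', 'CJ', 'AA']})
df2 = pd.DataFrame({'ID': [1, 2, 3],
                    'AA': [9, 7, 8],
                    'ZF': [8, 6, 88],
                    'CJ': [77, 5, 6]})

# Wide -> long: one row per (ID, State) pair
long_df2 = df2.melt('ID', var_name='State', value_name='Value')

# how='left' keeps every row of df1; unmatched rows get NaN in Value
result = df1.merge(long_df2, on=['ID', 'State'], how='left')
print(result)
```

Because of the NaN, the Value column becomes float in this variant.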

A slower and more brittle alternative: if you set df2's index to ID, you can use loc[] in an apply():

df2 = df2.set_index('ID')
df1['Value'] = df1.apply(lambda x: df2.loc[x.ID, x.State], axis=1)
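If the row-wise apply() turns out to be too slow, the same lookup can probably be done in one vectorized step with NumPy integer indexing. This is a sketch of a different technique, not part of the answer above, and it assumes the IDs in df2 are unique and that every ID and State in df1 exists in df2.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 3],
                    'State': ['AA', 'AA', 'ZF', 'CJ']})
df2 = pd.DataFrame({'ID': [1, 2, 3],
                    'AA': [9, 7, 8],
                    'ZF': [8, 6, 88],
                    'CJ': [77, 5, 6]})

wide = df2.set_index('ID')

# Translate labels into integer positions (-1 means "not found")
rows = wide.index.get_indexer(df1['ID'])
cols = wide.columns.get_indexer(df1['State'])

# Guard: a -1 would silently pick the last row/column in NumPy indexing
if (rows < 0).any() or (cols < 0).any():
    raise KeyError('some ID or State in df1 is missing from df2')

# Pick one cell per df1 row: element i is wide.iloc[rows[i], cols[i]]
df1['Value'] = wide.to_numpy()[rows, cols]
print(df1)
```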

Let's try something like:

import pandas as pd

df1 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3, 3: 3},
                    'State': {0: 'AA', 1: 'AA',
                              2: 'ZF', 3: 'CJ'}})
df2 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3},
                    'AA': {0: 9, 1: 7, 2: 8},
                    'ZF': {0: 8, 1: 6, 2: 88},
                    'CJ': {0: 77, 1: 5, 2: 6}})

merged = df1.merge(
    df2.set_index('ID')
        .stack()
        .reset_index()
        .rename(columns={'level_1': 'State', 0: 'Value'}),
    on=['ID', 'State']
)

print(merged.to_string(index=False))

merged:

ID State  Value
 1    AA      9
 2    AA      7
 3    ZF     88
 3    CJ      6

This uses stack() to put each value in df2 into its own row:

print(df2.set_index('ID')
        .stack()
        .reset_index()
        .rename(columns={'level_1': 'State', 0: 'Value'}))

Output:

   ID State  Value
0   1    AA      9
1   1    ZF      8
2   1    CJ     77
3   2    AA      7
4   2    ZF      6
5   2    CJ      5
6   3    AA      8
7   3    ZF     88
8   3    CJ      6

This long-format frame then merges easily with df1 on ID and State.

Here is an option using loc:

df1['Value'] = df2.set_index('ID').stack().loc[pd.MultiIndex.from_frame(df1)].to_numpy()
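Note that MultiIndex.from_frame(df1) builds the keys from every column of df1, so the loc lookup only works while df1 holds exactly ID and State. A hedged variant is sketched below: it selects the two key columns explicitly and swaps loc for reindex, which returns NaN for missing pairs instead of raising. The Note column and the row with ID 4 are hypothetical additions for illustration.

```python
import pandas as pd

# df1 with a hypothetical extra column and one hypothetical unmatched row (ID 4)
df1 = pd.DataFrame({'ID': [1, 2, 3, 3, 4],
                    'State': ['AA', 'AA', 'ZF', 'CJ', 'AA'],
                    'Note': ['a', 'b', 'c', 'd', 'e']})
df2 = pd.DataFrame({'ID': [1, 2, 3],
                    'AA': [9, 7, 8],
                    'ZF': [8, 6, 88],
                    'CJ': [77, 5, 6]})

# Series indexed by (ID, column name)
stacked = df2.set_index('ID').stack()

# Build lookup keys from the two key columns only
keys = pd.MultiIndex.from_frame(df1[['ID', 'State']])

# reindex yields NaN where an (ID, State) pair is absent from df2
df1['Value'] = stacked.reindex(keys).to_numpy()
print(df1)
```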

Since you want to map the columns of the second DataFrame to rows in the first DataFrame, you first need to transpose the second DataFrame. I also suggest dropping the 'ID' column for ease. Note that this attaches the values for every ID (Value1, Value2, Value3) to each State, rather than only the value for the row's own ID:

df2 = df2.drop(columns='ID').T.reset_index()  # move the state names from the index into a column
df2.columns = ['State', 'Value1', 'Value2', 'Value3']

final_df = pd.merge(df1, df2, on='State', how='left')

Technical posts on this site are licensed under CC BY-SA 4.0; if you republish this post, please credit this site's URL or the original source.
