How to get a value from a column in another dataframe (df2), based on ID and on a value that points to the name of the column in df2. Python/Pandas
So I have this df1:
ID State
1 AA
2 AA
3 ZF
3 CJ
and df2:
ID AA ZF CJ etc
1 9 8 77
2 7 6 5
3 8 88 6
I have to create a new column in df1 bringing in the values from df2, like this:
ID State Value
1 AA 9
2 AA 7
3 ZF 88
3 CJ 6
I've been trying for 2 hours now and I can't seem to find a way to refer to the column names on df2 based on the values of df1['State']. Even if I could think of a way to do that, the value is filtered by ID too... tricky stuff. Any help?
Thank you in advance
You can melt() df2 on ID and merge() the result with df1:
df1 = df1.merge(df2.melt('ID', var_name='State', value_name='Value'))
# ID State Value
# 0 1 AA 9
# 1 2 AA 7
# 2 3 ZF 88
# 3 3 CJ 6
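To make the intermediate step visible, here is a minimal sketch of the same melt-and-merge idea, split into two statements. The sample frames are rebuilt from the question; the names long_df2 and result, the explicit on=['ID', 'State'], and how='left' are my own additions rather than part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 3], 'State': ['AA', 'AA', 'ZF', 'CJ']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'AA': [9, 7, 8],
                    'ZF': [8, 6, 88], 'CJ': [77, 5, 6]})

# Reshape df2 from wide to long: one row per (ID, State) pair.
long_df2 = df2.melt(id_vars='ID', var_name='State', value_name='Value')

# Name the join keys explicitly. how='left' (an assumption about what you want)
# keeps every row of df1 and leaves NaN where df2 has no matching pair.
result = df1.merge(long_df2, on=['ID', 'State'], how='left')
print(result)
```

With the default inner merge, rows of df1 that have no match in df2 would be dropped instead, so pick whichever behaviour suits your data.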
This is slower and more brittle, but if you set df2's index to ID, you can use loc[] in an apply():
df2 = df2.set_index('ID')
df1['Value'] = df1.apply(lambda x: df2.loc[x.ID, x.State], axis=1)
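If the row-wise apply() turns out to be too slow, one possible alternative (not from the answer above) is to translate the IDs and states into integer positions with Index.get_indexer and pull the values out of the underlying NumPy array in one step. This is only a sketch on the question's sample data; the names wide, rows and cols are mine.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 3], 'State': ['AA', 'AA', 'ZF', 'CJ']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'AA': [9, 7, 8],
                    'ZF': [8, 6, 88], 'CJ': [77, 5, 6]})

wide = df2.set_index('ID')

# Positional row/column numbers for every row of df1.
rows = wide.index.get_indexer(df1['ID'])
cols = wide.columns.get_indexer(df1['State'])

# Caveat: get_indexer returns -1 for labels it cannot find, and -1 would
# silently index the last row/column, so check for that before indexing.
if (rows < 0).any() or (cols < 0).any():
    raise KeyError('df1 contains an ID or State that is missing from df2')

df1['Value'] = wide.to_numpy()[rows, cols]
print(df1)
```

This assumes the IDs in df2 are unique; with duplicate IDs get_indexer raises an error.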
Let's try something like:
import pandas as pd
df1 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3, 3: 3},
'State': {0: 'AA', 1: 'AA',
2: 'ZF', 3: 'CJ'}})
df2 = pd.DataFrame({'ID': {0: 1, 1: 2, 2: 3},
'AA': {0: 9, 1: 7, 2: 8},
'ZF': {0: 8, 1: 6, 2: 88},
'CJ': {0: 77, 1: 5, 2: 6}})
merged = df1.merge(
df2.set_index('ID')
.stack()
.reset_index()
.rename(columns={'level_1': 'State', 0: 'Value'}),
on=['ID', 'State']
)
print(merged.to_string(index=False))
merged:
ID State Value
1 AA 9
2 AA 7
3 ZF 88
3 CJ 6
This uses stack to get each value in df2 into its own row:
print(df2.set_index('ID')
.stack()
.reset_index()
.rename(columns={'level_1': 'State', 0: 'Value'}))
Output:
ID State Value
0 1 AA 9
1 1 ZF 8
2 1 CJ 77
3 2 AA 7
4 2 ZF 6
5 2 CJ 5
6 3 AA 8
7 3 ZF 88
8 3 CJ 6
Then this easily merges with df1.
Here is an option using loc:
df1['value'] = df2.set_index('ID').stack().loc[(pd.MultiIndex.from_frame(df1))].to_numpy()
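Note that the loc one-liner builds its lookup keys from all columns of df1, so it expects df1 to hold only ID and State, and it raises a KeyError if any pair is absent from df2. A hedged variation, assuming you would rather get NaN for missing pairs, is to select the two key columns explicitly and use reindex instead of loc. The frame df1_extra and the state 'XX' are made up purely for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 3], 'State': ['AA', 'AA', 'ZF', 'CJ']})
df2 = pd.DataFrame({'ID': [1, 2, 3], 'AA': [9, 7, 8],
                    'ZF': [8, 6, 88], 'CJ': [77, 5, 6]})

# Hypothetical extra row whose state ('XX') does not exist in df2.
df1_extra = pd.concat(
    [df1, pd.DataFrame({'ID': [1], 'State': ['XX']})],
    ignore_index=True,
)

# Series indexed by (ID, column name) pairs.
stacked = df2.set_index('ID').stack()

# Build the keys from the two key columns only, so other columns are ignored.
keys = pd.MultiIndex.from_frame(df1_extra[['ID', 'State']])

# reindex returns NaN for pairs that are not present instead of raising.
df1_extra['Value'] = stacked.reindex(keys).to_numpy()
print(df1_extra)
```

Because of the NaN, the new column comes back as float rather than int.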
Since you want to map the columns of the second DataFrame to a row in the first DataFrame, you need to first transpose the second DataFrame. I also suggest removing the 'ID' column for ease:
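df2 = df2.drop('ID', axis=1)
# After transposing, the state names are the index; reset_index() turns them into a column
df2 = df2.T.reset_index()
df2.columns = ['State', 'Value1', 'Value2', 'Value3']
# Note: this adds one column per row of the original df2, not a single Value column
final_df = pd.merge(df1, df2, on='State', how='left')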