Get column values of a DataFrame if column name matches row value of another DataFrame pandas
Pandas compare all values of a column with different DataFrame and return column name (of a dif. DataFrame) where value matches
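I am trying to assign a category to each row of a dataset, based on matching keywords from another dataset.
With the example below, every value in the new column would be non-sport (the name of the df_TWO column in which the value was found).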
df_ONE
Heroes The Punisher
Heroes The Punisher
Heroes Human Torch - 1
Heroes Man Thing
Heroes Medusa
Heroes Mr. Fantastic
Movies-TV Star Wars
Movies-TV Star Wars
df_TWO
sport non_sport gaming
0 baseball movies-tv pokemon
1 basketball music yugioh
2 football people magic
3 hockey history gaming
4 soccer heroes NaN
5 racing NaN NaN
6 boxing NaN NaN
7 golf NaN NaN
8 mma NaN NaN
9 multisport NaN NaN
10 tennis NaN NaN
11 wrestling NaN NaN
12 poker NaN NaN
It would be great to get this result:
Heroes The Punisher non-sport
Heroes The Punisher non-sport
Heroes Human Torch - 1 non-sport
Heroes Man Thing non-sport
Heroes Medusa non-sport
Heroes Mr. Fantastic non-sport
Movies-TV Star Wars non-sport
Movies-TV Star Wars non-sport
I tried to adapt the following solution, but had no luck,
turning it into something like this.
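You need to reshape your second DataFrame. You can do that easily with melt, which turns each column name into a value in a new column, giving one row per (column name, keyword) pair.
Here is a sample of the melted df: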
col_match genre
0 sport baseball
1 sport basketball
2 sport football
3 sport hockey
4 sport soccer
5 sport racing
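The reshaping step on its own can be sketched as follows. This is a minimal illustration on a shortened, made-up version of df_TWO, and the `dropna` call is an addition not present in the original answer: it simply discards the padding rows that `melt` produces from the NaN cells.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for df_TWO (assumption: fewer rows than the real data).
df2 = pd.DataFrame({
    'sport': ['baseball', 'basketball', 'football'],
    'non_sport': ['movies-tv', 'heroes', np.nan],
    'gaming': ['pokemon', np.nan, np.nan],
})

# With no id_vars, melt stacks every column: the old column name goes
# into 'col_match' and the cell value into 'genre'.
melted = df2.melt(var_name='col_match', value_name='genre')

# Drop the rows that came from NaN padding cells (optional extra step).
melted = melted.dropna(subset=['genre']).reset_index(drop=True)
print(melted)
```

Dropping the NaN rows is not required for the join to work, since no genre in the first DataFrame is NaN, but it keeps the lookup table compact.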
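You can then join the melted df to the original one on the genre column. Be sure to lowercase the genre column in the first df, because the keywords in the second df are all lowercase.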
import pandas as pd
import numpy as np

# First DataFrame: one row per item, with its genre and title.
df = pd.DataFrame({
    'genre': ['Heroes', 'Heroes', 'Heroes', 'Heroes', 'Heroes', 'Heroes', 'Movies-TV', 'Movies-TV'],
    'title': ['The Punisher', 'The Punisher', 'Human Torch - 1', 'Man Thing', 'Medusa', 'Mr. Fantastic', 'Star Wars', 'Star Wars']})

# Second DataFrame: each column is a category, each cell a keyword (NaN pads the shorter columns).
df2 = pd.DataFrame({
    'sport': ['baseball', 'basketball', 'football', 'hockey', 'soccer', 'racing', 'boxing', 'golf', 'mma', 'multisport', 'tennis', 'wrestling', 'poker'],
    'non_sport': ['movies-tv', 'music', 'people', 'history', 'heroes', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
    'gaming': ['pokemon', 'yugioh', 'magic', 'gaming', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

# Lowercase the genres so they match the lowercase keywords in df2.
df['genre'] = df['genre'].str.lower()

# Melt df2 into (col_match, genre) pairs, then inner-join on 'genre'.
df.merge(df2.melt(value_vars=df2.columns, var_name='col_match', value_name='genre'), on='genre')
Output
genre title col_match
0 heroes The Punisher non_sport
1 heroes The Punisher non_sport
2 heroes Human Torch - 1 non_sport
3 heroes Man Thing non_sport
4 heroes Medusa non_sport
5 heroes Mr. Fantastic non_sport
6 movies-tv Star Wars non_sport
7 movies-tv Star Wars non_sport
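Note that the inner merge drops any row whose genre has no keyword in the second DataFrame, and the lowercasing step overwrites the original casing. If that matters, one possible variant (a sketch, not part of the original answer) is to build a keyword-to-category lookup and apply it with `Series.map`. The 'Comics' row below is a made-up example of an unmatched genre, and the approach assumes each keyword appears in only one column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'genre': ['Heroes', 'Movies-TV', 'Comics'],  # 'Comics' is hypothetical, with no match
    'title': ['Medusa', 'Star Wars', 'Unknown Title'],
})
df2 = pd.DataFrame({
    'sport': ['baseball', 'hockey'],
    'non_sport': ['movies-tv', 'heroes'],
    'gaming': ['pokemon', np.nan],
})

# Build a keyword -> column-name lookup from the melted frame.
melted = df2.melt(var_name='col_match', value_name='genre').dropna(subset=['genre'])
lookup = melted.set_index('genre')['col_match']

# Map on a lowercased copy so the original casing in df['genre'] is kept.
# Genres without a keyword end up as NaN instead of being dropped.
df['col_match'] = df['genre'].str.lower().map(lookup)
print(df)
```

Here the two matched rows get non_sport and the unmatched row gets NaN, so every row of the first DataFrame is preserved.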