Pandas: compare all values of a column with a different DataFrame and return the column name (of the different DataFrame) where the value matches
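I am trying to assign a category to each row of a dataset based on matching keywords from another dataset.
Using the example below, every value in the new column would be non-sport (the name of the df_TWO column in which the value was found).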
df_ONE
Heroes The Punisher
Heroes The Punisher
Heroes Human Torch - 1
Heroes Man Thing
Heroes Medusa
Heroes Mr. Fantastic
Movies-TV Star Wars
Movies-TV Star Wars
df_TWO
sport non_sport gaming
0 baseball movies-tv pokemon
1 basketball music yugioh
2 football people magic
3 hockey history gaming
4 soccer heroes NaN
5 racing NaN NaN
6 boxing NaN NaN
7 golf NaN NaN
8 mma NaN NaN
9 multisport NaN NaN
10 tennis NaN NaN
11 wrestling NaN NaN
12 poker NaN NaN
It would be nice to get this result:
Heroes The Punisher non-sport
Heroes The Punisher non-sport
Heroes Human Torch - 1 non-sport
Heroes Man Thing non-sport
Heroes Medusa non-sport
Heroes Mr. Fantastic non-sport
Movies-TV Star Wars non-sport
Movies-TV Star Wars non-sport
I tried to adapt the following solution into something similar, but had no luck.
You need to reshape your second DataFrame. You can do this easily with melt.
Here is a sample of the melted df:
col_match genre
0 sport baseball
1 sport basketball
2 sport football
3 sport hockey
4 sport soccer
5 sport racing
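To make the reshaping step concrete, here is a minimal sketch of melt on its own. It uses a shortened stand-in for df_TWO (an assumption for brevity, not the full data) and drops the NaN padding rows, which is optional:

```python
import numpy as np
import pandas as pd

# Shortened stand-in for df_TWO (illustrative subset, not the full table)
df2 = pd.DataFrame({
    'sport': ['baseball', 'basketball', 'football'],
    'non_sport': ['movies-tv', 'heroes', np.nan],
})

# melt turns each column name into a value of 'col_match'
# and each cell into a value of 'genre' (wide -> long)
melted = df2.melt(var_name='col_match', value_name='genre')

# Optional: drop the NaN padding rows so only real keywords remain
melted = melted.dropna(subset=['genre']).reset_index(drop=True)
print(melted)
```

Each keyword now sits on its own row next to the name of the column it came from, which is the shape needed for a join.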
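You can then join the melted df to the original one on the genre column. Be sure to lowercase the genre column in your first df so that it matches the keywords.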
import pandas as pd
import numpy as np
# First DataFrame: one row per item, with its genre
df = pd.DataFrame({
'genre': ['Heroes', 'Heroes', 'Heroes', 'Heroes', 'Heroes', 'Heroes', 'Movies-TV', 'Movies-TV'],
'title': ['The Punisher', 'The Punisher', 'Human Torch - 1', 'Man Thing', 'Medusa', 'Mr. Fantastic', 'Star Wars', 'Star Wars']})
# Second DataFrame: each column is a category listing the keywords that belong to it
df2 = pd.DataFrame({
'sport': ['baseball', 'basketball', 'football', 'hockey', 'soccer', 'racing', 'boxing', 'golf', 'mma', 'multisport', 'tennis', 'wrestling', 'poker'],
'non_sport': ['movies-tv', 'music', 'people', 'history', 'heroes', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'gaming': ['pokemon', 'yugioh', 'magic', 'gaming', np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})
# Lowercase the genres so they match the keywords in df2
df['genre'] = df['genre'].str.lower()
# Melt df2 to long format (col_match, genre), then join on genre
result = df.merge(df2.melt(value_vars=df2.columns, var_name='col_match', value_name='genre'), on='genre')
print(result)
Output
genre title col_match
0 heroes The Punisher non_sport
1 heroes The Punisher non_sport
2 heroes Human Torch - 1 non_sport
3 heroes Man Thing non_sport
4 heroes Medusa non_sport
5 heroes Mr. Fantastic non_sport
6 movies-tv Star Wars non_sport
7 movies-tv Star Wars non_sport
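Note that the merge above is an inner join, so rows whose genre appears nowhere in df2 would be dropped, and the genre column is overwritten with its lowercase form. If that is not wanted, one possible variant (a sketch, not part of the original answer) is to build a keyword-to-category dictionary from the melted frame and use Series.map. The 'Comics' row and the shortened df2 are made up for illustration, and the sketch assumes each keyword appears in only one column:

```python
import pandas as pd

# Illustrative data; 'Comics' is a made-up genre with no match in df2
df = pd.DataFrame({
    'genre': ['Heroes', 'Movies-TV', 'Comics'],
    'title': ['Medusa', 'Star Wars', 'Example'],
})
df2 = pd.DataFrame({
    'sport': ['baseball', 'hockey'],
    'non_sport': ['movies-tv', 'heroes'],
})

# Build {keyword: column name}; if a keyword were repeated across
# columns, the last occurrence would win
lookup = (
    df2.melt(var_name='col_match', value_name='genre')
       .dropna(subset=['genre'])
       .set_index('genre')['col_match']
       .to_dict()
)

# Match on a lowercased copy so the original 'genre' casing is kept;
# genres without a match become NaN instead of being dropped
df['col_match'] = df['genre'].str.lower().map(lookup)
print(df)
```

This keeps every row of df in its original order, which can make it easier to spot genres that are missing from the keyword table.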