Fill NaN values from selected columns of another dataframe

I have df1 like this:

       id        name  level personality      type  weakness    atk    def     hp  stage
0    53.0     Persian   40.0        mild    normal  fighting  104.0  116.0    NaN    2.0
1   126.0      Magmar   44.0      docile       NaN     water   96.0   83.0  153.0    1.0
2    57.0    Primeape    9.0      lonely  fighting    flying    NaN   66.0   43.0    2.0
3     3.0    Venusaur   44.0       sassy     grass      fire  136.0  195.0   92.0    3.0
4    11.0     Metapod    4.0       naive     grass      fire    NaN  114.0    NaN    2.0
5   126.0      Magmar   96.0      modest      fire     water   62.0  114.0    NaN    1.0
6   137.0     Porygon   96.0     relaxed       NaN  fighting   68.0   50.0  127.0    1.0
7    69.0  Bellsprout   84.0      lonely     grass      fire    NaN    NaN    NaN    1.0
8    10.0    Caterpie    3.0     serious       NaN    flying    NaN    NaN   15.0    1.0
9    12.0  Butterfree   12.0       hasty       NaN    flying   20.0    NaN    NaN    3.0
10   35.0    Clefairy   18.0      impish     fairy    poison   33.0    NaN    NaN    1.0
11   59.0    Arcanine   35.0      gentle      fire     water   45.0   60.0   80.0    2.0
12  111.0     Rhyhorn   31.0     naughty      rock     water   40.0    NaN  175.0    1.0
13  136.0     Flareon   75.0        bold       NaN     water    NaN  143.0    NaN    2.0
14   51.0     Dugtrio   82.0      gentle    ground     water  152.0  161.0  168.0    2.0
15   38.0   Ninetales    5.0       brave      fire     water    NaN  179.0  173.0    2.0
16  102.0   Exeggcute   88.0        rash       NaN      fire    NaN  124.0    NaN    1.0 
........

and df2 as:

    weakness      type  count
3       fire     grass     11
10     water      fire      9
0   fighting    normal      6
4     flying  fighting      3
8     poison     fairy      3
6      grass     water      1
9       rock      fire      1
7     ground  electric      1

I want to fill the NaN values in the type column of df1 using df2, matching on the weakness column that both dataframes share. For example, in rows 8 and 9 of df1 the type value is NaN. I want to fill those by looking up each row's weakness in df2, so the type for rows 8 and 9 should be 'fighting', and so on. This is something like a one-to-many relationship between df2 and df1.

I tried:

df1.update(df2)

and

df1.fillna(df2)

But neither gave the desired output. Any help will be appreciated.
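A likely reason the two attempts fall short: when `fillna` (or `update`) is given another DataFrame, pandas lines the two frames up by index and column labels, not by the values in weakness. The sketch below uses small made-up frames (`left`, `right` and their contents are hypothetical, not the question's data) to show the difference between filling by label and filling by key.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frames, not the data from the question
left = pd.DataFrame({
    "weakness": ["flying", "water", "fire"],
    "type": [np.nan, np.nan, "grass"],
})
right = pd.DataFrame({
    "weakness": ["water", "flying"],
    "type": ["fire", "fighting"],
})

# Passing a DataFrame to fillna aligns on index and column labels,
# so row 0 of `left` is filled from row 0 of `right`, whatever its weakness is
by_label = left.fillna(right)

# Looking up each row's weakness instead fills by key
by_key = left["type"].fillna(left["weakness"].map(right.set_index("weakness")["type"]))

print(by_label)
print(by_key)
```

Here the row whose weakness is "flying" gets the type listed next to "water" in `right` when filling by label, which is why a lookup on weakness is needed.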

  1. Create a Series from df2 that maps weakness values to type values:

    mapping = df2.set_index("weakness")["type"]

  2. Map df1["weakness"] through this mapping to create the default values:

    defaults = df1["weakness"].map(mapping)

  3. Pass the defaults as the argument to the fillna method:

    df1["type"] = df1["type"].fillna(defaults)
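The three steps can be tried end to end on a cut-down copy of the data. This is a minimal sketch: the four rows of `df1` and three rows of `df2` below are a reduced, illustrative subset, and the remaining columns are left out.

```python
import numpy as np
import pandas as pd

# Reduced, illustrative versions of df1 and df2
df1 = pd.DataFrame({
    "name": ["Caterpie", "Butterfree", "Arcanine", "Flareon"],
    "type": [np.nan, np.nan, "fire", np.nan],
    "weakness": ["flying", "flying", "water", "water"],
})
df2 = pd.DataFrame({
    "weakness": ["fire", "water", "flying"],
    "type": ["grass", "fire", "fighting"],
    "count": [11, 9, 3],
})

# Step 1: Series indexed by weakness, holding the matching type
mapping = df2.set_index("weakness")["type"]
# Step 2: look up a default type for every row of df1
defaults = df1["weakness"].map(mapping)
# Step 3: only the NaN entries of type are replaced by the defaults
df1["type"] = df1["type"].fillna(defaults)

print(df1)
```

Existing type values are left alone because `fillna` only touches missing entries.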

You can create a dictionary from df2, with the weakness column as keys and the type column as the corresponding values, and then use that dictionary with map to fillna the type column in df1:

m = dict(zip(df2.weakness,df2.type))
df1.type = df1.type.fillna(df1.weakness.map(m))

Prints:

>>> df1[['weakness','type']]

    weakness      type
0   fighting    normal
1      water      fire
2     flying  fighting
3       fire     grass
4       fire     grass
5      water      fire
6   fighting    normal
7       fire     grass
8     flying  fighting
9     flying  fighting
10    poison     fairy
11     water      fire
12     water      rock
13     water      fire
14     water    ground
15     water      fire
16      fire     grass
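One thing to watch with the dictionary approach: a weakness that has no key in the dictionary maps to NaN, so type stays missing for that row. The sketch below shows one way to list such leftovers afterwards. The frame and the "psychic" value are hypothetical and only there to trigger the case.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "psychic" has no entry in the lookup dictionary
df1 = pd.DataFrame({
    "weakness": ["water", "psychic", "fire"],
    "type": [np.nan, np.nan, "grass"],
})
m = {"water": "fire", "flying": "fighting"}

# Unmatched weakness values map to NaN, so fillna has nothing to fill with
filled = df1["type"].fillna(df1["weakness"].map(m))

# Weakness values whose type is still missing after the fill
unmatched = df1.loc[filled.isna(), "weakness"].unique()
print(unmatched)
```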

Code documented inline:

import pandas as pd

# Merge both dataframes using "weakness" as the key
df = pd.merge(df1, df2[['weakness', 'type']],
              on="weakness", suffixes=("", "_y"), how="left")
# Fill NaNs in type from the merged type_y column (assigned back rather than chained inplace)
df['type'] = df['type'].fillna(df['type_y'])
# Drop the helper column left over from the merge; drop returns a new frame
df = df.drop(columns=['type_y'])
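The merge approach relies on each weakness appearing only once in df2, which holds for the df2 shown in the question. If a weakness were repeated, a left merge would duplicate rows of df1. A minimal sketch of one way to guard against that, assuming the row with the highest count should win (that rule and the sample rows are assumptions, not part of the question):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "weakness": ["water", "water", "fire"],
    "type": [np.nan, "rock", np.nan],
})
# Hypothetical lookup table in which "water" appears twice
df2 = pd.DataFrame({
    "weakness": ["water", "water", "fire"],
    "type": ["fire", "rock", "grass"],
    "count": [9, 2, 11],
})

# Keep one row per weakness: the one with the highest count (assumed rule)
lookup = (
    df2.sort_values("count", ascending=False)
       .drop_duplicates("weakness")[["weakness", "type"]]
)

# validate="many_to_one" makes merge raise if the lookup keys are not unique
merged = pd.merge(df1, lookup, on="weakness", how="left",
                  suffixes=("", "_y"), validate="many_to_one")
merged["type"] = merged["type"].fillna(merged["type_y"])
merged = merged.drop(columns="type_y")

print(merged)
```

With `validate` in place, a lookup table with repeated keys fails loudly instead of silently growing the result.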
