
Fill empty values in a dataframe based on columns in another dataframe

I have a dataframe df1 like this.

[image: df1]

I want to fill the NaN and 0 values in column score with the values from another dataframe df2, matched by name.

[image: df2]

How could I do this?

Option 1
Short version

df1.score = df1.score.mask(df1.score.eq(0)).fillna(
    df1.name.map(df2.set_index('name').score)
)
df1

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0
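
For reference, here is a self-contained sketch of the same approach, split into steps and using the sample frames defined in the pd.Series.replace answer further down. The intermediate names lookup and score_with_gaps are only there for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# Series indexed by name: A -> 10, B -> 20, C -> 30
lookup = df2.set_index('name')['score']

# Turn zeros into NaN so that a single fillna covers both cases
score_with_gaps = df1['score'].mask(df1['score'].eq(0))

# map() lines the lookup values up with df1's rows;
# fillna() only takes them where the score is missing
df1['score'] = score_with_gaps.fillna(df1['name'].map(lookup))

print(df1)
```

A name that does not appear in df2 maps to NaN, so its score simply stays missing.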

Option 2
Interesting version using searchsorted. df2 must be sorted by 'name'.

i = np.where(np.isnan(df1.score.mask(df1.score.values == 0).values))[0]
j = df2.name.values.searchsorted(df1.name.values[i])
df1.score.values[i] = df2.score.values[j]
df1

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0
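
Two caveats about the searchsorted snippet. It writes through df1.score.values, which only works while that array is a writable view of the column; under pandas' copy-on-write mode the array is read-only. It also assumes every name in df1 exists in df2, because searchsorted returns an insertion position even for a key that is absent. A more defensive sketch of the same idea, with the helper names keys, vals and found chosen for illustration:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# searchsorted needs sorted keys, so sort explicitly instead of assuming it
df2_sorted = df2.sort_values('name')
keys = df2_sorted['name'].to_numpy()
vals = df2_sorted['score'].to_numpy()

# Work on a private copy of the scores rather than on a view of the column
score = df1['score'].to_numpy(dtype=float, copy=True)
names = df1['name'].to_numpy()

# Positions where the score is NaN or 0
i = np.flatnonzero(np.isnan(score) | (score == 0))

# Candidate positions in the sorted keys; clip so indexing stays in range
j = np.clip(keys.searchsorted(names[i]), 0, len(keys) - 1)

# Only accept positions whose key really equals the name that was looked up
found = keys[j] == names[i]
score[i[found]] = vals[j[found]]

df1['score'] = score
print(df1)
```

Rows whose name is missing from df2 keep their original NaN or 0 instead of silently receiving a neighbouring name's score.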

If df1 and df2 are your dataframes, you can create a mapping and then call pd.Series.replace:

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

print(df1)

  name  score
0    A    0.0
1    B   32.0
2    A    0.0
3    C    NaN
4    B    NaN
5    A   45.0
6    A    NaN
7    A    NaN

print(df2) 

  name  score
0    A     10
1    B     20
2    C     30

mapping = dict(df2.values)

# rows whose score is missing or 0
mask = df1.score.isnull() | (df1.score == 0)

df1.loc[mask, 'score'] = df1.loc[mask, 'name'].replace(mapping)

print(df1)

  name  score
0    A   10.0
1    B   32.0
2    A   10.0
3    C   30.0
4    B   20.0
5    A   45.0
6    A   10.0
7    A   10.0

Or using merge and fillna:

import pandas as pd
import numpy as np

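# treat 0 as missing (the original line referenced an undefined name df)
df1.loc[df1.score == 0, 'score'] = np.nan
# bfill(axis=1) is the current spelling of fillna(method='bfill', axis=1)
df1.merge(df2, on='name', how='left').bfill(axis=1)[['name', 'score_x']]\
    .rename(columns={'score_x': 'score'})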
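
The same merge idea can also be written without the row-wise backfill, by filling score directly from the column that the merge brings in from df2. A minimal sketch, where the suffix '_df2' and the names merged and result are arbitrary choices:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'name': ['A', 'B', 'A', 'C', 'B', 'A', 'A', 'A'],
                    'score': [0, 32, 0, np.nan, np.nan, 45, np.nan, np.nan]})
df2 = pd.DataFrame({'name': ['A', 'B', 'C'], 'score': [10, 20, 30]})

# Treat 0 as missing, as in the snippet above
df1.loc[df1['score'] == 0, 'score'] = np.nan

# A left merge keeps every row of df1, in df1's order, and adds df2's score
merged = df1.merge(df2, on='name', how='left', suffixes=('', '_df2'))

# Fill the gaps from the df2 column, then drop the helper column
merged['score'] = merged['score'].fillna(merged['score_df2'])
result = merged[['name', 'score']]

print(result)
```

This keeps the fill restricted to the score column, so a missing value in any other column would not be overwritten by whatever happens to sit to its right.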

This method changes the order (the result will be sorted by name).

df1.set_index('name').replace(0, np.nan).combine_first(df2.set_index('name')).reset_index()

  name  score
0    A     10
1    A     10
2    A     45
3    A     10
4    A     10
5    B     32
6    B     20
7    C     30


