
Replace blank value in dataframe based on another column condition

I have many blanks in a merged data set and I want to fill them with a condition.

My current code looks like this

import pandas as pd
import csv
import numpy as np
pd.set_option('display.max_columns', 500)
# Read all files into pandas dataframes
Jan = pd.read_csv(r'C:\~\Documents\Jan.csv')
Feb = pd.read_csv(r'C:\~\Documents\Feb.csv')
Mar = pd.read_csv(r'C:\~\Documents\Mar.csv')

Jan=pd.DataFrame({'Department':['52','5','56','70','7'],'Item':['2515','254','818','','']})
Feb=pd.DataFrame({'Department':['52','56','765','7','40'],'Item':['2515','818','524','','']})
Mar=pd.DataFrame({'Department':['7','70','5','8','52'],'Item':['45','','818','','']})

all_df_list = [Jan, Feb, Mar]
appended_df = pd.concat(all_df_list)
df = appended_df
df.to_csv(r"C:\~\Documents\SallesDS.csv", index=False)

Data set:

df
Department     Item
52             2515
5              254
56             818
70
7              50
52             2515
56             818
765            524
7
40
7              45
70
5              818
8
52

What I want is to fill the empty cells in Item with the corresponding value for that Department.

So if Department is 52 and Item is empty, it should be filled with 2515; if Department is 7 and Item is empty, it should be filled with 45. The result should look like this:

df
Department     Item
52             2515
5              254
56             818
70
7              50
52             2515
56             818
765            524
7              45
40
7              45
70
5              818
8
52             2515

I tried the following methods, but none of them worked.

1

df.loc[(df['Item'].isna()) & (df['Department'].str.contains(52)), 'Item'] = 2515
df.loc[(df['Item'].isna()) & (df['Department'].str.contains(7)), 'Item'] = 45

2

df["Item"] = df["Item"].fillna(df["Department"])
df = df.replace({"Item":{"52":"2515", "7":"45"}})

Both attempts either raise an error or have no effect.


Answer:

Hi, I used the code below and it worked:

b = [52]
df.Item=np.where(df.Department.isin(b),df.Item.fillna(2515),df.Item)
a = [7]
df.Item=np.where(df.Department.isin(a),df.Item.fillna(45),df.Item)

Hope it helps someone who faces the same issue.
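The two hard-coded steps above could be generalized with a lookup dictionary. The following is only a sketch under stated assumptions: the name `fill_map` and the sample frame are hypothetical, blanks are assumed to be NaN (as they are after `pd.read_csv`), and Department is assumed to have been parsed as integers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: blanks are NaN, as pd.read_csv would produce them
df = pd.DataFrame({
    "Department": [52, 5, 7, 40, 52],
    "Item": [2515, 254, np.nan, np.nan, np.nan],
})

# Hypothetical lookup: Department -> Item to use when Item is missing
fill_map = {52: 2515, 7: 45}

# Series.map gives NaN for departments not in fill_map, so those stay blank
df["Item"] = df["Item"].fillna(df["Department"].map(fill_map))
print(df)
```

With the concatenated frame from the question, the index labels repeat (0 to 4, three times), so it is probably safer to build it with `pd.concat(all_df_list, ignore_index=True)` before filling.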

The following solution first creates a map from each department to its maximum corresponding item (assuming there is one), and then assigns that item wherever the department has a blank item. Note that in your data frame, the empty items are empty strings ("") and not NaN.
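To illustrate the empty-string point, here is a small sketch (the sample frame is made up for illustration): `isna()` does not flag `""`, so a `fillna` call leaves those cells untouched until they are converted to NaN.

```python
import numpy as np
import pandas as pd

# Made-up sample mirroring the question's frames: blanks are "" strings
df = pd.DataFrame({"Department": ["52", "7", "52"], "Item": ["2515", "", ""]})

# Empty strings are ordinary values, so isna() reports no missing cells
n_missing_before = int(df["Item"].isna().sum())

# Convert "" to NaN so that isna()/fillna() treat the cells as missing
df["Item"] = df["Item"].replace("", np.nan)
n_missing_after = int(df["Item"].isna().sum())

print(n_missing_before, n_missing_after)
```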

Create a map:

values = df.groupby('Department').max()
values['Item'] = values['Item'].apply(lambda x: np.nan if x == "" else x)
values = values.dropna().reset_index()

  Department  Item
0          5   818
1         52  2515
2         56   818
3          7    45
4        765   524

Then use df.apply():

df['Item'] = df.apply(lambda x: values[values['Department'] == x['Department']]['Item'].values if x['Item'] == "" else x['Item'], axis=1)

Because .values returns a NumPy array, the filled-in values will have brackets (and quotes) around them once converted to strings. They can be removed with str.replace():

df['Item'] = df['Item'].astype(str).str.replace(r"[\[\]']", "", regex=True)

The result:

  Department  Item
0         52  2515
1          5   254
2         56   818
3         70
4          7    45
0         52  2515
1         56   818
2        765   524
3          7    45
4         40
0          7    45
1         70
2          5   818
3          8
4         52  2515
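As an alternative to `df.apply()` and the bracket clean-up, the same map-and-fill idea can be sketched with `Series.map`. This is not the answer's code, only a hedged variant: the sample frame and the names `known`, `lookup` and `blank` are illustrative, and, like the answer, it takes the maximum (string-wise) non-empty Item per Department.

```python
import numpy as np
import pandas as pd

# Illustrative sample in the question's format: strings, with "" for blanks
df = pd.DataFrame({
    "Department": ["52", "5", "7", "70", "7", "52"],
    "Item": ["2515", "254", "", "", "45", ""],
})

# Treat "" as missing and keep only the rows whose Item is known
items = df["Item"].replace("", np.nan)
known = df.assign(Item=items).dropna(subset=["Item"])

# One Item per Department (max of strings is lexicographic)
lookup = known.groupby("Department")["Item"].max()

# Fill only the blank cells; departments without any known Item stay ""
blank = items.isna()
df.loc[blank, "Item"] = df.loc[blank, "Department"].map(lookup).fillna("")
print(df)
```

Since the values are mapped as scalars, no brackets appear and no regex clean-up is needed. With the concatenated frame from the question, resetting the index first (for example `pd.concat(all_df_list, ignore_index=True)`) avoids problems with the repeated index labels.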


