
How do I replace a set of values in a DataFrame column with values from another DataFrame?

I have two dataframes: one with some missing values, and another with the values that should replace them. The second dataframe is therefore shorter than the first.

The missing values in the first dataframe are marked by either "Height Info Not Found" or "Player Info Not Found".

Is there a way to replace the missing values in the first dataframe with the corresponding values from the 2nd dataframe without looping?

I tried using .map(), but the values that are not replaced come back as NaN.

filled_df['height']= filled_df['height'].astype(str) #dataframe with real values
main_df['height']= main_df['height'].astype(str) #dataframe with missing values 

mapping = dict(filled_df[['name','height']].values)
main_df['height'] = main_df['url_names'].map(mapping,na_action='ignore')
print(main_df)

                   name              url_names                 height
0            John Mcenroe           John_Mcenroe  Height Info Not Found
1           Jimmy Connors          Jimmy_Connors  Player Info Not Found
2              Ivan Lendl             Ivan_Lendl     1.88 m (6 ft 2 in)
3           Mats Wilander          Mats_Wilander     1.83 m (6 ft 0 in)
4            Andres Gomez           Andres_Gomez     1.93 m (6 ft 4 in)
5           Anders Jarryd          Anders_Jarryd    1.80 m (5 ft 11 in)
6        Henrik Sundstrom       Henrik_Sundstrom     1.88 m (6 ft 2 in)
7                Pat Cash               Pat_Cash  Height Info Not Found
8         Eliot Teltscher        Eliot_Teltscher     1.75 m (5 ft 9 in)
9            Yannick Noah           Yannick_Noah     1.93 m (6 ft 4 in)
10         Joakim Nystrom         Joakim_Nystrom     1.87 m (6 ft 2 in)
11       Aaron Krickstein       Aaron_Krickstein     6 ft 2 in (1.88 m)
12            Johan Kriek            Johan_Kriek     1.75 m (5 ft 9 in)


                   name height
0          John_Mcenroe   1.80
1         Jimmy_Connors   1.78
2              Pat_Cash    183
3           Jimmy_Arias    175
4         Juan_Aguilera   1.82
5         Henri_Leconte   1.84
6        Balazs_Taroczy   1.82
7    Sammy_Giammalva_Jr   1.78
8       Thierry_Tulasne   1.77

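I think you need to replace only the missing values with the matched values from the dictionary. The placeholders are strings rather than NaN, so mask them first; otherwise fillna has nothing to fill:

placeholders = ['Height Info Not Found', 'Player Info Not Found']
main_df['height'] = main_df['height'].mask(main_df['height'].isin(placeholders)).fillna(main_df['url_names'].map(mapping))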
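
As a self-contained sketch of this idea, the snippet below rebuilds a few rows from the question and fills only the placeholder entries. The variable names `placeholders`, `is_missing`, and `mapped` are introduced here for illustration, and the approach assumes the placeholders are exactly the two strings quoted in the question.

```python
import pandas as pd

# Small stand-ins for the two frames in the question (values copied from the post)
main_df = pd.DataFrame({
    'url_names': ['John_Mcenroe', 'Jimmy_Connors', 'Ivan_Lendl', 'Pat_Cash'],
    'height': ['Height Info Not Found', 'Player Info Not Found',
               '1.88 m (6 ft 2 in)', 'Height Info Not Found'],
})
filled_df = pd.DataFrame({
    'name': ['John_Mcenroe', 'Jimmy_Connors', 'Pat_Cash', 'Jimmy_Arias'],
    'height': ['1.80', '1.78', '183', '175'],
})

# name -> height lookup; names that are not in it map to NaN
mapping = dict(zip(filled_df['name'], filled_df['height']))
mapped = main_df['url_names'].map(mapping)

# Turn the placeholder strings into NaN, then fill only those gaps
placeholders = ['Height Info Not Found', 'Player Info Not Found']
is_missing = main_df['height'].isin(placeholders)
main_df['height'] = main_df['height'].mask(is_missing).fillna(mapped)

print(main_df)
```

Rows that already hold a real height are left untouched, because only the masked entries are NaN when `fillna` runs. A placeholder row whose name is not in `mapping` ends up as NaN.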

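This code does the job by splitting the rows, merging the real heights into the missing ones, and concatenating the two parts:

import pandas as pd

# Frame with placeholder strings where the height is missing
d = {'url_names': ['John_Mcenroe', 'Jimmy_Connors', 'Ivan_Lendl'], 'height': ['Height Info Not Found', 'Player Info Not Found', '1.88 m (6 ft 2 in)']}
main_df = pd.DataFrame(d)

# Frame with the real heights for those rows
d = {'url_names': ['John_Mcenroe', 'Jimmy_Connors'], 'height': ['1.80', '1.78']}
filled_df = pd.DataFrame(d)

# Rows with a placeholder: drop the old height and merge the real one in by name
is_missing = main_df['height'].isin(['Height Info Not Found', 'Player Info Not Found'])
df1 = main_df[is_missing].drop(columns='height').merge(filled_df, on='url_names')
# Rows that already have a height are kept unchanged
df2 = main_df[~is_missing]
result = pd.concat([df1, df2], ignore_index=True)
print(result)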
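
The split-and-concat approach moves the filled rows to the top, and its inner merge drops any placeholder row that has no match. If that matters, one possible variant is a single left merge, sketched below. The `_filled` suffix and the names `replace_here` and `result` are choices made for this example, and Pat_Cash is deliberately left out of the lookup frame here to show an unmatched row.

```python
import pandas as pd

main_df = pd.DataFrame({
    'url_names': ['Ivan_Lendl', 'John_Mcenroe', 'Pat_Cash'],
    'height': ['1.88 m (6 ft 2 in)', 'Height Info Not Found', 'Height Info Not Found'],
})
filled_df = pd.DataFrame({
    'url_names': ['John_Mcenroe'],
    'height': ['1.80'],
})

placeholders = ['Height Info Not Found', 'Player Info Not Found']

# A left merge keeps every row of main_df in its original order;
# the looked-up value lands in 'height_filled' (NaN when there is no match)
merged = main_df.merge(filled_df, on='url_names', how='left', suffixes=('', '_filled'))

# Replace only where the height is a placeholder and a replacement exists
replace_here = merged['height'].isin(placeholders) & merged['height_filled'].notna()
merged.loc[replace_here, 'height'] = merged.loc[replace_here, 'height_filled']

result = merged.drop(columns='height_filled')
print(result)
```

This assumes each name appears at most once in `filled_df`; duplicate names there would duplicate rows in the merge.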
