Pandas: How to map the values of a Dataframe to another Dataframe?

I am totally new to Python and am learning through some use cases I have.

I have two DataFrames. The main one has an empty 'Country' column that I need to fill in. The other one has a 'Countries' column whose values should be mapped into the main DataFrame by matching its 'Data' column against the text in 'Name Data'. (My apologies if this question has already been answered.)

Below is the Main DataFrame:

Name Data                     | Country
----------------------------- | ---------
Arjun Kumar Reddy las Vegas   |
Divya london Khosla           |
new delhi Pragati Kumari      |
Will London Turner            |
Joseph Mascurenus Bombay      |
Jason New York Bourne         |
New york Vice Roy             |
Joseph Mascurenus new York    |
Peter Parker California       |
Bruce (istanbul) Wayne        |

Below is the Referenced DataFrame:

Data           | Countries
-------------- | ---------
las Vegas      | US
london         | UK
New Delhi      | IN
London         | UK
bombay         | IN
New York       | US
New york       | US
new York       | US
California     | US
istanbul       | TR
Moscow         | RS
Cape Town      | SA

The result I want looks like this:

Name Data                     | Country
----------------------------- | ---------
Arjun Kumar Reddy las Vegas   | US
Divya london Khosla           | UK
new delhi Pragati Kumari      | IN
Will London Turner            | UK
Joseph Mascurenus Bombay      | IN
Jason New York Bourne         | US
New york Vice Roy             | US
Joseph Mascurenus new York    | US
Peter Parker California       | US
Bruce (istanbul) Wayne        | TR

Please note that the two DataFrames are not the same size. I thought of using map or FuzzyWuzzy, but couldn't get the result I wanted.

Build a case-insensitive regex from the 'Data' values of the reference DataFrame, then extract the matching place name from each row as a key.

import re  # needed for the re.I flag below
regex = '(' + ')|('.join(ref_df['Data']) + ')'
df['key'] = df['Name Data'].str.extract(regex, flags=re.I).bfill(axis=1)[0]

>>> df
                     Name Data        key
0  Arjun Kumar Reddy las Vegas  las Vegas
1       Bruce (istanbul) Wayne   istanbul
2   Joseph Mascurenus new York   new York


>>> ref_df
        Data Country
0  las Vegas      US
1   new York      US
2   istanbul      TR

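Merge the two DataFrames on the extracted key. Note that the merge is an exact, case-sensitive match, so an extracted key such as 'Bombay' will not match the reference entry 'bombay' unless both sides are normalised (for example, lowercased) first.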

>>> pd.merge(df, ref_df, left_on='key', right_on='Data')
                     Name Data        key       Data Country
0  Arjun Kumar Reddy las Vegas  las Vegas  las Vegas      US
1       Bruce (istanbul) Wayne   istanbul   istanbul      TR
2   Joseph Mascurenus new York   new York   new York      US
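
The extract-then-look-up idea can also be sketched end to end in a way that tolerates case differences between the two frames. This is only a minimal sketch, not the answerer's exact code: the sample rows are a subset of the question's data, the names `lookup`, `names`, `pattern` and `key` are made up for illustration, and it assumes that lowercasing both sides is an acceptable way to normalise the place names.

```python
import re

import pandas as pd

# Sample rows taken from the question (subset only).
df = pd.DataFrame({"Name Data": [
    "Arjun Kumar Reddy las Vegas",
    "Divya london Khosla",
    "new delhi Pragati Kumari",
    "Joseph Mascurenus Bombay",
    "Bruce (istanbul) Wayne",
]})
ref_df = pd.DataFrame({
    "Data": ["las Vegas", "london", "New Delhi", "London", "bombay", "istanbul", "Moscow"],
    "Countries": ["US", "UK", "IN", "UK", "IN", "TR", "RS"],
})

# Lowercased place name -> country code.
# Duplicates such as "london"/"London" collapse to a single key.
lookup = (
    ref_df.assign(key=ref_df["Data"].str.lower())
    .drop_duplicates("key")
    .set_index("key")["Countries"]
)

# One alternation of escaped place names, longest first so that a longer
# name is preferred over a shorter one that is a prefix of it.
names = sorted(lookup.index, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(n) for n in names) + ")"

# Extract the first matching place name (ignoring case), lowercase it,
# then map it to the country code. Rows with no match get NaN.
key = df["Name Data"].str.extract(pattern, flags=re.IGNORECASE, expand=False).str.lower()
df["Country"] = key.map(lookup)
print(df)
```

Using `map` on a normalised key keeps the main DataFrame's row count unchanged, whereas an inner `merge` drops rows whose key has no exact match in the reference frame.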

It looks like everything is already in the same order, so you can merge on the index.

mdf.merge(rdf, left_index=True, right_index=True)
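
As a rough illustration of what an index merge does, the sketch below uses two tiny frames with made-up contents (only the names `mdf` and `rdf` come from the answer above). It assumes the rows of both frames are in the same order; if they are not, rows will be paired incorrectly, because nothing in the text is compared.

```python
import pandas as pd

# Illustrative frames: the first two rows line up by position, rdf has one extra row.
mdf = pd.DataFrame({"Name Data": ["Arjun Kumar Reddy las Vegas", "Divya london Khosla"]})
rdf = pd.DataFrame({"Data": ["las Vegas", "london", "Moscow"], "Countries": ["US", "UK", "RS"]})

# Default inner join on the row index: row 2 of rdf has no partner in mdf and is dropped.
merged = mdf.merge(rdf, left_index=True, right_index=True)
print(merged)
```

This pairs rows purely by index label, so it only suits data like the sample in the question where the reference rows happen to follow the same order as the main rows.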
