
Change column values in one df to match column values in a different df?

I have two data frames that I want to merge on the names column. The names column in one df holds abbreviated names, while the names column in the other df holds the full names. What is the most efficient way to change the column values so that they match each other?

df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df2['names'] = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC", "Tottenham Hotspurs", "Arsenal FC"]

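You can create a mapping dictionary as shown below using dict(zip()). This assumes the rows of df1 and df2 are in the same order, since zip pairs the names by position.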

import pandas as pd

df1 = pd.DataFrame()
df2 = pd.DataFrame()
df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df2['names'] = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC", "Tottenham Hotspurs", "Arsenal FC"]

# build the mapping dictionary: abbreviated name -> full name
d = dict(zip(df1['names'], df2['names']))
print(d)

{'Man Utd': 'Manchester United',
'Man City': 'Manchester City',
 'Chelsea': 'Chelsea FC',
 'Liverpool': 'Liverpool FC',
 'Spurs': 'Tottenham Hotspurs',
 'Arsenal': 'Arsenal FC'}

Then update df1['names'] with:

df1['names'] = df1['names'].map(d)

After this you can perform the merge, since the values in the two names columns now match.
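A minimal sketch of the full flow, from mapping to merge. The `points` and `stadium` columns, the `points` values, and the extra "Everton" row are invented for illustration and are not part of the question. Note that `map` returns NaN for any name missing from the dictionary, so the sketch falls back to the original value with `fillna`.

```python
import pandas as pd

# Hypothetical data: 'points' and 'stadium' are made-up columns for illustration.
df1 = pd.DataFrame({
    "names": ["Man Utd", "Man City", "Chelsea", "Everton"],
    "points": [66, 98, 72, 54],
})
df2 = pd.DataFrame({
    "names": ["Manchester United", "Manchester City", "Chelsea FC"],
    "stadium": ["Old Trafford", "Etihad Stadium", "Stamford Bridge"],
})

# Mapping from abbreviated name to full name.
d = {
    "Man Utd": "Manchester United",
    "Man City": "Manchester City",
    "Chelsea": "Chelsea FC",
}

# map() gives NaN for names missing from d ("Everton" here),
# so fall back to the original value for those rows.
df1["names"] = df1["names"].map(d).fillna(df1["names"])

# A left merge keeps every row of df1; unmatched rows get NaN in 'stadium'.
merged = df1.merge(df2, on="names", how="left")
print(merged)
```

Using `how="left"` here is a design choice: rows without a partner in df2 stay visible instead of being silently dropped, as they would be with the default inner merge.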

The only reliable way to achieve this is to maintain a referential (a lookup dictionary) in order to match the two names columns:

import pandas as pd

df1 = pd.DataFrame()

referential = {
    "Man Utd": "Manchester United",
    "Man City": "Manchester City",
    "Chelsea": "Chelsea FC",
    "Liverpool": "Liverpool FC",
    "Spurs": "Tottenham Hotspurs",
    "Arsenal": "Arsenal FC"
}

df1['names'] = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
df1['names'] = df1['names'].map(referential)
print(df1)
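One possible way to check that the referential covers every name is to merge with `indicator=True` and list the rows that did not match. This is only a sketch: `df2` and its contents are assumed here, since this answer builds `df1` alone, and the shortened club list is for illustration.

```python
import pandas as pd

referential = {
    "Man Utd": "Manchester United",
    "Man City": "Manchester City",
    "Spurs": "Tottenham Hotspurs",
}

df1 = pd.DataFrame({"names": ["Man Utd", "Man City", "Spurs"]})
# Assumed second frame; "Arsenal FC" has no counterpart in df1.
df2 = pd.DataFrame({"names": ["Manchester United", "Manchester City",
                              "Tottenham Hotspurs", "Arsenal FC"]})

df1["names"] = df1["names"].map(referential)

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
check = df1.merge(df2, on="names", how="outer", indicator=True)

# Names that appear in only one of the two frames.
unmatched = check.loc[check["_merge"] != "both", "names"].tolist()
print(unmatched)
```

An empty `unmatched` list would suggest the referential is complete for the data at hand.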

Constructing a dictionary and then feeding it to pd.Series.map is one way. But, sticking with pandas, you can also use pd.Series.replace directly:

import pandas as pd

lst1 = ["Man Utd", "Man City", "Chelsea", "Liverpool", "Spurs", "Arsenal"]
lst2 = ["Manchester United", "Manchester City", "Chelsea FC", "Liverpool FC",
        "Tottenham Hotspurs", "Arsenal FC"]

# build the input DataFrame
df = pd.DataFrame({'names': lst1})

# replace each value in lst1 with the value at the same position in lst2
df['names'] = df['names'].replace(lst1, lst2)

print(df)

                names
0   Manchester United
1     Manchester City
2          Chelsea FC
3        Liverpool FC
4  Tottenham Hotspurs
5          Arsenal FC
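The two approaches differ in how they treat names without a match, which matters before a merge. The sketch below illustrates this with a dictionary passed to both methods; "Everton" is a made-up extra value that is not in the question's data.

```python
import pandas as pd

mapping = {"Man Utd": "Manchester United", "Spurs": "Tottenham Hotspurs"}
s = pd.Series(["Man Utd", "Spurs", "Everton"])

# replace() leaves values without a match untouched.
replaced = s.replace(mapping)

# map() turns values without a match into NaN.
mapped = s.map(mapping)

print(replaced.tolist())
print(mapped.tolist())
```

So `replace` is the safer choice when some names are already in their final form, while `map` makes gaps in the mapping easy to spot via `isna()`.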


 