
Change specific values of dataframe columns, what is the most efficient way?

I need to change specific values in a dataframe column. I used a for loop to do it manually; is there a more efficient way, using an idiom or .where? I suspect the code below is not the best way to do it...

# change the names of specific countries as requested
for index, row in energy.iterrows():
    if energy.loc[index, ['Country']].str.contains('United States of America').bool():
        energy.loc[index, ['Country']] = 'United States'
        print(energy.loc[index, ['Country']])

    if energy.loc[index, ['Country']].str.contains('Republic of Korea').bool():
        energy.loc[index, ['Country']] = 'South Korea'
        print(energy.loc[index, ['Country']])

    if energy.loc[index, ['Country']].str.contains('United Kingdom of Great Britain and Northern Ireland').bool():
        energy.loc[index, ['Country']] = 'United Kingdom'
        print(energy.loc[index, ['Country']])

    if energy.loc[index, ['Country']].str.contains('China, Hong Kong Special Administrative Region').bool():
        energy.loc[index, ['Country']] = 'Hong Kong'
        print(energy.loc[index, ['Country']])

You could use np.where (with numpy imported as np):

energy['Country'] = np.where(energy['Country'] == 'United States of America', 'United States', energy['Country'] )
energy['Country'] = np.where(energy['Country'] == 'Republic of Korea', 'Korea', energy['Country'])

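Or, with a boolean mask and .loc (this avoids chained indexing, which can trigger SettingWithCopyWarning):

energy.loc[energy['Country'] == 'United States of America', 'Country'] = 'United States'
energy.loc[energy['Country'] == 'Republic of Korea', 'Country'] = 'Korea'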

df:

                    Country
0  United States of America
1                     Spain
2         Republic of Korea
3                    France

Output:

         Country
0  United States
1          Spain
2          Korea
3         France
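
With more than a couple of names to change, the same np.where call can be driven by a dictionary. The following is only a minimal sketch: the name `renames` is hypothetical, and the sample frame mirrors the df shown above.

```python
import numpy as np
import pandas as pd

energy = pd.DataFrame({'Country': ['United States of America', 'Spain',
                                   'Republic of Korea', 'France']})

# Hypothetical mapping of old name -> new name
renames = {'United States of America': 'United States',
           'Republic of Korea': 'Korea'}

for old, new in renames.items():
    # Rows equal to `old` get `new`; every other row keeps its current value
    energy['Country'] = np.where(energy['Country'] == old, new, energy['Country'])

print(energy)
```

Each pass is vectorized over the whole column, so the Python-level loop runs once per replacement rather than once per row.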

You can declare a dictionary with the mapping and then use map. For example:

import pandas as pd

mapVal = {'United States of America': 'United States', 'Republic of Korea': 'South Korea', 'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom', 'China': 'Hong Kong', 'Hong Kong Special Administrative Region': 'Hong Kong'}    #Sample Mapping

df = pd.DataFrame({'Country': ['United States of America', 'Republic of Korea', 'United Kingdom of Great Britain and Northern Ireland', 'China', 'Hong Kong Special Administrative Region']})
df["newVal"] = df["Country"].map(mapVal)          # or overwrite in place: df["Country"] = df["Country"].map(mapVal)
print(df)

Output:

                                             Country          newVal
0                           United States of America   United States
1                                  Republic of Korea     South Korea
2  United Kingdom of Great Britain and Northern I...  United Kingdom
3                                              China       Hong Kong
4            Hong Kong Special Administrative Region       Hong Kong
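
One thing to watch with map: a value that is not a key in the dictionary comes back as NaN. If the column also holds names that should stay unchanged, one possible fallback is fillna with the original column. This is a sketch on made-up sample data ('Spain' is added here only to show the unmapped case):

```python
import pandas as pd

df = pd.DataFrame({'Country': ['United States of America', 'Spain', 'Republic of Korea']})
mapVal = {'United States of America': 'United States',
          'Republic of Korea': 'South Korea'}

mapped = df['Country'].map(mapVal)            # 'Spain' is not a key, so it becomes NaN
df['Country'] = mapped.fillna(df['Country'])  # fall back to the original name
print(df)
```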

You can use the Pandas replace() method:

energy
                                             Country
0                           United States of America
1                                  Republic of Korea
2  United Kingdom of Great Britain and Northern I...
3     China, Hong Kong Special Administrative Region

energy.replace(rep_map)
          Country
0   United States
1     South Korea
2  United Kingdom
3       Hong Kong

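Note that energy.replace(rep_map) returns a new data frame and replaces every cell, in any column, whose value exactly matches one of these strings. To limit the change to one column, call it on that column and assign the result back: energy['Country'] = energy['Country'].replace(rep_map).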

Data:

import pandas as pd

countries = ["United States of America", 
             "Republic of Korea", 
             "United Kingdom of Great Britain and Northern Ireland", 
             "China, Hong Kong Special Administrative Region"]
replacements = ["United States", "South Korea", "United Kingdom", "Hong Kong"]
rep_map = dict(zip(countries, replacements))  # old name -> new name
energy = pd.DataFrame({"Country": countries})
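
To see how far replace() reaches, here is a small sketch with a second column. The `Note` column and its contents are hypothetical, added only to contrast a frame-wide call with a single-column one.

```python
import pandas as pd

energy = pd.DataFrame({
    'Country': ['United States of America', 'Republic of Korea'],
    'Note': ['Republic of Korea', 'exports to Republic of Korea'],  # hypothetical column
})
rep_map = {'United States of America': 'United States',
           'Republic of Korea': 'South Korea'}

# Frame-wide: any cell that exactly equals a key is replaced, in every column
whole_frame = energy.replace(rep_map)

# Column-only: the Note column is left untouched
energy['Country'] = energy['Country'].replace(rep_map)

print(whole_frame)
print(energy)
```

By default the match is on the whole cell value, which is why 'exports to Republic of Korea' is unchanged in both results.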
