Create multiple new pandas columns based on other columns in a loop

Question

Assume I have the following toy DataFrame, df:

Country     Population    Region      HDI
China       100           Asia        High
Canada      15            NAmerica    V.High
Mexico      25            NAmerica    Medium
Ethiopia    30            Africa      Low

I would like to create new columns, in a loop, that hold the population, region, and HDI of Ethiopia on every row. I tried the following method, but it is tedious to write out when a lot of columns are involved.

df['Population_2'] = df['Population'][df['Country'] == "Ethiopia"]
df['Region_2'] = df['Region'][df['Country'] == "Ethiopia"]
df['Population_2'].fillna(method='ffill')

My final DataFrame df should look like:

Country     Population    Region      HDI       Population_2    Region_2    HDI_2
China       100           Asia        High      30              Africa      Low
Canada      15            NAmerica    V.High    30              Africa      Low
Mexico      25            NAmerica    Medium    30              Africa      Low
Ethiopia    30            Africa      Low       30              Africa      Low

Answer 1

How about this?

for col in ['Population', 'Region', 'HDI']:
    df[col + '_2'] = df.loc[df.Country=='Ethiopia', col].iat[0]

I don't quite understand the broader point of what you're trying to do, and if Ethiopia could have multiple values the solution might be different. But this works for the problem as you presented it.
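
If it helps, here is a minimal sketch of the same loop wrapped in a function that refuses to guess when the country matches zero rows or more than one. The helper name add_reference_columns and its parameters are hypothetical (not part of pandas or of the question), and the toy frame is rebuilt by hand from the table in the question.

```python
import pandas as pd

# Toy frame rebuilt from the question's table.
df = pd.DataFrame({
    "Country": ["China", "Canada", "Mexico", "Ethiopia"],
    "Population": [100, 15, 25, 30],
    "Region": ["Asia", "NAmerica", "NAmerica", "Africa"],
    "HDI": ["High", "V.High", "Medium", "Low"],
})


def add_reference_columns(frame, country, cols, suffix="_2"):
    """Hypothetical helper: copy one country's values into new columns."""
    matches = frame.loc[frame["Country"] == country, cols]
    # Fail loudly instead of silently picking the first of several rows.
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one row for {country!r}, found {len(matches)}"
        )
    row = matches.iloc[0]
    out = frame.copy()  # leave the caller's frame untouched
    for col in cols:
        # Assigning a scalar broadcasts it to every row.
        out[col + suffix] = row[col]
    return out


result = add_reference_columns(df, "Ethiopia", ["Population", "Region", "HDI"])
print(result)
```

Whether raising is the right reaction to duplicates depends on your data; taking the first match, as .iat[0] does above, may be perfectly fine.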

Answer 2

You can use:

# select Ethiopia row and add suffix "_2" to the columns (except Country)
s = (df.drop(columns='Country')
       .loc[df['Country'].eq('Ethiopia')].add_suffix('_2').squeeze()
     )

# broadcast as new columns
df[s.index] = s

output:

    Country  Population    Region     HDI  Population_2 Region_2 HDI_2
0     China         100      Asia    High            30   Africa   Low
1    Canada          15  NAmerica  V.High            30   Africa   Low
2    Mexico          25  NAmerica  Medium            30   Africa   Low
3  Ethiopia          30    Africa     Low            30   Africa   Low
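
One caveat worth checking on your own data: squeeze() only collapses the selection to a Series when exactly one row matches. The sketch below rebuilds the toy frame by hand and adds a made-up duplicate Ethiopia row (dup is a hypothetical variable, not from the question) to show the difference.

```python
import pandas as pd

# Toy frame rebuilt from the question's table.
df = pd.DataFrame({
    "Country": ["China", "Canada", "Mexico", "Ethiopia"],
    "Population": [100, 15, 25, 30],
    "Region": ["Asia", "NAmerica", "NAmerica", "Africa"],
    "HDI": ["High", "V.High", "Medium", "Low"],
})

# One matching row: squeeze() returns a Series indexed by the new column names.
s = (df.drop(columns="Country")
       .loc[df["Country"].eq("Ethiopia")]
       .add_suffix("_2")
       .squeeze())
print(type(s).__name__, s.index.tolist())

# Hypothetical frame with the Ethiopia row duplicated.
dup = pd.concat([df, df.tail(1)], ignore_index=True)

# Two matching rows: squeeze() has nothing to collapse and returns a DataFrame.
t = (dup.drop(columns="Country")
        .loc[dup["Country"].eq("Ethiopia")]
        .add_suffix("_2")
        .squeeze())
print(type(t).__name__, t.shape)
```

So if duplicates are possible, it is probably safer to pick a row explicitly (for example with .iloc[0]) before broadcasting.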

Answer 3

You can use assign, assuming there is only one row corresponding to Ethiopia:

d = dict(zip(df.columns.drop('Country').map('{}_2'.format), 
         df.set_index('Country').loc['Ethiopia']))

df = df.assign(**d)

print(df):

    Country  Population    Region     HDI  Population_2 Region_2 HDI_2
0     China         100      Asia    High            30   Africa   Low
1    Canada          15  NAmerica  V.High            30   Africa   Low
2    Mexico          25  NAmerica  Medium            30   Africa   Low
3  Ethiopia          30    Africa     Low            30   Africa   Low
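
For readers who find the one-liner dense, the same idea can be sketched step by step. The intermediate names (value_cols, new_names, ethiopia, out) are only illustrative, and the toy frame is rebuilt by hand from the question.

```python
import pandas as pd

# Toy frame rebuilt from the question's table.
df = pd.DataFrame({
    "Country": ["China", "Canada", "Mexico", "Ethiopia"],
    "Population": [100, 15, 25, 30],
    "Region": ["Asia", "NAmerica", "NAmerica", "Africa"],
    "HDI": ["High", "V.High", "Medium", "Low"],
})

# Every column except the key column.
value_cols = df.columns.drop("Country")

# New column names: Population_2, Region_2, HDI_2.
new_names = [f"{col}_2" for col in value_cols]

# With a unique 'Ethiopia' label this is a Series of that row's values.
ethiopia = df.set_index("Country").loc["Ethiopia"]

# Pair each new name with the matching value, then broadcast via assign.
d = dict(zip(new_names, ethiopia))
out = df.assign(**d)
print(d)
print(out)
```

If 'Ethiopia' appeared more than once, .loc['Ethiopia'] would return a DataFrame rather than a Series, and the zip would no longer pair names with values as intended.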

3 answers

solution1
2 ACCEPTED 2022-07-26 13:33:44

solution2
2 2022-07-26 13:36:07

solution3
2 2022-07-26 13:41:57
