Change a data frame column with a condition

Sample dataframe:

CountryName

India|Pakistan
Pakistan|Agansitan
Sweden
Nepal|Bhutan

Output dataframe with a new column:

CountryName           MainCountry

India|Pakistan        India
Pakistan|Agansitan    Pakistan
Sweden                Sweden
Nepal|Bhutan          Nepal

I tried this:

df["MainCountry"] =df['CountryName'].str.contains("[|].*","")

It returns True or False instead. Can you help me find out how to get the output above?

You could split on '|' and take the first element:

In [87]: df['MainCountry'] = df['CountryName'].str.split('|').str[0]

In [88]: df
Out[88]:
          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal
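For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample in the question, so the constructor call is an assumption about how the data was loaded, not something the question shows.

```python
import pandas as pd

# Sample data copied from the question (built by hand here as an assumption)
df = pd.DataFrame(
    {"CountryName": ["India|Pakistan", "Pakistan|Agansitan", "Sweden", "Nepal|Bhutan"]}
)

# Split each value on the literal '|' and keep the first piece.
# Rows without a '|' produce a one-element list, so .str[0] returns them unchanged.
df["MainCountry"] = df["CountryName"].str.split("|").str[0]

print(df)
```

Because rows without a separator pass through untouched, no extra condition is needed for values such as Sweden.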

Using str.extract

df.assign(MainCountry=df.CountryName.str.extract(r'(.*?)(?:\||$)', expand=False))

          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal 

Or str.partition

df.assign(MainCountry=df.CountryName.str.partition('|')[0])

          CountryName MainCountry
0      India|Pakistan       India
1  Pakistan|Agansitan    Pakistan
2              Sweden      Sweden
3        Nepal|Bhutan       Nepal

Using str.split and str.get

df.CountryName.str.split('|').str.get(0)

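Using np.where (regex=False is needed because a bare '|' is regex alternation and would match every row):

import numpy as np

df['Main_Country'] = (np.where(df['CountryName'].str.contains('|', regex=False),
                  df['CountryName'].str.split('|').str[0],
                  df['CountryName']))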

Output:

    CountryName       Main_Country
0   India|Pakistan      India
1   Pakistan|Agansitan  Pakistan
2   Sweden              Sweden
3   Nepal|Bhutan        Nepal
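As for why the attempt in the question gives True or False: str.contains only tests whether the pattern matches, so it always returns booleans. The call was probably meant to be str.replace. The sketch below shows both side by side; the hand-built frame and the variable name mask are assumptions for illustration.

```python
import pandas as pd

# Sample data copied from the question (built by hand here as an assumption)
df = pd.DataFrame(
    {"CountryName": ["India|Pakistan", "Pakistan|Agansitan", "Sweden", "Nepal|Bhutan"]}
)

# str.contains answers "does this value contain a '|'?" and yields booleans.
# regex=False treats '|' as a literal character.
mask = df["CountryName"].str.contains("|", regex=False)

# str.replace deletes the first '|' and everything after it,
# which leaves only the leading country name.
df["MainCountry"] = df["CountryName"].str.replace(r"\|.*", "", regex=True)

print(mask.tolist())
print(df)
```

Passing regex=True explicitly matters here, since newer pandas versions treat the pattern in str.replace as a literal string by default.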
