
pandas replace specific string with numeric value in a new column for all rows

I have a DataFrame with a column message, and I want to create a column media such that, for every index x where df.loc[x, 'message'] == "<Media omitted>", df.loc[x, 'media'] is 1 (and 0 otherwise).

For example, for the DataFrame:

index    message
1        hello
2        <Media omitted>
3        hello
4        <Media omitted>

I would get:

index    message          media
1        hello             0
2        <Media omitted>   1
3        hello             0
4        <Media omitted>   1

So far I have only managed this with a loop, but I'm sure there is a smarter and faster way.

Try this:

df['media'] = (df['message'] == '<Media omitted>').astype(int)

Explanation

  • df['message'] == '<Media omitted>' creates a Boolean series.
  • astype(int) casts the Boolean series to integer type, so True becomes 1 and False becomes 0.
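
The two steps above can be sketched end to end as follows. The sample data is rebuilt from the question's example, and the intermediate name mask is only an illustrative assumption, not part of the original answer.

```python
import pandas as pd

# Sample data mirroring the question's example (index 1..4).
df = pd.DataFrame(
    {"message": ["hello", "<Media omitted>", "hello", "<Media omitted>"]},
    index=pd.Index([1, 2, 3, 4], name="index"),
)

# Step 1: the element-wise comparison yields a Boolean Series.
mask = df["message"] == "<Media omitted>"

# Step 2: cast True/False to 1/0 and store the result as a new column.
df["media"] = mask.astype(int)

print(mask.tolist())         # [False, True, False, True]
print(df["media"].tolist())  # [0, 1, 0, 1]
```

Because the comparison runs over the whole column at once, no explicit loop over the rows is needed.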

I think you need to convert the boolean mask to int with astype:

df['media'] = (df['message'] == '<Media omitted>').astype(int)
#very similar alternative
#df['media'] = df['message'].eq('<Media omitted>').astype(int)
print (df)
               message  media
index                        
1                hello      0
2      <Media omitted>      1
3                hello      0
4      <Media omitted>      1
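
As a quick check that the == form and the eq form agree, here is a small sketch. The Series name messages and the missing value in the sample are assumptions added for illustration; they are not in the question's data.

```python
import pandas as pd

# Hypothetical sample with one missing message, to show how it is treated.
messages = pd.Series(["hello", "<Media omitted>", None], name="message")

# Operator form and method form of the same element-wise comparison.
via_operator = (messages == "<Media omitted>").astype(int)
via_method = messages.eq("<Media omitted>").astype(int)

print(via_operator.equals(via_method))  # True
print(via_operator.tolist())            # [0, 1, 0]
```

With the default dtype, a missing message compares as False, so it ends up as 0 in the new column.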
