I'm trying to find the best way to replace a single row with several new rows when a column contains a certain value.
Example DataFrame:
Index | Person | Drink_Order |
---|---|---|
1 | Sam | Jack and Coke |
2 | John | Coke |
3 | Steve | Dr. Pepper |
I'd like to search the DataFrame for "Jack and Coke", remove that row, and add two new records in its place, since Jack and Coke are two different drinks.
Index | Person | Drink_Order |
---|---|---|
2 | John | Coke |
3 | Steve | Dr. Pepper |
4 | Sam | Jack Daniels |
5 | Sam | Coke |
Example code that I want to replace, since my understanding is that you should never modify the rows you are iterating over:
for index, row in df.loc[df['Drink_Order'].str.contains('Jack and Coke')].iterrows():
    df.loc[len(df)]=[row['Person'],'Jack Daniels']
    df.loc[len(df)]=[row['Person'],'Coke']
df = df[df['Drink_Order']!= 'Jack and Coke']
Split Drink_Order on the word "and", which gives a list for each row. Explode that list so each element appears as its own row. Then conditionally rename Jack to Jack Daniels.
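import numpy as np
# Split on ' and ' (with the spaces) so the pieces carry no stray whitespace, then explode to one row per drink
df = df.assign(Drink_Order=df['Drink_Order'].str.split(' and ')).explode('Drink_Order')
df['Drink_Order'] = np.where(df['Drink_Order'].str.contains('Jack'), 'Jack Daniels', df['Drink_Order'])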
   Index Person   Drink_Order
0      1    Sam  Jack Daniels
0      1    Sam          Coke
1      2   John          Coke
2      3  Steve    Dr. Pepper
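If more than one compound order needs expanding, the same explode idea can be driven by a lookup table instead of a string split. The following is only a sketch under assumptions: the `COMPONENTS` dictionary and the `expand_orders` helper are hypothetical names that do not come from the original post, and the sample frame omits the Index column for brevity.

```python
import pandas as pd

# Hypothetical lookup (not from the original post): compound order -> component drinks.
COMPONENTS = {"Jack and Coke": ["Jack Daniels", "Coke"]}


def expand_orders(df, components=COMPONENTS, column="Drink_Order"):
    """Return a new DataFrame with each known compound order split into one row per component."""
    out = df.copy()
    # Known compound orders become a list of components; every other value
    # is wrapped in a one-element list so explode() leaves it as a single row.
    out[column] = out[column].map(lambda value: components.get(value, [value]))
    # explode() emits one row per list element and repeats the other columns;
    # reset_index() replaces the duplicated index labels with a fresh 0..n-1 range.
    return out.explode(column).reset_index(drop=True)


df = pd.DataFrame(
    {
        "Person": ["Sam", "John", "Steve"],
        "Drink_Order": ["Jack and Coke", "Coke", "Dr. Pepper"],
    }
)
result = expand_orders(df)
print(result)
```

Unlike the loop in the question, this never writes to the frame while iterating over it, and an order that merely contains the letters "and" is left untouched. The expanded rows stay in the original row's position rather than being appended at the end, so sort or reindex afterwards if the order shown in the question matters.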