I wanted to remove certain rows from my pandas dataframe. I did it the manual way of spelling out each ITEM number that I didn't want included.
How do I do the same task as shown in the code below but using a loop?
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4888') == False]
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4889') == False]
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4890') == False]
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4891') == False]
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4892') == False]
df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains('4893') == False]
A loop is unnecessary here. Most pandas filtering operations have a vectorised alternative to looping. Here's one way to do it.
First, initialise a list of codes -
codes = ['4888', '4889', '4890', '4891', '4892', '4893']
Or,
import numpy as np
codes = np.arange(4888, 4894).astype(str)  # the end of the range is exclusive, so this gives '4888' through '4893'
Now, filter using str.contains. You'll need to join the codes into a single regex using the | (OR) pipe -
df = df[~df['ITEM'].str.contains('|'.join(codes))]
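To see the joined-regex filter in action, here is a minimal sketch on a small made-up frame. The sample ITEM values, the use of re.escape, and the na=False argument are assumptions added for illustration and are not part of the original question; escaping only matters if the codes could contain regex metacharacters, and na=False only matters if ITEM can hold missing values.

```python
import re
import pandas as pd

# Hypothetical sample data; these ITEM values are invented for illustration.
df = pd.DataFrame({"ITEM": ["A-4888", "B-1000", "4890-X", "C-2000", None]})

codes = ["4888", "4889", "4890", "4891", "4892", "4893"]

# Escape each code (a no-op for plain digits) and OR them into one pattern.
pattern = "|".join(re.escape(code) for code in codes)

# na=False treats a missing ITEM as "no match", so that row is kept.
mask = df["ITEM"].str.contains(pattern, na=False)

# Keep only the rows whose ITEM does not contain any of the codes.
filtered = df[~mask]
print(filtered)
```

A single pass with one combined pattern avoids rebuilding the frame once per code.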
If the codes are the only thing in the ITEM column, you can use isin -
df = df[~df['ITEM'].isin(codes)]
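The difference between the two filters is easiest to see side by side. The sketch below uses invented ITEM values (an assumption, not data from the question): isin drops only rows whose ITEM equals a code exactly, while str.contains also drops rows where a code appears as a substring.

```python
import pandas as pd

# Hypothetical sample data; "A-4891" contains a code but is not equal to one.
df = pd.DataFrame({"ITEM": ["4888", "4890", "A-4891", "5000"]})

codes = ["4888", "4889", "4890", "4891", "4892", "4893"]

# Exact matching: only rows whose ITEM is exactly one of the codes are removed.
exact = df[~df["ITEM"].isin(codes)]

# Substring matching: rows whose ITEM contains any code anywhere are removed.
substring = df[~df["ITEM"].str.contains("|".join(codes))]

print(exact["ITEM"].tolist())
print(substring["ITEM"].tolist())
```

Which one is right depends on whether ITEM holds bare codes or longer strings that embed them.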
How about:
for val in ['4888','4889','4890','4891','4892','4893']:
    df_adhoc_1_final = df_adhoc_1_final[df_adhoc_1_final['ITEM'].str.contains(val) == False]