Is there a way to replace all except the first 3 instances of a value in a pandas DataFrame column?

I have a very large pandas DataFrame in which the columns are users and the rows are yes/no questions about each user, so every cell contains "yes" or "no". I only want to see the first 3 "no"s in each column. I have already replaced every "yes" in the DataFrame with an empty string "". How can I keep the first 3 "no"s in every column (that is, for every user) and replace the rest with an empty string? I thought I could use the limit parameter of df.replace() to do this, but I haven't found a good explanation of what it does, and experimenting with it myself hasn't helped. Thanks in advance for any help. This is my first time posting on Stack Overflow, so apologies in advance for any mistakes I made while asking this question.

Initial:

User 1 User 2 User 3
no no
no no
no no
no no
no no
no
no no

Expected Output:

User 1 User 2 User 3
no no
no no
no no
no
no

Use cumsum:

import pandas as pd

df = pd.DataFrame({'User1': ['', 'no', 'no', 'no', 'no'],
                   'User2': ['no', 'no', 'no', 'no', '']})

# Running count of "no" per column; blank every cell once the count exceeds 3
df[(df == 'no').cumsum() > 3] = ''
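
To see why the one-liner works, it may help to split it into steps. This is only a sketch on the same toy frame; the names is_no, running and result are made up here for illustration, and df.mask is used in place of the in-place assignment so the original frame stays untouched.

```python
import pandas as pd

df = pd.DataFrame({'User1': ['', 'no', 'no', 'no', 'no'],
                   'User2': ['no', 'no', 'no', 'no', '']})

# Step 1: boolean frame, True wherever the cell is exactly "no"
is_no = df == 'no'

# Step 2: cumulative sum down each column (True counts as 1),
# i.e. how many "no"s have been seen so far in that column
running = is_no.cumsum()

# Step 3: blank out every cell where the running count has passed 3
# (illustrative alternative to the in-place assignment; returns a new frame)
result = df.mask(running > 3, '')

print(result)
```

Because the count is taken per column, each user is handled independently, and cells before the fourth "no" are left as they were.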

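Just an addition to @psarka's answer: does your DataFrame contain values other than "no"? I ask because that answer only blanks cells once the count of "no"s in a column exceeds 3, so any other value that appears before the fourth "no" would be left in place.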
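
If other values can occur, one possible variation is to keep a cell only when it is a "no" and is among the first three in its column, and blank everything else. The sample values 'maybe' and 'n/a' and the names is_no, keep and cleaned are hypothetical, chosen only to illustrate the idea.

```python
import pandas as pd

# Hypothetical data: same shape as the example above, but with stray values
df = pd.DataFrame({'User1': ['maybe', 'no', 'no', 'no', 'no'],
                   'User2': ['no', 'no', 'no', 'no', 'n/a']})

is_no = df == 'no'

# Keep a cell only if it is "no" AND the per-column running count is still <= 3
keep = is_no & (is_no.cumsum() <= 3)

# where() keeps values where the condition is True and substitutes '' elsewhere
cleaned = df.where(keep, '')

print(cleaned)
```

Unlike the cumsum one-liner on its own, this also clears non-"no" values that come before the fourth "no".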
