I have this dataframe:
Word Frequency
0 : 79
1 , 60
2 look 26
3 e 26
4 a 25
... ... ...
95 trump 2
96 election 2
97 step 2
98 day 2
99 university 2
I would like to remove all rows whose word has fewer than 3 characters. I tried the following:
df['Word']=df['Word'].str.findall('\w{3,}').str.join(' ')
but it does not remove them from my dataset. Can you please tell me how to remove them? My expected output would be:
Word Frequency
2 look 26
... ... ...
95 trump 2
96 election 2
97 step 2
98 day 2
99 university 2
Try:
df = df[df['Word'].str.len()>=3]
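Your findall/join attempt only rewrites the values in the column (a word shorter than 3 characters becomes an empty string); it never drops any rows. Instead of a regular expression, you can use .str.len() to get the length of each string in the column, then keep only the rows where that length is >= 3.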
It should look like this:
df.loc[df["Word"].str.len() >= 3]
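To make the difference concrete, here is a small self-contained sketch. The sample frame is an assumption: it contains only the rows visible in the question (the elided rows are left out), with the index set by hand to match the question's row labels. The names blanked, mask and filtered are just illustrative.

```python
import pandas as pd

# Only the rows visible in the question; the elided rows are not reconstructed.
df = pd.DataFrame(
    {
        "Word": [":", ",", "look", "e", "a",
                 "trump", "election", "step", "day", "university"],
        "Frequency": [79, 60, 26, 26, 25, 2, 2, 2, 2, 2],
    },
    index=[0, 1, 2, 3, 4, 95, 96, 97, 98, 99],
)

# The attempt from the question: short words turn into empty strings,
# but the Series still has one entry per original row.
blanked = df["Word"].str.findall(r"\w{3,}").str.join(" ")

# Boolean mask that is True where the word has at least 3 characters.
mask = df["Word"].str.len() >= 3

# Keep only the rows where the mask is True.
filtered = df.loc[mask]

print(filtered)
```

Note that this returns a new frame and leaves df untouched, so assign the result back (df = df.loc[mask]) if you want to replace the original.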
Please try:
df[df.Word.str.len()>=3]