
Removing from pandas dataframe all rows having less than 3 characters

I have this dataframe

Word    Frequency
0   :       79
1   ,       60
2   look    26
3   e       26
4   a       25
... ... ...
95  trump    2
96  election 2
97  step     2
98  day      2
99  university  2

I would like to remove all words that have fewer than 3 characters. I tried the following:

df['Word']=df['Word'].str.findall('\w{3,}').str.join(' ')

but it does not remove them from my dataset. Can you please tell me how to remove them? My expected output would be:

Word    Frequency
2   look    26
... ... ...
95  trump    2
96  election 2
97  step     2
98  day      2
99  university  2

Try:

df = df[df['Word'].str.len()>=3]

Instead of using a regular expression, you can call .str.len() to get the length of each string in the column, then keep only the rows where that length is >= 3.

Should look like:

df.loc[df["Word"].str.len() >= 3]
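For illustration, here is a small self-contained sketch of the difference between the attempt in the question and the length filter. The sample rows are a shortened, made-up subset of the question's data, and the names `attempt` and `filtered` are only for this example.

```python
import pandas as pd

# Hypothetical sample modelled on the question's dataframe.
df = pd.DataFrame({
    "Word": [":", ",", "look", "e", "a", "trump", "day"],
    "Frequency": [79, 60, 26, 26, 25, 2, 2],
})

# The attempt from the question rewrites the cell values only:
# short words become empty strings, but every row is still there.
attempt = df["Word"].str.findall(r"\w{3,}").str.join(" ")

# Boolean indexing on the string length actually drops the rows.
filtered = df[df["Word"].str.len() >= 3]
print(filtered)
```

The original index labels are kept, which matches the expected output in the question.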

Please try:

 df[df.Word.str.len()>=3]
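One caveat: .str.len() counts every character, so a string of three punctuation marks would also pass the filter. If the goal is closer to the regex in the question (at least three word characters), a possible variant is to use the regex as a boolean mask instead. This is only a sketch on made-up data; it assumes a pandas version that provides Series.str.fullmatch (1.1 or later), and `by_length` and `by_regex` are illustrative names.

```python
import pandas as pd

# Hypothetical data that includes punctuation-only and mixed strings.
df = pd.DataFrame({
    "Word": ["ab-", "look", "e", "day", "..."],
    "Frequency": [5, 26, 26, 2, 1],
})

# Length filter: keeps anything with 3 or more characters of any kind.
by_length = df[df["Word"].str.len() >= 3]

# Regex mask: keeps only strings made up entirely of 3+ word characters.
# na=False treats missing values as non-matches.
by_regex = df[df["Word"].str.fullmatch(r"\w{3,}", na=False)]

print(by_length["Word"].tolist())
print(by_regex["Word"].tolist())
```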
