Removing from pandas dataframe all rows having less than 3 characters
I have this dataframe:
Word Frequency
0 : 79
1 , 60
2 look 26
3 e 26
4 a 25
... ... ...
95 trump 2
96 election 2
97 step 2
98 day 2
99 university 2
I would like to remove all words having fewer than 3 characters. I tried as follows:
df['Word']=df['Word'].str.findall('\w{3,}').str.join(' ')
but it does not remove them from my dataset. Can you please tell me how to remove them? My expected output would be:
Word Frequency
2 look 26
... ... ...
95 trump 2
96 election 2
97 step 2
98 day 2
99 university 2
Try:
df = df[df['Word'].str.len()>=3]
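For context, here is a minimal sketch of why the attempt in the question leaves every row in place. `str.findall` followed by `str.join` only rewrites the contents of each cell, so a short word becomes an empty string but its row stays. The sample data is a hypothetical subset made up from the question's table.

```python
import pandas as pd

# Hypothetical sample built from a few rows of the question's table.
df = pd.DataFrame({
    "Word": [":", ",", "look", "e", "a", "trump"],
    "Frequency": [79, 60, 26, 26, 25, 2],
})

# The attempt from the question: each cell is rewritten, but no row is dropped.
# Words shorter than 3 characters produce no match and end up as ''.
attempt = df["Word"].str.findall(r"\w{3,}").str.join(" ")
print(attempt.tolist())
print(len(attempt) == len(df))  # same number of rows as before

# Filtering with a boolean mask is what actually removes the short rows.
filtered = df[df["Word"].str.len() >= 3]
print(filtered["Word"].tolist())
```

The difference is that the first expression produces new values for the column, while the second selects which rows to keep.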
Instead of attempting a regular expression, you can use .str.len() to get the length of each string in your column. Then you can simply filter on that length being >= 3.
It should look like:
df.loc[df["Word"].str.len() >= 3]
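As a self-contained sketch of the same idea, the snippet below builds a small dataframe (the rows are assumed from the question's table, with a fresh 0-based index rather than the original 0 to 99), computes the length mask separately, and applies it with `.loc`. The names `mask` and `result` are only illustrative.

```python
import pandas as pd

# Illustrative data taken from the rows visible in the question.
df = pd.DataFrame({
    "Word": [":", ",", "look", "e", "a",
             "trump", "election", "step", "day", "university"],
    "Frequency": [79, 60, 26, 26, 25, 2, 2, 2, 2, 2],
})

# Boolean Series: True where the word has at least 3 characters.
mask = df["Word"].str.len() >= 3

# Keep only the rows where the mask is True; the original index labels are kept.
result = df.loc[mask]
print(result)
```

The filtered frame keeps the original index labels, which matches the expected output in the question. If a fresh 0-based index is preferred, `result.reset_index(drop=True)` can be chained on afterwards.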
Please try:
df[df.Word.str.len()>=3]