How to filter rows and words in lower case in a pandas DataFrame?
Hi, I would like to know how to select the rows that contain lowercase words in the following dataframe:
ID Name Note
1 Fin there IS A dog outside
2 Mik NOTHING TO DECLARE
3 Lau no house
What I would like to do is keep only the rows where the Note column contains at least one word in lower case:
ID Name Note
1 Fin there IS A dog outside
3 Lau no house
and collect all the lowercase words in a list: my_list=['there','dog','outside','no','house']
What I have tried so far to filter the rows is:
df1=df['Note'].str.lower()
To append the words to the list, I think I should first tokenise the string and then select all the terms in lower case. Am I right?
Use Series.str.contains with boolean indexing to keep the rows that contain at least one lowercase character:
df1 = df[df['Note'].str.contains(r'[a-z]')]
print (df1)
ID Name Note
0 1 Fin there IS A dog outside
2 3 Lau no house
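Note that the pattern [a-z] matches any single lowercase character, so a note such as "Hello WORLD" would also pass the filter even though it has no fully lowercase word. The sketch below rebuilds the sample frame and compares that pattern with a word-level one; the fourth row ("Tom", "Hello WORLD") is a hypothetical addition for illustration and is not part of the original data.

```python
import pandas as pd

# Sample frame rebuilt from the question, plus one hypothetical extra row
# ("Hello WORLD") that has lowercase letters but no fully lowercase word.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Fin", "Mik", "Lau", "Tom"],
    "Note": [
        "there IS A dog outside",
        "NOTHING TO DECLARE",
        "no house",
        "Hello WORLD",
    ],
})

# True where the note contains any lowercase ASCII character.
any_lower_char = df["Note"].str.contains(r"[a-z]")

# True where the note contains at least one whole word made only of
# lowercase ASCII letters.
any_lower_word = df["Note"].str.contains(r"\b[a-z]+\b")

ids_char = df.loc[any_lower_char, "ID"].tolist()
ids_word = df.loc[any_lower_word, "ID"].tolist()
print(ids_char)  # [1, 3, 4]
print(ids_word)  # [1, 3]
```

On the three rows from the question both patterns select the same rows, so the choice only matters if mixed-case words can appear in Note.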
Then use Series.str.extractall to extract the lowercase words:
my_list = df1['Note'].str.extractall(r'(\b[a-z]+\b)')[0].tolist()
print (my_list)
['there', 'dog', 'outside', 'no', 'house']
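As a side note, Series.str.extractall returns its matches indexed by the original row label and a match number, so the words can also be grouped back per row. The following is a minimal sketch under the assumption that the frame looks like the one in the question; words_per_row is an illustrative name. It also suggests that the pre-filtering step is optional when only the list is needed, because rows without a match produce no entries.

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Fin", "Mik", "Lau"],
    "Note": ["there IS A dog outside", "NOTHING TO DECLARE", "no house"],
})

# One row per match; the index is (original row label, match number).
# Column 0 holds the single capture group.
matches = df["Note"].str.extractall(r"(\b[a-z]+\b)")[0]

# Flat list of every lowercase word, in document order.
my_list = matches.tolist()

# Words grouped by the original row label (level 0 of the index).
words_per_row = matches.groupby(level=0).apply(list)

print(my_list)
print(words_per_row.to_dict())
```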
Or use a list comprehension that splits each sentence and filters the words with islower:
my_list = [y for x in df1['Note'] for y in x.split() if y.islower()]
print (my_list)
['there', 'dog', 'outside', 'no', 'house']
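The two approaches give the same result on the sample data, but they can differ once punctuation is involved: str.split keeps punctuation attached to a token and str.islower ignores non-letter characters, while the regex only captures runs of a-z. A small sketch with a hypothetical note (not from the original data) shows the difference, using re.findall with the same pattern as a stand-in for Series.str.extractall:

```python
import re

# Hypothetical note with punctuation, for illustration only.
note = "there IS A dog, outside!"

# Whitespace split: punctuation stays attached, and islower() is still True
# because every cased character in the token is lowercase.
split_words = [w for w in note.split() if w.islower()]

# Regex: only whole runs of lowercase ASCII letters are captured.
regex_words = re.findall(r"\b[a-z]+\b", note)

print(split_words)  # ['there', 'dog,', 'outside!']
print(regex_words)  # ['there', 'dog', 'outside']
```

If the notes may contain punctuation, the regex version is probably the safer choice, or the tokens can be stripped of punctuation before the islower check.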