Filtering a pandas DataFrame based on the values in a column
I have been working with some data lately. While filtering it, I found that some columns have issues.
I want to keep only the rows where the value in the Branch column ends with ')'.
I have tried several options, but I want to find the fastest way to do it.
Since you did not provide your data as text, I have created an example DataFrame:
Input:
import pandas as pd

# Odd-indexed rows get the '(4 Years)' suffix; even-indexed rows do not
d = {'college_name': ['College {}'.format(i+1) for i in range(8)], 'branch': ['Civil Enigineering '+ '(4 Years)'*(i%2) for i in range(8)]}
df = pd.DataFrame(data=d, columns=['college_name','branch'])
df
Output:
  college_name                        branch
0    College 1           Civil Enigineering
1    College 2  Civil Enigineering (4 Years)
2    College 3           Civil Enigineering
3    College 4  Civil Enigineering (4 Years)
4    College 5           Civil Enigineering
5    College 6  Civil Enigineering (4 Years)
6    College 7           Civil Enigineering
7    College 8  Civil Enigineering (4 Years)
pandas Series have built-in string processing methods. You can use str.endswith(')') to filter your data. Note that
df['branch'].str.endswith(')')
returns a boolean mask, which you can then use to index the DataFrame.
Input:
df[df['branch'].str.endswith(')')]
Output:
  college_name                        branch
1    College 2  Civil Enigineering (4 Years)
3    College 4  Civil Enigineering (4 Years)
5    College 6  Civil Enigineering (4 Years)
7    College 8  Civil Enigineering (4 Years)
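If your real data contains missing values or stray trailing whitespace, the plain mask may not behave as you expect. The sketch below is one way to handle both. The sample rows are hypothetical, not taken from your data. It also shows a list-comprehension variant, which may be faster on plain object columns, though that is worth timing on your own data rather than assuming.

```python
import pandas as pd

# Hypothetical sample data: one missing value and one entry with trailing
# whitespace, neither of which appears in the example DataFrame above.
df = pd.DataFrame({
    "college_name": ["College 1", "College 2", "College 3", "College 4"],
    "branch": [
        "Civil Engineering",
        "Civil Engineering (4 Years)",
        None,
        "Civil Engineering (4 Years) ",
    ],
})

# Strip surrounding whitespace first, then test the suffix.
# na=False makes missing values count as non-matches instead of propagating NaN.
mask = df["branch"].str.strip().str.endswith(")", na=False)
filtered = df[mask]

# Alternative: a plain Python list comprehension producing the same mask.
# The isinstance check skips missing (non-string) values.
fast_mask = [isinstance(s, str) and s.strip().endswith(")") for s in df["branch"]]
filtered_fast = df[fast_mask]
```

Both filtered and filtered_fast keep the rows for College 2 and College 4. To compare speed, you could time each mask with timeit on a sample of your actual column.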