How to select rows in a dataframe based on a condition

Question

I have an emails dataframe on which I ran this query:

williams = emails[emails["employee"] == "kean-s"]

This selects all the rows where employee is kean-s. Then I count the frequencies and print the most common ones. This is how it's done:

williams["X-Folder"].value_counts()[:10]

This gives output like this:

attachments                   2026
california                     682
heat wave                      244
ferc                           188
pr-crisis management            92
federal legislation             88
rto                             78
india                           75
california - working group      72
environmental issues            71

Now, I need to print all the rows from emails whose X-Folder column equals attachments, california, heat wave, etc. How do I go about it? When I print values[0] it simply returns the frequency number and not the term corresponding to it. (I tried printing it because if I'm able to loop through the terms, I'll just put a condition inside the dataframe.)

Answer 1

Use Series.isin with boolean indexing, passing the index values of the counts:

df = williams[williams["X-Folder"].isin(williams["X-Folder"].value_counts()[:10].index)]

Or:

df = williams[williams["X-Folder"].isin(williams["X-Folder"].value_counts().index[:10])]
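To see why this works, here is a minimal sketch on made-up data (the toy rows and the top-2 cutoff are assumptions for illustration, not the real emails data). value_counts() returns a Series whose values are the counts and whose index holds the folder names, so the index is what isin needs:

```python
import pandas as pd

# Toy stand-in for the filtered frame; the rows are invented for illustration.
williams = pd.DataFrame({
    "employee": ["kean-s"] * 6,
    "X-Folder": ["attachments", "attachments", "attachments",
                 "california", "california", "ferc"],
})

counts = williams["X-Folder"].value_counts()
# counts.iloc[0] is the highest frequency; counts.index[0] is the matching folder name.
top_folders = counts.index[:2]

# Boolean indexing: keep only the rows whose folder is among the top 2.
df = williams[williams["X-Folder"].isin(top_folders)]
```

This is also why printing the first value in the question returned a number: the terms live in the index, not in the values.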

If you need to filter all rows in the original DataFrame (including rows that do not match kean-s), then use:

df1 = emails[emails["X-Folder"].isin(williams["X-Folder"].value_counts().index[:10])]
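The difference from the previous variants can be sketched on a small invented frame (the rows, the second employee lay-k, and the top-1 cutoff are assumptions for illustration only). The top folders are still computed from the kean-s rows, but the mask is applied to emails, so matching rows from other employees are kept as well:

```python
import pandas as pd

# Invented sample data; only the column names come from the question.
emails = pd.DataFrame({
    "employee": ["kean-s", "kean-s", "kean-s", "lay-k", "lay-k"],
    "X-Folder": ["attachments", "attachments", "ferc", "attachments", "inbox"],
})

williams = emails[emails["employee"] == "kean-s"]
# Top 1 folder here to keep the toy data small; the answer uses the top 10.
top_folders = williams["X-Folder"].value_counts().index[:1]

# Filtering emails instead of williams also keeps lay-k's "attachments" row.
df1 = emails[emails["X-Folder"].isin(top_folders)]
```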

Answer 1: accepted, 2019-09-25 13:03:30