
Pandas/Python: How can I filter a DataFrame or Series by using functions?

I have a DataFrame df in my Jupyter Notebook with, among others, the columns "Age" and "Name".

Usually, when I just want the entries fulfilling certain criteria, I filter, e.g., by

df[df["Age"]>20]

meaning "df where df["Age"] happens to be above 20", and therefore it shows only the entries of df where the age is above 20.

Now I want to get only the entries where the Name contains "Alex":

df[df["Name"].find("Alex")>-1]         # ".find" returns -1 if "Alex" is not in the checked string

Actually, this does not work, seemingly because the function gets applied to the whole Series, which is obviously nonsense and therefore gives an error. I would not have expected this behavior, because df["Age"]>20 in the first example worked (meaning it got applied to every single cell of "Age" and not to the Series itself). Any ideas how I can fix this?

Yours sincerely :)

alex_df = df[df['Name'] == 'Alex']  

If you have multiple names, you can use the following:

name_list = ['Alex', 'Sam', 'Donna']

names_df = df[df['Name'].isin(name_list)] 
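
To make the behavior of these two filters concrete, here is a minimal sketch on a made-up DataFrame (the sample names and ages are assumptions, not the asker's data). Note that both == and isin compare against the whole cell value, so a name such as "Alexander" is not selected by either of them:

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "Name": ["Alex", "Alexander", "Sam", "Donna"],
    "Age": [25, 31, 19, 42],
})

# Exact match on a single value
alex_df = df[df["Name"] == "Alex"]

# Exact match against any value in a list
name_list = ["Alex", "Sam", "Donna"]
names_df = df[df["Name"].isin(name_list)]

print(alex_df["Name"].tolist())   # ['Alex']
print(names_df["Name"].tolist())  # ['Alex', 'Sam', 'Donna']
```

If the goal is a substring match rather than an exact match, these two filters are probably not what you want.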

Searching for rows where the column Name contains Alex as a substring is probably easiest with the pandas string accessor:

df[df["Name"].str.contains("Alex")]

More details can be found in the pandas documentation.
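
As for why the original attempt fails: comparison operators such as > are defined on a Series and work element by element, whereas find is a method of Python's str, and a Series has no such method. The .str accessor exposes string methods element-wise. Below is a rough sketch of two ways to do this, again on made-up data (the sample rows and the helper name has_alex are assumptions for illustration). The second variant shows how an arbitrary Python function can be used as a filter via apply:

```python
import pandas as pd

# Hypothetical sample data, for illustration only (includes a missing name)
df = pd.DataFrame({
    "Name": ["Alex", "Alexander", "Sam", None],
    "Age": [25, 31, 19, 42],
})

# Vectorized substring test through the .str accessor.
# na=False treats missing names as "no match";
# regex=False treats "Alex" as a literal string rather than a pattern.
mask = df["Name"].str.contains("Alex", na=False, regex=False)
by_accessor = df[mask]

# Element-wise version of the original idea: call str.find on each cell.
# has_alex is a hypothetical helper, not part of pandas.
def has_alex(name):
    return isinstance(name, str) and name.find("Alex") > -1

by_function = df[df["Name"].apply(has_alex)]

print(by_accessor["Name"].tolist())  # ['Alex', 'Alexander']
print(by_function["Name"].tolist())  # ['Alex', 'Alexander']
```

The accessor version is usually preferable because it is shorter and handles missing values through na; the apply version is mainly useful when the condition cannot be expressed with the built-in string methods.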
