In Pandas, how can I filter all rows for a unique ID based on a condition affecting only certain entries?

I am working with a DataFrame like this:

import pandas as pd

# One row per person and period; dates start out as ISO strings
records = [
    {'Name': 'John', 'Start': '2020-01-01', 'Stop': '2020-03-31'},
    {'Name': 'John', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Mary', 'Start': '2020-01-01', 'Stop': '2020-03-15'},
    {'Name': 'Mary', 'Start': '2020-03-16', 'Stop': '2020-03-31'},
    {'Name': 'Mary', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
    {'Name': 'Stan', 'Start': '2020-02-01', 'Stop': '2020-03-31'},
    {'Name': 'Stan', 'Start': '2020-04-01', 'Stop': '2020-12-31'},
]
df = pd.DataFrame(records)
# Convert the date columns from strings to datetime64
df['Start'] = pd.to_datetime(df['Start'])
df['Stop'] = pd.to_datetime(df['Stop'])
df

which gives the output:

    Name    Start       Stop
0   John    2020-01-01  2020-03-31
1   John    2020-04-01  2020-12-31
2   Mary    2020-01-01  2020-03-15
3   Mary    2020-03-16  2020-03-31
4   Mary    2020-04-01  2020-12-31
5   Stan    2020-02-01  2020-03-31
6   Stan    2020-04-01  2020-12-31

What I want to do is select all the records for every individual who has at least one record with a start date of 2020-01-01. That is, if someone doesn't have a record beginning on 1/1, then I don't want any of their records. The result should look like this:

    Name    Start       Stop
0   John    2020-01-01  2020-03-31
1   John    2020-04-01  2020-12-31
2   Mary    2020-01-01  2020-03-15
3   Mary    2020-03-16  2020-03-31
4   Mary    2020-04-01  2020-12-31

There should be no records for Stan in the output, because none of his entries start on 2020-01-01. Any ideas on how to accomplish this? Thanks!

Evaluate the condition on each row, then group it by Name and use transform('any') so that every row of a qualifying person is kept:

df[df['Start'].eq("2020-01-01").groupby(df["Name"]).transform('any')]

   Name      Start       Stop
0  John 2020-01-01 2020-03-31
1  John 2020-04-01 2020-12-31
2  Mary 2020-01-01 2020-03-15
3  Mary 2020-03-16 2020-03-31
4  Mary 2020-04-01 2020-12-31
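
To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps and rebuilds the question's data so it runs on its own. The intermediate names (starts_on_target, keep, names, result_isin) are made up for illustration and do not appear in the original answer. The isin variant at the end is an alternative I am adding for comparison; it should give the same rows for this data.

```python
import pandas as pd

# Same data as in the question, rebuilt so the sketch is self-contained
df = pd.DataFrame({
    'Name': ['John', 'John', 'Mary', 'Mary', 'Mary', 'Stan', 'Stan'],
    'Start': pd.to_datetime(['2020-01-01', '2020-04-01', '2020-01-01',
                             '2020-03-16', '2020-04-01', '2020-02-01',
                             '2020-04-01']),
    'Stop': pd.to_datetime(['2020-03-31', '2020-12-31', '2020-03-15',
                            '2020-03-31', '2020-12-31', '2020-03-31',
                            '2020-12-31']),
})

# Step 1: row-level condition, True only where that row starts on the target date
starts_on_target = df['Start'].eq(pd.Timestamp('2020-01-01'))

# Step 2: within each Name, ask "is any row True?" and broadcast the answer
# back to every row of that group, so the mask has the same length as df
keep = starts_on_target.groupby(df['Name']).transform('any')

# Step 3: boolean-index the original frame with the group-level mask
result = df[keep]
print(result)

# Alternative sketch (an assumption, not from the answer): collect the
# qualifying names first, then keep every row whose Name is in that set
names = df.loc[starts_on_target, 'Name'].unique()
result_isin = df[df['Name'].isin(names)]
```

The transform step is what distinguishes this from a plain row filter: filtering on starts_on_target alone would keep only the two rows dated 2020-01-01, whereas the broadcast mask keeps all rows for John and Mary.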
