
How to select identical rows from a pandas DataFrame based on certain columns

I'm new to pandas and I'm having a problem selecting rows from a DataFrame.

The following is my DataFrame:

   Index  Column1  Column2   Column3 Column4
       0     1234      500   NEWYORK      NY
       1     5678      700    AUSTIN      TX
       2     1234      300   NEWYORK      NY
       3     8910      235  RICHMOND      FL

I want to select the rows that have the same values in Column1, Column3 and Column4 (that is, rows that are identical in terms of these three columns). The output DataFrame should therefore contain the rows with index 0 and 2.

Can anyone help me with a step-by-step procedure for this custom selection?

Use df.duplicated on the columns of interest to build a boolean mask, then index into df with it:

c = ['Column1', 'Column3', 'Column4']
df = df[df[c].duplicated(keep=False)]

df

   Index  Column1  Column2  Column3 Column4
0      0     1234      500  NEWYORK      NY
2      2     1234      300  NEWYORK      NY

keep=False marks every row of a duplicate group as True, including the first occurrence, so all matching rows survive the filter.
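To make the effect of keep concrete, here is a minimal sketch that rebuilds the sample data and compares the three settings. The DataFrame construction is an assumption based on the table in the question (the separate Index column is left out and the default integer index is used instead), and the mask_* names are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed; default index 0..3)
df = pd.DataFrame({
    "Column1": [1234, 5678, 1234, 8910],
    "Column2": [500, 700, 300, 235],
    "Column3": ["NEWYORK", "AUSTIN", "NEWYORK", "RICHMOND"],
    "Column4": ["NY", "TX", "NY", "FL"],
})

c = ["Column1", "Column3", "Column4"]

# keep=False: every member of a duplicate group is flagged
mask_all = df.duplicated(subset=c, keep=False)
# keep="first" (the default): the first occurrence is NOT flagged
mask_first = df.duplicated(subset=c, keep="first")
# keep="last": the last occurrence is NOT flagged
mask_last = df.duplicated(subset=c, keep="last")

result = df[mask_all]
print(result)
```

With this data, mask_all flags rows 0 and 2, mask_first flags only row 2, and mask_last flags only row 0. Passing subset=c to df.duplicated is equivalent to calling df[c].duplicated as in the answer.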

Earlier, I was using the following approach:

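d = df.T.to_dict()   # {row_label: {column_name: value}}

# Collect the Column1 values of rows that match another row on all three columns
dup = set()
for i in d:
    for j in d:
        if i != j:
            if (d[i]['Column1'] == d[j]['Column1']
                    and d[i]['Column3'] == d[j]['Column3']
                    and d[i]['Column4'] == d[j]['Column4']):
                dup.add(d[i]['Column1'])

# Note: this final filter matches on Column1 only
dup_rows = df[df['Column1'].isin(dup)]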
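One caveat with the loop-based approach is that its last step filters on column1 alone, so it can pick up rows that share that value without matching on the other two columns. The sketch below illustrates this with a hypothetical fifth row (label 4) that is not in the original data; dup_keys, by_column1 and by_all_three are illustrative names.

```python
import pandas as pd

# Sample data plus one hypothetical extra row (label 4) that shares Column1
# with rows 0 and 2 but differs in Column3 and Column4.
df = pd.DataFrame({
    "Column1": [1234, 5678, 1234, 8910, 1234],
    "Column2": [500, 700, 300, 235, 900],
    "Column3": ["NEWYORK", "AUSTIN", "NEWYORK", "RICHMOND", "BOSTON"],
    "Column4": ["NY", "TX", "NY", "FL", "MA"],
})

c = ["Column1", "Column3", "Column4"]
is_dup = df.duplicated(subset=c, keep=False)

# Filtering on Column1 values only (what the loop's last line does)
dup_keys = set(df.loc[is_dup, "Column1"])
by_column1 = df[df["Column1"].isin(dup_keys)]

# Filtering on the full three-column match
by_all_three = df[is_dup]

print(by_column1.index.tolist())    # includes row 4
print(by_all_three.index.tolist())  # rows 0 and 2 only
```

Filtering with the boolean mask directly avoids the problem, and it also avoids comparing every pair of rows in Python.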

