
How to find all rows that meet a condition in pandas

import numpy as np
import pandas as pd

df = pd.read_csv('Salaries.csv',engine='python')

print( df[ df['JobTitle'].value_counts()==1 ] )

I'm trying to get every row whose JobTitle value appears exactly once in the JobTitle column.

However, I keep getting this error: pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Here is the Salaries.csv file:

Id,EmployeeName,JobTitle,BasePay,OvertimePay,OtherPay,Benefits,TotalPay,TotalPayBenefits,Year,Notes,Agency,Status
1,NATHANIEL FORD,GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY,167411.18,0.0,400184.25,,567595.43,567595.43,2011,,San Francisco,
2,GARY JIMENEZ,CAPTAIN III (POLICE DEPARTMENT),155966.02,245131.88,137811.38,,538909.28,538909.28,2011,,San Francisco,
3,ALBERT PARDINI,CAPTAIN III (POLICE DEPARTMENT),212739.13,106088.18,16452.6,,335279.91,335279.91,2011,,San Francisco,
4,CHRISTOPHER CHONG,WIRE ROPE CABLE MAINTENANCE MECHANIC,77916.0,56120.71,198306.9,,332343.61,332343.61,2011,,San Francisco,

Sorry if that's hard to read - if it is, here is a pastebin: https://pastebin.com/raw/eCfVj1Et

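Another solution uses transform, which returns the per-group count aligned to the original rows, so the result can be used directly as a boolean mask: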

df[df.groupby('JobTitle')['JobTitle'].transform('count').eq(1)]
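To see why the original attempt fails while transform works, here is a minimal sketch on a small made-up frame. The four rows mirror the sample in the question, but the frame is only a stand-in for Salaries.csv and the variable names are illustrative. value_counts() returns a Series indexed by job title, so it cannot be aligned with the row index of df; transform('count') returns one value per row instead:

```python
import pandas as pd

# Hypothetical stand-in for Salaries.csv (only two of its columns)
df = pd.DataFrame({
    "EmployeeName": ["NATHANIEL FORD", "GARY JIMENEZ",
                     "ALBERT PARDINI", "CHRISTOPHER CHONG"],
    "JobTitle": [
        "GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "WIRE ROPE CABLE MAINTENANCE MECHANIC",
    ],
})

# Indexed by job title (3 entries), not by row label (4 rows),
# which is why df[counts == 1] raises IndexingError.
counts = df["JobTitle"].value_counts()

# Same index as df: each row carries the count of its own job title.
per_row = df.groupby("JobTitle")["JobTitle"].transform("count")

# Keep only the rows whose job title occurs exactly once.
unique_jobs = df[per_row.eq(1)]
print(unique_jobs)
```

On this sample the mask keeps the rows for the general manager and the cable mechanic and drops both police captains.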

You can do it in a single line by taking the index values of value_counts() where the count equals 1 and passing them to isin() (here 'A' stands for the column name, e.g. 'JobTitle'):

df[df['A'].isin((df['A'].value_counts() == 1).replace({False:np.nan}).dropna().index)]

Perhaps a bit better and easier to understand, in two lines of code:

values = df['A'].value_counts()
df[df['A'].isin(values.index[values.eq(1)])]
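Applied to the question's column, the two-line version might look like the sketch below. The small frame is a made-up stand-in for Salaries.csv, and the names counts, single_titles and result are illustrative only:

```python
import pandas as pd

# Hypothetical stand-in for Salaries.csv (only two of its columns)
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "JobTitle": [
        "GENERAL MANAGER-METROPOLITAN TRANSIT AUTHORITY",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "CAPTAIN III (POLICE DEPARTMENT)",
        "WIRE ROPE CABLE MAINTENANCE MECHANIC",
    ],
})

# Count how often each job title occurs (index = job title).
counts = df["JobTitle"].value_counts()

# Job titles that occur exactly once.
single_titles = counts.index[counts.eq(1)]

# isin() builds a mask aligned with df's rows, so indexing works.
result = df[df["JobTitle"].isin(single_titles)]
print(result)
```

The key difference from the failing code is that isin() is evaluated on df['JobTitle'] itself, so the boolean mask shares the row index of df.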

