简体   繁体   中英

Python Pandas extract unique values from a column and another column

I am studying pandas, bokeh etc. to get started with Data Vizualisation. Right now I am practising with a giant table containing different birds. There are plenty of columns; two of those columns are "SCIENTIFIC NAME" and another one is "OBSERVATION COUNT". I want to extract those two columns.

I did

df2 = df[["SCIENTIFIC NAME" , "OBSERVATION COUNT"]]

but the problem then is, that every entry is inside the table (since sometimes there are multiple entries/rows due to other columns of the same SCIENTIFIC NAME, but the OBSERVATION COUNT is always the same for the scientific name)

How can I get those two sectors but with the unique values, so every scientific name once, with the corresonding observation count.

EDIT: I just realized that sometimes the same scientific names have different observation counts due to another column. Is there a way to extract every first unique item from a column

IIUC, You can use drop_duplicates :

df2 = df[["SCIENTIFIC NAME" , "OBSERVATION COUNT"]].drop_duplicates()

To get counts:

df2 = df.groupby(["SCIENTIFIC NAME" , "OBSERVATION COUNT"])["SCIENTIFIC NAME"].count()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM