Python Pandas extract unique values from a column and another column
I am studying pandas, bokeh etc. to get started with data visualisation. Right now I am practising with a giant table containing different birds. There are plenty of columns; two of them are "SCIENTIFIC NAME" and "OBSERVATION COUNT". I want to extract those two columns.
I did
df2 = df[["SCIENTIFIC NAME" , "OBSERVATION COUNT"]]
but the problem is that every row ends up in the result: the same SCIENTIFIC NAME can appear in multiple rows because of other columns, even though the OBSERVATION COUNT is always the same for a given scientific name.
How can I get those two columns with unique values only, so that every scientific name appears once with its corresponding observation count?
EDIT: I just realized that sometimes the same scientific name has different observation counts due to another column. Is there a way to extract only the first occurrence of each unique item in a column?
IIUC, you can use drop_duplicates:
df2 = df[["SCIENTIFIC NAME" , "OBSERVATION COUNT"]].drop_duplicates()
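For the case raised in the EDIT, where one scientific name carries several different counts, a rough sketch follows. The sample data and the "LOCALITY" column are made up for illustration; only the two column names come from the question. It assumes that "first" means the first row in the DataFrame's current order.

```python
import pandas as pd

# Hypothetical sample data; "LOCALITY" stands in for the other columns.
df = pd.DataFrame({
    "SCIENTIFIC NAME": ["Parus major", "Parus major", "Parus major", "Turdus merula"],
    "OBSERVATION COUNT": [3, 3, 5, 2],
    "LOCALITY": ["A", "B", "C", "A"],
})

# One row per distinct (name, count) pair: a name with two different
# counts still appears twice.
unique_pairs = df[["SCIENTIFIC NAME", "OBSERVATION COUNT"]].drop_duplicates()

# One row per name, keeping the first row in which each name appears.
first_per_name = df.drop_duplicates(subset="SCIENTIFIC NAME", keep="first")[
    ["SCIENTIFIC NAME", "OBSERVATION COUNT"]
]

print(unique_pairs)
print(first_per_name)
```

Passing subset="SCIENTIFIC NAME" makes pandas compare only that column when deciding what counts as a duplicate, so the other columns no longer produce extra rows.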
To get counts:
df2 = df.groupby(["SCIENTIFIC NAME" , "OBSERVATION COUNT"])["SCIENTIFIC NAME"].count()
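Note that the groupby above counts how many rows share each (name, count) pair; it does not sum the observation counts. A minimal sketch of the same idea on made-up data, using size() and a hypothetical "ROWS" column name to get a flat table back:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "SCIENTIFIC NAME": ["Parus major", "Parus major", "Parus major", "Turdus merula"],
    "OBSERVATION COUNT": [3, 3, 5, 2],
})

# Number of rows for each (name, count) pair, as a regular DataFrame.
pair_counts = (
    df.groupby(["SCIENTIFIC NAME", "OBSERVATION COUNT"])
      .size()
      .reset_index(name="ROWS")  # "ROWS" is an arbitrary column name
)

print(pair_counts)
```

Here size() gives the same numbers as selecting a column and calling count(), provided that column has no missing values.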