
Get unique values of multiple columns as a new dataframe in pandas

Given a pandas DataFrame df with at least the columns C1, C2, and C3, how would you get all the unique (C1, C2, C3) combinations as a new DataFrame?

In other words, something similar to:

SELECT C1,C2,C3
FROM T
GROUP BY C1,C2,C3

I tried this:

print(df.groupby(by=['C1','C2','C3']))

but I'm getting:

<pandas.core.groupby.DataFrameGroupBy object at 0x000000000769A9E8>

I believe you need drop_duplicates if you want all unique triples. With subset, the result keeps the first row for each triple, including any other columns:

df = df.drop_duplicates(subset=['C1','C2','C3'])
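If the new DataFrame should contain only the three key columns, as the SQL in the question does, one option is to select those columns before dropping duplicates. The following is a minimal sketch; the sample data and the extra column C4 are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample data; C4 stands in for any additional column.
df = pd.DataFrame({
    'C1': [1, 1, 2, 2],
    'C2': ['a', 'a', 'b', 'b'],
    'C3': [True, True, False, True],
    'C4': [10, 20, 30, 40],
})

# Select the three key columns first, then drop repeated rows.
# reset_index(drop=True) renumbers the result from 0.
unique_triples = df[['C1', 'C2', 'C3']].drop_duplicates().reset_index(drop=True)
print(unique_triples)
```

Assigning to a new name such as unique_triples leaves the original df untouched, which may be preferable to overwriting it.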

If you want to use groupby, add first:

df = df.groupby(by=['C1','C2','C3'], as_index=False).first()
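A bare groupby call only returns a lazy DataFrameGroupBy object, which is why the question's print shows an object address; an aggregation such as first() or size() is what produces a result. As a rough sketch of the closest analogue to GROUP BY with COUNT(*), size() can be used instead of first(). The sample data and the column name n are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; C4 stands in for any additional column.
df = pd.DataFrame({
    'C1': [1, 1, 2, 2],
    'C2': ['a', 'a', 'b', 'b'],
    'C3': [True, True, False, True],
    'C4': [10, 20, 30, 40],
})

# One row per unique (C1, C2, C3) triple, plus how many rows had it.
counts = df.groupby(['C1', 'C2', 'C3']).size().reset_index(name='n')
print(counts)

# Keep only the key columns if the counts are not needed.
unique_triples = counts[['C1', 'C2', 'C3']]
```

One difference worth keeping in mind: by default groupby skips rows where any key is NaN and sorts the result by the keys, while drop_duplicates keeps NaN rows and preserves the original row order.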

