简体   繁体   中英

Pandas dataframe get unique value of a column

I'm trying to get the unique available value for each site. The original pandas dataframe is with three columns:

Site Available Capacity
A 7 20
A 7 20
A 8 20
B 15 35
B 15 35
C 12 25
C 12 25
C 11 25

and I want to get the unique available of each site. The desired table is like below:

Site Unique Available
A 7
8
B 15
C 12
11

You can get the lists of unique available per site with GroupBy.unique()

>>> df.groupby('Site')['Available'].unique()
Site
A      [7, 8]
B        [15]
C    [12, 11]
Name: Available, dtype: object

Then with explode() you can expand these lists and with reset_index() get the index back to a column:

>>> df.groupby('Site')['Available'].unique().explode().reset_index()
  Site Available
0    A         7
1    A         8
2    B        15
3    C        12
4    C        11

Otherwise simply get both columns and remove duplicates:

>>> df[['Site', 'Available']].drop_duplicates()
  Site  Available
0    A          7
2    A          8
3    B         15
5    C         12
7    C         11

Approach with: GroupBy.apply() + Series.drop_duplicates()

(df.groupby('Site')['Available']
   .apply(lambda s: s.drop_duplicates())
   .reset_index(level=1, drop=True)
   .reset_index(name='Unique Available')
)

Result:

  Site  Unique Available
0    A                 7
1    A                 8
2    B                15
3    C                12
4    C                11

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM