
Python function with pandas dataframe and column name as inputs

I am trying to write a function that takes a Pandas DataFrame (df) and a column name (col) as inputs and returns a list of all unique values in the column in sorted order. I am trying to do this without any module methods.

I am using the following code:

import pandas as pd

def list_col(df, col):
    """puts unique items of given column in a list"""
    f = pd.df()
    l = []
    r = f.loc[:,col]
    for i in r:
        if i not in l:
            l.append(i)
        return l.sort()

However, I get the error message:

AttributeError: module 'pandas' has no attribute 'df'

How can I fix this? Thanks!

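The error occurs because pandas has no attribute named df; the DataFrame is already passed in as the argument df, so the function does not need to create one. I think you can use unique and then call sorted: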

def list_col(df, col):
    return sorted(df[col].unique())

Or convert the column to a set, then to a list, and call sorted:

def list_col(df, col):
    return sorted(list(set(df[col])))
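
Both versions depend on sorted for the ordering: Series.unique returns values in order of first appearance, and a set has no defined order. A minimal sketch of the difference, using a made-up Series (the names s and first_seen are only for illustration):

```python
import pandas as pd

s = pd.Series(list('acabbb'))  # hypothetical sample data

# unique() keeps the order in which values first appear
first_seen = list(s.unique())
print(first_seen)          # ['a', 'c', 'b']

# sorted() returns a new, ordered list in both approaches
print(sorted(s.unique()))  # ['a', 'b', 'c']
print(sorted(set(s)))      # ['a', 'b', 'c']
```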

Sample:

df = pd.DataFrame({'A':list('ddcdef'),
                   'B':[4,5,4,5,5,4],
                   'F':list('acabbb')})

print(df)
   A  B  F
0  d  4  a
1  d  5  c
2  c  4  a
3  d  5  b
4  e  5  b
5  f  4  b

def list_col(df, col):
    return sorted(df[col].unique())

print(list_col(df, 'F'))
['a', 'b', 'c']
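
If the goal is to avoid pandas methods, as the question asks, the original loop can probably be repaired rather than replaced. Besides the pd.df() call, the question's code has two other problems: the return sits inside the for loop, so the function exits after the first item, and list.sort() sorts in place and returns None. A sketch of one possible fix, where the function name list_col_loop and the variable names are my own choices, not from the question:

```python
import pandas as pd

def list_col_loop(df, col):
    """Return the unique values of df[col] as a sorted list, using a plain loop."""
    unique_values = []
    for value in df[col]:              # iterate over the column of the df passed in
        if value not in unique_values:
            unique_values.append(value)
    unique_values.sort()               # sorts in place and returns None
    return unique_values               # so return the list itself, after the loop

df = pd.DataFrame({'A': list('ddcdef'),
                   'B': [4, 5, 4, 5, 5, 4],
                   'F': list('acabbb')})

print(list_col_loop(df, 'A'))  # ['c', 'd', 'e', 'f']
```

The membership test on a list makes this loop slower than the unique or set versions for large columns, so it is mainly useful as an exercise.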
