Python - how to pass a dynamic Series name and DataFrame name as a function argument?
I want to write a function that receives a pandas DataFrame and a series (column) name, and returns the unique values of that series and their frequencies in the dataset.
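import pandas as pd

def getUniqueValuesByField(dataframe, fieldname):
    ''' Retrieve for non-numerical series the unique values and their frequencies '''
    # dataframe.fieldname would look for a column literally called "fieldname";
    # dataframe[fieldname] uses the string stored in the variable instead.
    result = dataframe[fieldname].apply(lambda x: pd.Series(x)).unstack().value_counts(normalize=True, sort=True, ascending=False, bins=None, dropna=True)
    #dataframe[fieldname].unique()
    return result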
Then I can call this function as follows:
df = pd.DataFrame.from_dict(RequestsDict)
getUniqueValuesByField(df, 'detected_language')
getUniqueValuesByField(df, 'detected_vertical')
Is it possible? I tried to concatenate strings and use the eval() function, but I'm not sure this is the correct way to do that.
Select the column with bracket indexing and call .value_counts() on it, like so:
In [35]: df = pd.DataFrame(['foo','bar','baz', 'foo','bar'], columns=['test'])
In [36]: df['test'].value_counts()
Out[36]:
foo 2
bar 2
baz 1
dtype: int64
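To tie this back to the question, here is a minimal sketch of a function that takes the column name as a string. It assumes the column holds plain scalar values (strings, for example). The function name, the `normalize` parameter and the sample data are made up for illustration. No eval() is needed, because `dataframe[fieldname]` looks up whatever name the variable holds.

```python
import pandas as pd


def get_unique_values_by_field(dataframe, fieldname, normalize=False):
    """Return the unique values of one column and how often each occurs."""
    # Bracket indexing uses the *value* of fieldname as the column label,
    # whereas dataframe.fieldname would look for a column named "fieldname".
    return dataframe[fieldname].value_counts(normalize=normalize)


# Hypothetical sample data standing in for the asker's RequestsDict.
df = pd.DataFrame(
    {
        "detected_language": ["en", "fr", "en", "de", "en"],
        "detected_vertical": ["news", "news", "sports", "news", "sports"],
    }
)

# Absolute counts for one column, relative frequencies for another.
language_counts = get_unique_values_by_field(df, "detected_language")
vertical_shares = get_unique_values_by_field(df, "detected_vertical", normalize=True)

print(language_counts)
print(vertical_shares)
```

getattr(dataframe, fieldname) would also work when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute, but bracket indexing works for any column label, so it is the safer default.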