
Frequency of values in a column across multiple pandas data frames

I have multiple pandas data frames (more than 70), each with the same columns. Say there are only 10 rows in each data frame. I want to count the occurrences of each value in one column (Name in the example below) across the data frames and list them. Example:

# Import pandas library 
import pandas as pd 
  
# initialize list of lists 
data = [['tom', 10], ['nick', 15], ['juli', 14]] 
  
# Create the pandas DataFrame 
df = pd.DataFrame(data, columns = ['Name', 'Age']) 

data = [['sam', 12], ['nick', 15], ['juli', 14]] 

df2 = pd.DataFrame(data, columns = ['Name', 'Age']) 

I am expecting the output to be:

Name  Age
 tom    1
 sam    1
nick    2
juli    2

You can do the following:

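from collections import Counter

# Map a label to every data frame (df1 here is the frame the question calls df)
d={'df1':df1, 'df2':df2, ..., 'df70':df70}
# Pull the Name column out of each frame as a plain list
l=[list(d[i]['Name']) for i in d]
# Flatten the list of lists into a single list of names
m=sum(l, [])
# Count how many times each name occurs
result=Counter(m)

print(result)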
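
If the frames are held in a list instead of 70 separately named variables, the counts can also be accumulated one frame at a time, without flattening first. This is a minimal sketch, assuming the two sample frames from the question stand in for the full set; the `frames` list is a hypothetical container, not something from the question.

```python
from collections import Counter

import pandas as pd

# Sample frames from the question; in practice the list would hold all 70+.
df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])
df2 = pd.DataFrame([['sam', 12], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])
frames = [df, df2]  # hypothetical container for all data frames

result = Counter()
for frame in frames:
    # Counter.update adds each frame's names to the running totals.
    result.update(frame['Name'].tolist())

print(result)
```

`result.most_common()` then returns the names as (name, count) pairs, most frequent first.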

Do you want the value counts of the Name column across all data frames?

main = pd.concat([df,df2])
main["Name"].value_counts()

juli    2
nick    2
sam     1
tom     1
Name: Name, dtype: int64
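
To get a two-column table like the one the question asks for, rather than a Series, the result of `value_counts()` can be turned back into a data frame. This is a sketch on the question's sample data; the column label `Count` is an assumed name chosen for illustration.

```python
import pandas as pd

# Sample frames from the question.
df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])
df2 = pd.DataFrame([['sam', 12], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])

counts = (
    pd.concat([df, df2])['Name']
    .value_counts()               # Series indexed by name, values are counts
    .rename_axis('Name')          # make sure the index is labelled 'Name'
    .reset_index(name='Count')    # index becomes a column, values become 'Count'
)

print(counts)
```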

You can try this:

df = pd.concat([df, df2]).groupby('Name', as_index=False).count()
df.rename(columns={'Age': 'Count'}, inplace=True)
print(df)

   Name  Count
0  juli    2
1  nick    2
2   sam    1
3   tom    1

This works if your data frames are not costly to concatenate:

pd.concat([x['Name'] for x in [df,df2]]).value_counts()

nick    2
juli    2
tom     1
sam     1
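
If concatenation is a concern, one possible alternative is to count each frame separately and add the per-frame counts together. This is only a sketch under that assumption, using the question's sample data; the `frames` list is hypothetical, and whether it is actually cheaper depends on the data.

```python
from functools import reduce

import pandas as pd

# Sample frames from the question; `frames` is a hypothetical list of all frames.
df = pd.DataFrame([['tom', 10], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])
df2 = pd.DataFrame([['sam', 12], ['nick', 15], ['juli', 14]], columns=['Name', 'Age'])
frames = [df, df2]

# One small Series of counts per frame, produced lazily.
per_frame = (frame['Name'].value_counts() for frame in frames)

# Add the Series pairwise; fill_value=0 treats a name missing from one side as 0.
total = reduce(lambda a, b: a.add(b, fill_value=0), per_frame)

# Aligned addition yields floats, so cast back to integers.
total = total.astype(int)

print(total)
```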

You can try this:

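# df and df2 are the frames defined in the question
combined = pd.concat([df, df2])
# Count the rows per name, then turn the result back into a data frame
counts = combined.groupby(['Name'])['Age'].count().to_frame().reset_index()
counts = counts.rename(columns={"Age": "Count"})
print(counts)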
