python pandas - groupby unstack pivot - count len - mixed values and export
I am trying to combine data from a file of approximately 70,000 lines. For several types of results, I need to export a file (CSV, for example).
After import, the file that contains the data gives this df:
df = pd.DataFrame({
'sec_Id':["to","ti","tu","ta","ty","te"],
'sec_Orga':['CNP','COF','COF','POS','POS','POS'],
'sec_Etat':['Sorti(e)','Valide','Suspendu(e)','Valide','Suspendu(e)','Suspendu(e)']
})
df
Out[59]:
sec_Etat sec_Id sec_Orga
0 Sorti(e) to CNP
1 Valide ti COF
2 Suspendu(e) tu COF
3 Valide ta POS
4 Suspendu(e) ty POS
5 Suspendu(e) te POS
In the end, I would like to get this total result:
Total Valide Suspendu(e) Sorti(e)
CNP 1 0 0 1
COF 2 1 1 0
POS 3 1 2 0
As you can see, it combines a 'Total' column with one column for each unique value of the 'sec_Etat' column.
I tried groupby, unstack and pivot, but nothing worked.
After that, I must export the data to a CSV file. How could I do that?
Thank you!
Use the pivot_table method. For the aggfunc argument, use len. This will return the count of the items for the provided index and columns. Finally, just sum the rows along axis=1. Use .to_csv to export.
See code:
import pandas as pd
df = pd.DataFrame({
'sec_Id': ["to", "ti", "tu", "ta", "ty", "te"],
'sec_Orga': ['CNP', 'COF', 'COF', 'POS', 'POS', 'POS'],
'sec_Etat': ['Sorti(e)', 'Valide', 'Suspendu(e)', 'Valide', 'Suspendu(e)', 'Suspendu(e)']
})
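# Count rows per (sec_Orga, sec_Etat) pair. values='sec_Id' keeps the column labels flat,
# and fill_value=0 shows 0 instead of NaN for combinations that do not occur.
pivot = df.pivot_table(index='sec_Orga', columns='sec_Etat', values='sec_Id', aggfunc=len, fill_value=0)
pivot["total"] = pivot.sum(axis=1)
print(pivot)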
# pivot.to_csv("p.csv") # Export to CSV file. Uncomment to use.
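As an alternative sketch (not part of the original answer), the same counts can probably be obtained with pd.crosstab, which fills missing combinations with 0 by itself. The column order below is taken from the desired output in the question; the variable names counts and csv_text are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'sec_Id': ["to", "ti", "tu", "ta", "ty", "te"],
    'sec_Orga': ['CNP', 'COF', 'COF', 'POS', 'POS', 'POS'],
    'sec_Etat': ['Sorti(e)', 'Valide', 'Suspendu(e)', 'Valide', 'Suspendu(e)', 'Suspendu(e)'],
})

# Count rows per (sec_Orga, sec_Etat) pair; pairs that never occur become 0.
counts = pd.crosstab(df['sec_Orga'], df['sec_Etat'])

# Add the row total, then order the columns as in the desired output.
counts['Total'] = counts.sum(axis=1)
counts = counts[['Total', 'Valide', 'Suspendu(e)', 'Sorti(e)']]

print(counts)

# Called without a path, to_csv() returns the CSV text;
# pass a file name such as "p.csv" to write to disk instead.
csv_text = counts.to_csv()
```

Selecting the columns by name assumes that all three sec_Etat values occur in the data; with the full 70,000-line file, counts.reindex(columns=[...], fill_value=0) would be a safer way to impose the order.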