
python pandas - groupby unstack pivot - count len - mixed values and export

I am trying to combine data from a file of approximately 70,000 lines. For several types of results, I need to export a file (CSV, for example).

After import, the data from the file gives this df:

df = pd.DataFrame({
 'sec_Id':["to","ti","tu","ta","ty","te"], 
 'sec_Orga':['CNP','COF','COF','POS','POS','POS'], 
 'sec_Etat':['Sorti(e)','Valide','Suspendu(e)','Valide','Suspendu(e)','Suspendu(e)']
 })


df
Out[59]: 
      sec_Etat  sec_Id  sec_Orga
0     Sorti(e)      to       CNP
1       Valide      ti       COF
2  Suspendu(e)      tu       COF
3       Valide      ta       POS
4  Suspendu(e)      ty       POS
5  Suspendu(e)      te       POS

In the end, I would like to get this summary result:

      Total  Valide  Suspendu(e)  Sorti(e)
CNP       1       0            0         1
COF       2       1            1         0
POS       3       1            2         0

As you can see, it combines a 'Total' column with one column for each unique value of the 'sec_Etat' column.

I tried groupby, unstack, and pivot, but nothing worked.

After that, I must export the data to a CSV file. How can I do that?

Thank you!

Use the pivot_table method with len as the aggfunc argument. This returns the number of items for each combination of the given index and columns values. Then sum each row along axis=1 to get the total, and use .to_csv to export.

See code:

import pandas as pd

df = pd.DataFrame({
    'sec_Id': ["to", "ti", "tu", "ta", "ty", "te"],
    'sec_Orga': ['CNP', 'COF', 'COF', 'POS', 'POS', 'POS'],
    'sec_Etat': ['Sorti(e)', 'Valide', 'Suspendu(e)', 'Valide', 'Suspendu(e)', 'Suspendu(e)']
})

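# Count rows per (sec_Orga, sec_Etat) pair. values='sec_Id' keeps the columns
# flat, and fill_value=0 replaces missing pairs (NaN) with 0.
pivot = df.pivot_table(index='sec_Orga', columns='sec_Etat', values='sec_Id',
                       aggfunc=len, fill_value=0).astype(int)
pivot["total"] = pivot.sum(axis=1)  # row-wise total across the status columns

print(pivot)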

# pivot.to_csv("p.csv") # Export to CSV file. Uncomment to use.
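As an alternative that is not part of the original answer, pd.crosstab can produce the counts and the totals in one call. The sketch below assumes the column order shown in the question is wanted; the names counts, summary and csv_text are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "sec_Id": ["to", "ti", "tu", "ta", "ty", "te"],
    "sec_Orga": ["CNP", "COF", "COF", "POS", "POS", "POS"],
    "sec_Etat": ["Sorti(e)", "Valide", "Suspendu(e)",
                 "Valide", "Suspendu(e)", "Suspendu(e)"],
})

# Count rows per (sec_Orga, sec_Etat) pair. margins=True adds a "Total"
# row and a "Total" column; pairs that never occur are counted as 0.
counts = pd.crosstab(df["sec_Orga"], df["sec_Etat"],
                     margins=True, margins_name="Total")

# Drop the "Total" row and reorder the columns as in the question.
summary = counts.drop(index="Total")[["Total", "Valide", "Suspendu(e)", "Sorti(e)"]]

print(summary)

# to_csv() without a path returns the CSV text; pass a filename
# (for example summary.to_csv("p.csv")) to write a file instead.
csv_text = summary.to_csv()
```

Whether crosstab or pivot_table reads better is mostly a matter of taste; crosstab avoids having to pick a values column just to count rows.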


 