Using pivot_table with Pandas
I am using Pandas with Python 3.7. I have a CSV file with 8 columns, and I need to pivot columns 5, 6, 7, and 8. I pivot them like so:
pivot = pd.pivot_table(csv_file,
                       values=["column 5", "column 6", "column 7", "column 8"],
                       index="ID",
                       aggfunc=np.mean,
                       dropna=False)
I then save it to a CSV. The problem is that the saved CSV file only has columns 5, 6, 7, and 8. I want it to keep the first 4 columns as well. I have tried to find workarounds, such as dropping the last 4 columns from csv_file and then adding them back after they are pivoted. I have tried concatenating, joining, and merging, with no success in recreating a CSV file with all 8 columns after pivoting. If anyone has any insight on how to perform this task, I would greatly appreciate the help.
I don't have your csv_file as input, but I tried something like this. See if this is what you are looking for.
import pandas as pd

# Build a small sample frame: two numeric columns and three label columns
c1 = list(range(10))
c2 = list(range(10))
c3 = [str(i) for i in range(10)]
c4 = [str(i) for i in range(10, 0, -1)]
c5 = ['test'] * 10
df = pd.DataFrame({'column1': c1, 'column2': c2,
                   'column3': c3, 'column4': c4, 'column5': c5})

# Columns listed in index are kept as row labels; the values columns are aggregated
df_pivot = pd.pivot_table(df,
                          values=['column1', 'column2'],
                          index=['column3', 'column4', 'column5'],
                          aggfunc={'column1': ['mean', 'max', 'min'], 'column2': 'sum'})

# Flatten the two-level column names, e.g. ('column1', 'mean') -> 'column1_mean'
#df_pivot.columns = df_pivot.columns.get_level_values(1)
df_pivot.columns = ['_'.join(col).strip() for col in df_pivot.columns.values]
print(df)
print(df_pivot)
The outputs will be as follows:
Initial dataframe:
   column1  column2 column3 column4 column5
0        0        0       0      10    test
1        1        1       1       9    test
2        2        2       2       8    test
3        3        3       3       7    test
4        4        4       4       6    test
5        5        5       5       5    test
6        6        6       6       4    test
7        7        7       7       3    test
8        8        8       8       2    test
9        9        9       9       1    test
Pivot dataframe:
                         column1_max  column1_mean  column1_min  column2_sum
column3 column4 column5
0       10      test               0             0            0            0
1       9       test               1             1            1            1
2       8       test               2             2            2            2
3       7       test               3             3            3            3
4       6       test               4             4            4            4
5       5       test               5             5            5            5
6       4       test               6             6            6            6
7       3       test               7             7            7            7
8       2       test               8             8            8            8
9       1       test               9             9            9            9
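The question also mentions trying to merge the dropped columns back after pivoting. A minimal sketch of that route follows. It is not the asker's data: the frame, the column names (name, region, year, col5 to col8) and the assumption that the first columns are constant within each ID are all made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the asker's 8-column file; all names are assumptions.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "name": ["a", "a", "b", "b"],
    "region": ["north", "north", "south", "south"],
    "year": [2020, 2020, 2021, 2021],
    "col5": [1.0, 3.0, 5.0, 7.0],
    "col6": [2.0, 4.0, 6.0, 8.0],
    "col7": [10.0, 20.0, 30.0, 40.0],
    "col8": [0.0, 1.0, 0.0, 1.0],
})

keep_cols = ["name", "region", "year"]
value_cols = ["col5", "col6", "col7", "col8"]

# Aggregate only the value columns, one row per ID.
pivoted = pd.pivot_table(df, values=value_cols, index="ID", aggfunc="mean")

# One row of descriptive columns per ID (assumes they do not vary within an ID).
descriptive = df[["ID"] + keep_cols].drop_duplicates(subset="ID")

# Turn the ID index back into a column and merge the two pieces on it.
result = descriptive.merge(pivoted.reset_index(), on="ID", how="left")

# Restore the original column order explicitly.
result = result[["ID"] + keep_cols + value_cols]
print(result)
```

If the descriptive columns can differ within an ID, drop_duplicates keeps only the first row per ID, so listing those columns in index (as in the answer above) is probably the safer choice.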
I had a few minutes to work on your problem statement. You need to specify which columns are written to the file. Here's what I did.
Writing to a CSV file
df_pivot.to_csv('out.csv', index_label=df_pivot.columns.name)
The output of this will be as follows. I hope this is what you are looking for.
column3,column4,column5,column1_max,column1_mean,column1_min,column2_sum
0,10,test,0,0,0,0
1,9,test,1,1,1,1
2,8,test,2,2,2,2
3,7,test,3,3,3,3
4,6,test,4,4,4,4
5,5,test,5,5,5,5
6,4,test,6,6,6,6
7,3,test,7,7,7,7
8,2,test,8,8,8,8
9,1,test,9,9,9,9
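An equivalent way to get the index levels into the file, sketched here on a small made-up frame, is to turn them into ordinary columns with reset_index() and then write without the index. The frame and its column names (group, label, value) are assumptions for illustration, and an in-memory buffer stands in for a real path such as 'out.csv'.

```python
import io

import pandas as pd

# Small hypothetical frame; the column names are assumptions for illustration.
df = pd.DataFrame({
    "group": ["a", "a", "b"],
    "label": ["x", "x", "y"],
    "value": [1, 3, 5],
})

# The columns to keep go in index; value is summed within each (group, label) pair.
pivot = pd.pivot_table(df, values="value", index=["group", "label"], aggfunc="sum")

# Move the index levels into ordinary columns so every column is written as data.
flat = pivot.reset_index()

# In-memory stand-in for a file path; pass a filename instead to write to disk.
buffer = io.StringIO()
flat.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

With the index already flattened, the header row lists the kept columns first and the aggregated column last, without relying on index_label.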
Writing to an Excel file
import openpyxl
with pd.ExcelWriter('myfile.xlsx', engine="openpyxl") as writer:
    df_pivot.to_excel(writer, sheet_name='Sheet1', columns=df_pivot.columns)
The result I got is as follows: