Using Pivot_Table with Pandas

I am using Pandas with Python 3.7. I have a CSV file with 8 columns, and I need to pivot columns 5, 6, 7, and 8. I pivot them like so:

pivot = pd.pivot_table(csv_file,
                       values=["column 5", "column 6", "column 7", "column 8"],
                       index="ID",
                       aggfunc=np.mean,
                       dropna=False)

I then save the result to a CSV. The problem is that the saved CSV file only has columns 5, 6, 7, and 8; I want it to keep the first 4 columns as well. I have tried to find workarounds, such as dropping the last 4 columns from csv_file and adding them back after they are pivoted. I have tried concatenating, joining, and merging, with no success in recreating a CSV file with all 8 columns after pivoting. If anyone has any insight into how to perform this task, I would greatly appreciate the help.

I don't have your csv_file as input, but I tried something like this. See if this is what you are looking for.

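import pandas as pd

# Sample data standing in for the CSV file
c1 = list(range(10))
c2 = list(range(10))
c3 = [str(i) for i in range(10)]
c4 = [str(i) for i in range(10, 0, -1)]
c5 = ['test'] * 10

df = pd.DataFrame({'column1': c1, 'column2': c2,
                   'column3': c3, 'column4': c4, 'column5': c5})

# Columns listed in index are kept as row labels; columns in values are aggregated.
# String names such as 'mean' avoid the warnings that recent pandas versions
# emit for np.mean and the builtin max/min.
df_pivot = pd.pivot_table(df,
                          values=['column1', 'column2'],
                          index=['column3', 'column4', 'column5'],
                          aggfunc={'column1': ['mean', 'max', 'min'], 'column2': 'sum'})

# Alternative: keep only the aggregation names (mean, max, ...)
#df_pivot.columns = df_pivot.columns.get_level_values(1)

# Flatten the (column, aggregation) MultiIndex into names like column1_mean
df_pivot.columns = ['_'.join(col) for col in df_pivot.columns.values]

print(df)
print(df_pivot)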

The outputs will be as follows:

Initial dataframe:

   column1  column2 column3 column4 column5
0        0        0       0      10    test
1        1        1       1       9    test
2        2        2       2       8    test
3        3        3       3       7    test
4        4        4       4       6    test
5        5        5       5       5    test
6        6        6       6       4    test
7        7        7       7       3    test
8        8        8       8       2    test
9        9        9       9       1    test

Pivot dataframe:

                         column1_max  column1_mean  column1_min  column2_sum
column3 column4 column5                                                     
0       10      test               0             0            0            0
1       9       test               1             1            1            1
2       8       test               2             2            2            2
3       7       test               3             3            3            3
4       6       test               4             4            4            4
5       5       test               5             5            5            5
6       4       test               6             6            6            6
7       3       test               7             7            7            7
8       2       test               8             8            8            8
9       1       test               9             9            9            9
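The same idea can be applied to the 8-column layout from the question: list every column you want to keep in index, aggregate the rest, then call reset_index() so the kept columns become ordinary columns again. The sketch below is only an illustration; the column names "name", "site", and "group" and the sample values are made up, since the real file is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the asker's 8-column file; only "ID" and the
# "column 5".."column 8" names come from the question, the rest are invented.
df = pd.DataFrame({
    "ID":       [1, 1, 2, 2],
    "name":     ["a", "a", "b", "b"],
    "site":     ["x", "x", "y", "y"],
    "group":    ["g1", "g1", "g2", "g2"],
    "column 5": [1.0, 3.0, 5.0, 7.0],
    "column 6": [2.0, 4.0, 6.0, 8.0],
    "column 7": [0.0, 2.0, 4.0, 6.0],
    "column 8": [1.0, 1.0, 3.0, 3.0],
})

keep = ["ID", "name", "site", "group"]                    # columns to carry through
values = ["column 5", "column 6", "column 7", "column 8"] # columns to aggregate

# Grouping by all kept columns means they survive as index levels.
pivot = pd.pivot_table(df, values=values, index=keep, aggfunc="mean")

# Turn the index levels back into regular columns: 4 kept + 4 aggregated.
result = pivot.reset_index()
print(result)
```

Note that this groups by every distinct combination of the kept columns, so it gives one row per ID only if the first four columns are constant within each ID.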

I got a few minutes to work on your problem statement. You need to call out which columns to write to your file. Here's what I did.

Writing to a CSV file

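# The index levels (column3, column4, column5) are written as the leading CSV columns.
# df_pivot.columns.name is None here, so index_label falls back to the index level names.
df_pivot.to_csv('out.csv', index_label=df_pivot.columns.name)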

The output of this will be as follows. I hope this is what you are looking for.

column3,column4,column5,column1_max,column1_mean,column1_min,column2_sum
0,10,test,0,0,0,0
1,9,test,1,1,1,1
2,8,test,2,2,2,2
3,7,test,3,3,3,3
4,6,test,4,4,4,4
5,5,test,5,5,5,5
6,4,test,6,6,6,6
7,3,test,7,7,7,7
8,2,test,8,8,8,8
9,1,test,9,9,9,9
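The grouping columns show up in the file because to_csv writes the index by default. A small sketch of how the index setting changes the header, using a cut-down version of the sample data and writing to a string instead of a file (calling to_csv without a path returns the CSV text):

```python
import pandas as pd

# Cut-down sample in the same shape as the example above.
df = pd.DataFrame({
    "column3": ["0", "1"],
    "column4": ["10", "9"],
    "column1": [0, 1],
})
pivot = pd.pivot_table(df, values=["column1"],
                       index=["column3", "column4"], aggfunc="sum")

with_index = pivot.to_csv()                      # index levels become leading columns
without_index = pivot.to_csv(index=False)        # index levels are dropped
flat = pivot.reset_index().to_csv(index=False)   # same columns, no index involved

header_with = with_index.splitlines()[0]
header_without = without_index.splitlines()[0]
header_flat = flat.splitlines()[0]

print(header_with)
print(header_without)
print(header_flat)
```

If the saved file is missing the grouping columns, one thing to check is whether index=False was passed when saving the pivoted frame.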

Writing to an Excel file

import openpyxl  # engine used by pd.ExcelWriter below; must be installed

with pd.ExcelWriter('myfile.xlsx', engine='openpyxl') as writer:
    df_pivot.to_excel(writer, sheet_name='Sheet1', columns=df_pivot.columns)

The answer I got is as follows:

[image: enter image description here]
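The question also mentions trying to merge the pivoted values back onto the untouched columns. A minimal sketch of that route, assuming the descriptive columns are constant within each ID (the "name" column and the values are hypothetical, and only two value columns are used to keep it short):

```python
import pandas as pd

# Hypothetical data: "ID" and the "column N" names follow the question,
# "name" is an invented descriptive column.
df = pd.DataFrame({
    "ID":       [1, 1, 2],
    "name":     ["a", "a", "b"],
    "column 5": [1.0, 3.0, 5.0],
    "column 6": [2.0, 4.0, 6.0],
})
values = ["column 5", "column 6"]

# Aggregate per ID, then make ID a regular column so it can be a merge key.
means = pd.pivot_table(df, values=values, index="ID", aggfunc="mean").reset_index()

# One row of descriptive columns per ID (keeps the first row seen for each ID).
meta = df.drop(columns=values).drop_duplicates(subset="ID")

# Attach the aggregated values to the descriptive columns.
merged = meta.merge(means, on="ID", how="left")
print(merged)
```

Because drop_duplicates keeps only the first row per ID, any variation in the descriptive columns within an ID is silently discarded; listing those columns in index avoids that.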
