
Transpose the data in a column every nth row in pandas

For a research project, I need to process every individual's information from a website into an Excel file. I copied and pasted everything I need from the website into a single column of an Excel file, and I loaded that file using pandas. However, I need to present each individual's information horizontally instead of vertically, as it is now. For example, this is what I have right now: a single column of unorganized data.

import pandas as pd

df = pd.read_csv("ior work.csv", encoding="ISO-8859-1")

Data:

0 Andrew
1 School of Music
2 Music: Sound of the wind
3 Dr. Seuss
4 Dr.Sass
5 Michelle
6 School of Theatrics
7 Music: Voice
8 Dr. A
9 Dr. B

I want to transpose every 5 rows so that the data is organized into the format below; the labels are the column names.

Name School Music Mentor1 Mentor2

What is the most efficient way to do this?

If no data are missing, you can use numpy.reshape:

print (np.reshape(df.values,(2,5)))
[['Andrew' 'School of Music' 'Music: Sound of the wind' 'Dr. Seuss'
  'Dr.Sass']
 ['Michelle' 'School of Theatrics' 'Music: Voice' 'Dr. A' 'Dr. B']]

print (pd.DataFrame(np.reshape(df.values,(2,5)), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
       Name               School                     Music    Mentor1  Mentor2
0    Andrew      School of Music  Music: Sound of the wind  Dr. Seuss  Dr.Sass
1  Michelle  School of Theatrics              Music: Voice      Dr. A    Dr. B

A more general solution computes the number of rows of the new array by dividing the original row count by the number of columns:

print (pd.DataFrame(np.reshape(df.values,(df.shape[0] // 5,5)), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
       Name               School                     Music    Mentor1  Mentor2
0    Andrew      School of Music  Music: Sound of the wind  Dr. Seuss  Dr.Sass
1  Michelle  School of Theatrics              Music: Voice      Dr. A    Dr. B

Thank you piRSquared for another solution:

print (pd.DataFrame(df.values.reshape(-1, 5), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
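
For reference, here is a self-contained sketch of the reshape(-1, 5) approach. It builds the sample data from the question inline instead of reading the CSV. The helper name rows_to_records, the column name 'Data', and the length check are assumptions added for illustration; they are not part of the original answers.

```python
import numpy as np
import pandas as pd

COLUMNS = ['Name', 'School', 'Music', 'Mentor1', 'Mentor2']


def rows_to_records(df, columns):
    """Hypothetical helper: turn a one-column DataFrame into one row per record."""
    values = df.values.ravel()  # flatten the (n, 1) array to 1-D
    n = len(columns)
    # reshape only works when every record is complete ("no data are missing")
    if len(values) % n != 0:
        raise ValueError(
            "%d values cannot be split into rows of %d" % (len(values), n)
        )
    # -1 lets NumPy infer the number of rows from the total length
    return pd.DataFrame(values.reshape(-1, n), columns=columns)


# Sample data from the question; the column name 'Data' is an assumption
df = pd.DataFrame({'Data': [
    'Andrew', 'School of Music', 'Music: Sound of the wind', 'Dr. Seuss', 'Dr.Sass',
    'Michelle', 'School of Theatrics', 'Music: Voice', 'Dr. A', 'Dr. B',
]})

result = rows_to_records(df, COLUMNS)
print(result)
```

The explicit check fails early with a readable message if a record is incomplete, in which case the rows would otherwise be misaligned or reshape would raise its own error.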
