I have the following pandas DataFrame:
Column 1 | Column 2
Name | A
Number | B
Age | C
Name | D
Number | E
Age | F
Each group of Name, Number and Age rows relates to one feature, and this pattern repeats throughout the DataFrame. What would be the best way to get it into the following format?
          | Name | Number | Age
Feature 1 | A    | B      | C
Feature 2 | D    | E      | F
Any help would be appreciated as I'm stumped as to what function or method I would use!
This is a pivot, but you first need to create a label that groups each set of three rows together. If the data are clean enough that the DataFrame is always ordered Name, Number, Age, Name, Number, Age, ..., you can take the cumsum of a Boolean Series that flags the 'Name' rows. The running count increases by one at each 'Name' row, so every row in a set gets the same label.
df['index'] = 'Feature ' + df['Column 1'].eq('Name').cumsum().astype(str)
# Column 1 Column 2 index
#0 Name A Feature 1
#1 Number B Feature 1
#2 Age C Feature 1
#3 Name D Feature 2
#4 Number E Feature 2
#5 Age F Feature 2
df = (df.pivot(index='index', columns='Column 1', values='Column 2')
.rename_axis(index=None, columns=None))
# Age Name Number
#Feature 1 C A B
#Feature 2 F D E
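Note that pivot sorts the new column labels alphabetically, which is why the result comes out as Age, Name, Number rather than the order asked for in the question. Here is a minimal end-to-end sketch that rebuilds the sample data from the question and restores the requested order with reindex. The names `out` and `wanted` are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Column 1': ['Name', 'Number', 'Age', 'Name', 'Number', 'Age'],
    'Column 2': ['A', 'B', 'C', 'D', 'E', 'F'],
})

# Each 'Name' row starts a new feature, so the running count of
# 'Name' rows gives every row in a set the same label.
df['index'] = 'Feature ' + df['Column 1'].eq('Name').cumsum().astype(str)

out = (df.pivot(index='index', columns='Column 1', values='Column 2')
         .rename_axis(index=None, columns=None))

# pivot returns the columns sorted alphabetically; put them back in
# the order the question asked for.
wanted = ['Name', 'Number', 'Age']
out = out.reindex(columns=wanted)
print(out)
```

One caveat: the row labels are strings, so with ten or more features 'Feature 10' would sort before 'Feature 2'. If that matters, pivot on the integer count and add the 'Feature ' prefix afterwards.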
Alternatively, you could group every three rows together by integer-dividing each row's position (0 up to the length of the DataFrame) by 3, and then pivot in the same way.
import numpy as np
df['index'] = np.char.add(['Feature '], (np.arange(len(df))//3+1).astype(str))
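A self-contained sketch of this positional variant, again on the sample data from the question. It assumes every feature has exactly three rows; the length check and the name `result` are my additions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Column 1': ['Name', 'Number', 'Age', 'Name', 'Number', 'Age'],
    'Column 2': ['A', 'B', 'C', 'D', 'E', 'F'],
})

# Assumption: the rows come in complete blocks of three.
assert len(df) % 3 == 0, "expected complete sets of three rows"

# Positions 0-2 map to 1, positions 3-5 map to 2, and so on.
df['index'] = np.char.add(['Feature '], (np.arange(len(df)) // 3 + 1).astype(str))

result = (df.pivot(index='index', columns='Column 1', values='Column 2')
            .rename_axis(index=None, columns=None))
print(result)
```

Unlike the cumsum approach, this depends only on row position and never looks at the values in 'Column 1', so a single missing or extra row would shift every label after it.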