If I create a dataframe and then generate a pivot table from it, a string keeps appearing in the upper left "cell" of the resulting table, as shown below. In this example the string is "n":
import pandas as pd
df = pd.DataFrame({'col1':['a','a','b','b','c','c'],
'col2':['str_a1','str_a2','str_b1','str_b2','str_c1','str_c2']})
df2 = df.assign(n=df.groupby('col1').cumcount()).pivot(index='col1',columns='n',values='col2').reset_index()
df2
n col1       0       1
0    a  str_a1  str_a2
1    b  str_b1  str_b2
2    c  str_c1  str_c2
If I create the dataframe directly, as below, nothing appears there. How can I include the "n" in this second option, and how can I remove the "n" from the option above?
df3 = pd.DataFrame({'col1':['a','b','c'],
'0':['str_a1','str_b1','str_c1'],
'1':['srt_a2','str_b2','str_c2']})
df3
  col1       0       1
0    a  str_a1  srt_a2
1    b  str_b1  str_b2
2    c  str_c1  str_c2
I got the answer by 'looking' at the dataframe 'horizontally' instead of 'vertically'. As splash58 pointed out, the 'n' I mentioned above is not the index name, although I must say I used to think it was.
Then I noticed that the 'n' was on the same line as the other column names. Therefore it must be the name of the columns index.
In fact, if you do:
import pandas as pd
df = pd.DataFrame({'col1':['a','a','b','b','c','c'],
'col2':['str_a1','str_a2','str_b1','str_b2','str_c1','str_c2']})
df2 = df.assign(n=df.groupby('col1').cumcount()).pivot(index='col1',columns='n',values='col2').reset_index()
print(df2)
you get:
n col1       0       1
0    a  str_a1  str_a2
1    b  str_b1  str_b2
2    c  str_c1  str_c2
After this, if you do:
df2.columns.name
you get:
'n'
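Since the 'n' is the name of the columns index, both parts of the original question can probably be handled through that attribute. The following is only a sketch of one way to do it, not necessarily the only or best one: it assumes that clearing the name with rename_axis (or by assigning None to columns.name) is what is wanted for df2, and that assigning to columns.name is acceptable for df3. The variable name df2_clean is made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'a', 'b', 'b', 'c', 'c'],
                   'col2': ['str_a1', 'str_a2', 'str_b1', 'str_b2', 'str_c1', 'str_c2']})
df2 = (df.assign(n=df.groupby('col1').cumcount())
         .pivot(index='col1', columns='n', values='col2')
         .reset_index())

# Remove the 'n': clear the name of the columns index.
# rename_axis returns a new dataframe and leaves df2 unchanged.
df2_clean = df2.rename_axis(None, axis=1)
# In-place alternative: df2.columns.name = None

# Include the 'n' in the directly built dataframe: name its columns index.
df3 = pd.DataFrame({'col1': ['a', 'b', 'c'],
                    '0': ['str_a1', 'str_b1', 'str_c1'],
                    '1': ['str_a2', 'str_b2', 'str_c2']})
df3.columns.name = 'n'

print(df2_clean)
print(df3)
```

Printing df2_clean should show the header row without the 'n', and printing df3 should show it in the upper left corner. Note that the column labels in df3 are the strings '0' and '1', while in df2 they are the integers 0 and 1.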