
How to add a multilevel column name to a specific column only (not all columns) in a pandas DataFrame?

Refer here for the question background. I want to add C as a second-level name to column B only.

I need output as:

 df
    Out[92]: 
       A  B
          C
    a  0  0
    b  1  1
    c  2  2
    d  3  3
    e  4  4

I tried adapting this example:

dfnew=pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})

columns=[('c','b')]  #changed from columns=[('c','a'),('c','b')]

dfnew.columns=pd.MultiIndex.from_tuples(columns)

But that doesn't work. It raises ValueError: Length mismatch: Expected axis has 2 elements, new values have 1 elements

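You can use MultiIndex.from_arrays, passing the existing column names as the first level and the new labels as the second (an empty string for columns that should get no label):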

df.columns = pd.MultiIndex.from_arrays([df.columns, ['','C']])

   A  B
      C
a  0  0
b  1  1
c  2  2
d  3  3
e  4  4
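For reference, here is a minimal self-contained sketch of the same call. The sample data is an assumption, chosen only to reproduce the output shown above.

```python
import pandas as pd

# Assumed sample data: two columns, five rows, index a..e.
df = pd.DataFrame({"A": range(5), "B": range(5)}, index=list("abcde"))

# First array: the existing column names.
# Second array: the new second level, '' for columns without a sub-label.
df.columns = pd.MultiIndex.from_arrays([df.columns, ["", "C"]])

print(df.columns.tolist())      # [('A', ''), ('B', 'C')]
print(df[("B", "C")].tolist())  # [0, 1, 2, 3, 4]
```

Column B is now addressed by the tuple ('B', 'C'), while column A is addressed by ('A', '').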

Note that pd.MultiIndex.from_tuples expects a list of tuples, as the name suggests, with one tuple per column. The two arrays passed to from_arrays are equivalent to the list of tuples you get by zipping them (here df.columns refers to the original, flat column names):

list(zip(*[df.columns, ['','C']]))
# [('A', ''), ('B', 'C')]

This is why your attempt fails: it supplies a single tuple, ('c','b'), for a DataFrame with two columns, so the lengths do not match.
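The sketch below illustrates both points under the same assumed sample data: the two constructors build the same column index, and a single tuple triggers the length mismatch from the question.

```python
import pandas as pd

# Assumed sample data, matching the frame used in this answer.
df = pd.DataFrame({"A": range(5), "B": range(5)}, index=list("abcde"))

# Two arrays (one per level) versus one tuple per column.
from_arrays = pd.MultiIndex.from_arrays([df.columns, ["", "C"]])
from_tuples = pd.MultiIndex.from_tuples([("A", ""), ("B", "C")])
same = from_arrays.tolist() == from_tuples.tolist()
print(same)  # True

# One tuple describes only one column, so assigning it to a
# two-column frame raises ValueError.
try:
    df.columns = pd.MultiIndex.from_tuples([("B", "C")])
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```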


If you want to do the same by specifying a list of columns, you could do:

cols = [(i, 'C') if i in ['B','D'] else (i, '') for i in df.columns]
# [('A', ''), ('B', 'C'), ('C', ''), ('D', 'C')]
df.columns = pd.MultiIndex.from_tuples(cols)

   A  B  C  D
      C     C
a  0  0  0  0
b  1  1  1  1
c  2  2  2  2
d  3  3  3  3
e  4  4  4  4
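If you need this more than once, the list comprehension can be wrapped in a small helper. This is only a sketch: add_sublevel and its parameters are hypothetical names, not part of pandas, and the sample data is assumed.

```python
import pandas as pd

def add_sublevel(df, sublabels, fill=""):
    """Return a copy of df with a second column level added.

    sublabels maps a column name to its second-level label;
    columns missing from the mapping get `fill`.
    (Hypothetical helper, not a pandas function.)
    """
    out = df.copy()
    out.columns = pd.MultiIndex.from_tuples(
        [(col, sublabels.get(col, fill)) for col in df.columns]
    )
    return out

# Assumed sample data with four columns A..D.
df = pd.DataFrame({c: range(5) for c in "ABCD"}, index=list("abcde"))
result = add_sublevel(df, {"B": "C", "D": "C"})

print(result.columns.tolist())
# [('A', ''), ('B', 'C'), ('C', ''), ('D', 'C')]
```

Working on a copy leaves the original frame's flat column names untouched.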
