I just started using pandas today. I found a tutorial that shows how to create a table that looks like this:
foo  one  two
bar    a    b    c
2      0    0    0
4      0    0    0
6      0    0    0
from the code
import numpy as np
import pandas as pd
arrays = [np.hstack([ ['one']*1, ['two']*2]), ['a', 'b', 'c']]
columns = pd.MultiIndex.from_arrays(arrays, names=['foo', 'bar'])
df = pd.DataFrame(np.zeros((3,3)), columns=columns, index=['2','4','6'])
print(df)
I am trying to repeat the same thing, but creating the dataframe with a dictionary.
d={'a':[0,0,0], 'b':[0,0,0], 'c':[0,0,0]}
dd = pd.DataFrame(d, columns=columns, index=['2','4','6'])
print(dd)
However I get
foo  one  two
bar    a    b    c
2    NaN  NaN  NaN
4    NaN  NaN  NaN
6    NaN  NaN  NaN
Omitting columns=columns yields a dataframe as expected, but without the multiindexed columns. Any idea how I can get these multiindexed columns in a dataframe created from a dictionary? The docs seem to cover multiindexing only with numpy arrays. I would use numpy, but I ran into problems creating arrays when not every row is of equal length: I was only getting a 1d numpy array. My data will most likely be strings, if that affects anything.
If you pass a dict with keys 'a', 'b', 'c', you're telling it the columns are named 'a', 'b', and 'c'. But your columns aren't named that. If you're using a MultiIndex, your columns don't have a single name, but rather a tuple of names, one for each level. Because none of the dict keys matches a column label, every column comes back as NaN. So you need to specify the data with the full tuple for each column:
d={('one', 'a'):[0,0,0], ('two', 'b'):[0,0,0], ('two', 'c'):[0,0,0]}
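For completeness, here is a minimal sketch of the whole thing end to end, assuming Python 3 and a reasonably recent pandas. The first part uses the tuple-keyed dict from above. The second part is an alternative that is not from the original question: it builds the frame from the plain dict and then swaps in the MultiIndex, which only works if the dict's key order matches the column order. The names index, plain and dd_alt are made up for illustration.

```python
import pandas as pd

# Column labels from the question: level 'foo' sits above level 'bar'.
columns = pd.MultiIndex.from_arrays(
    [['one', 'two', 'two'], ['a', 'b', 'c']], names=['foo', 'bar']
)
index = ['2', '4', '6']

# Option 1: key the dict by the full (foo, bar) tuple of each column.
d = {('one', 'a'): [0, 0, 0], ('two', 'b'): [0, 0, 0], ('two', 'c'): [0, 0, 0]}
dd = pd.DataFrame(d, columns=columns, index=index)

# Option 2 (assumption: the dict's key order matches the column order):
# build from the single-level dict, then replace the column labels.
plain = {'a': [0, 0, 0], 'b': [0, 0, 0], 'c': [0, 0, 0]}
dd_alt = pd.DataFrame(plain, index=index)
dd_alt.columns = columns

print(dd)
print(dd_alt)
```

Option 2 can be handy when the data already arrives keyed by the bottom-level name only, but it silently mislabels columns if the order is off, so the tuple keys are the safer choice.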