I already asked a related question earlier, but I didn't want to start a comment-and-edit-discussion. So here's -boiled down - what the answer to my earlier question lead me to ask. Consider
import pandas as pd
from numpy import arange
from scipy import random
index = pd.MultiIndex.from_product([arange(0,3), arange(10,15)], names=['A', 'B'])
df = pd.DataFrame(columns=['test'], index=index)
someValues = random.randint(0, 10, size=5)
df.loc[0, 'test']
, df.loc[0,:]
and df.ix[0]
all create a representation of a part of the data frame, the first one being a Series and the other two being df slices. However
df.ix[0] = df.loc[0,'test'] = someValues
sets the value for the df df.loc[0,'test'] = someValues
gives an error ValueError: total size of new array must be unchanged
df.loc[0,:] = someValues
is being ignored. No error, but the df does not contain the numpy array. I skimmed the docs but there was no clear logical and systematical explanation on what is going on with MultiIndexes in general. So far, I guess that "if the view is a Series, you can set values", and "otherwise, god knows what happens".
Could someone shed some light on the logic? Moreover, is there some deep meaning behind this or are these just constraints due to how it is set up?
These are all with 0.13.1
These are not all 'slice' representations at all.
This is a Series.
In [50]: df.loc[0,'test']
Out[50]:
B
10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
Name: test, dtype: object
These are DataFrames (and the same)
In [51]: df.loc[0,:]
Out[51]:
test
B
10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
[5 rows x 1 columns]
In [52]: df.ix[0]
Out[52]:
test
B
10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
[5 rows x 1 columns]
This is trying to assign the wrong shape (it looks like it should work, but if you have multiple columns then it won't, that is why this is not allowed)
In [54]: df.ix[0] = someValues
ValueError: could not broadcast input array from shape (5) into shape (5,1)
This works because it knows how to broadcast
In [56]: df.loc[0,:] = someValues
In [57]: df
Out[57]:
test
A B
0 10 4
11 3
12 4
13 2
14 8
1 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
2 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
[15 rows x 1 columns]
This works fine
In [63]: df.loc[0,'test'] = someValues+1
In [64]: df
Out[64]:
test
A B
0 10 5
11 4
12 5
13 3
14 9
1 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
2 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
[15 rows x 1 columns]
As does this
In [66]: df.loc[0,:] = someValues+1
In [67]: df
Out[67]:
test
A B
0 10 5
11 4
12 5
13 3
14 9
1 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
2 10 NaN
11 NaN
12 NaN
13 NaN
14 NaN
[15 rows x 1 columns]
Not clear where you generated the cases in your question. I think the logic is pretty straightforward and consistent (their were several inconsistencies in prior versions however).
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.