Insert NA in specific row/column pandas

Question

I want to place NaN values at certain row/column positions.

import numpy as np
import pandas as pd

df_test = pd.DataFrame(np.random.randn(5, 3),
                       index=['a', 'b', 'c', 'd', 'e'],
                       columns=['one', 'two', 'three'])

rows = pd.Series([True, False, False, False, True], index=df_test.index)

I want to set NaN in the rows specified by rows, in every column except 'two'. I tried this:

df_test[rows].drop(['two'], axis = 1) = np.nan

But this raises an error:

SyntaxError: can't assign to function call

Answer 1

This is not going to work: Python simply does not support this kind of syntax, i.e., assigning to the result of a function call. Furthermore, drop returns a copy, so dropping the column and operating on the returned DataFrame does not modify the original.
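To make the second point concrete, here is a minimal sketch (the small frame `df` and the name `dropped` are made up for illustration) showing that changes made to the result of drop never reach the original DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical small frame, used only for this illustration.
df = pd.DataFrame(np.arange(6.0).reshape(2, 3),
                  columns=['one', 'two', 'three'])

# drop returns a new DataFrame; df itself keeps all three columns.
dropped = df.drop(['two'], axis=1)

# Writing NaN into the returned object only changes that object.
dropped.loc[:, 'one'] = np.nan

print(dropped)
print(df)  # still has column 'two' and contains no NaN
```

So even if the assignment were valid syntax, it would be writing into a temporary object rather than into df_test.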

Below are a couple of alternatives you may work with.

`loc` + `pd.Index.difference`

Here, you'll want loc-based assignment:

df_test.loc[rows, df_test.columns.difference(['two'])] = np.nan
df_test

        one       two     three
a       NaN  0.205799       NaN
b  0.296389 -0.508247  0.026844
c  0.970879 -0.549491 -0.056991
d -1.474168 -1.694579  1.493165
e       NaN -0.159641       NaN

Assignment through loc modifies the original DataFrame in place, which is what you want. You can also replace df_test.columns.difference(['two']) with ['one', 'three'] if you prefer.
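As a quick sanity check, the sketch below rebuilds the frame from the question and confirms that only the selected cells change. The boolean column mask in the second half is an alternative that is not part of the answer above; it is shown only as an assumption-labeled variant, and the names `df_a`, `df_b`, and `cols` are illustrative.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(np.random.randn(5, 3),
                       index=['a', 'b', 'c', 'd', 'e'],
                       columns=['one', 'two', 'three'])
rows = pd.Series([True, False, False, False, True], index=df_test.index)

# Variant from the answer: select columns by label with Index.difference.
df_a = df_test.copy()
cols = df_a.columns.difference(['two'])
df_a.loc[rows, cols] = np.nan

# Alternative (not from the answer): a boolean mask over the columns.
df_b = df_test.copy()
df_b.loc[rows, df_b.columns != 'two'] = np.nan

# Both variants blank out the same cells and leave everything else alone.
print(df_a.equals(df_b))
print(df_a['two'].equals(df_test['two']))
print(df_a.loc[~rows].equals(df_test.loc[~rows]))
```

Note that Index.difference returns its labels sorted, which is harmless here because loc assigns by label rather than by position.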

`df.set_value`

For older pandas versions, you can use df.set_value (not in-place). Note that set_value was deprecated in pandas 0.21 and removed in 1.0:

df_test.set_value(df_test.index[rows], df_test.columns.difference(['two']), np.nan)

        one       two     three
a       NaN  1.562233       NaN
b -0.755127 -0.862368 -0.850386
c -0.193353 -0.033097  1.005041
d -1.679028  1.006895 -0.206164
e       NaN -1.376300       NaN
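On pandas versions where set_value no longer exists, the scalar accessor .at is the suggested replacement. The following is a rough sketch of a cell-by-cell equivalent, assuming the same df_test and rows as in the question; for more than a handful of cells the loc assignment shown earlier is likely the better choice.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame(np.random.randn(5, 3),
                       index=['a', 'b', 'c', 'd', 'e'],
                       columns=['one', 'two', 'three'])
rows = pd.Series([True, False, False, False, True], index=df_test.index)

# Row and column labels of the cells to blank out.
target_rows = df_test.index[rows.to_numpy()]
target_cols = df_test.columns.difference(['two'])

# .at sets one scalar at a time, so loop over every (row, column) pair.
for r in target_rows:
    for c in target_cols:
        df_test.at[r, c] = np.nan

print(df_test)
```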

solution1
3 ACCEPTED 2018-05-15 04:51:40
