Difference between df.reindex() and df.loc[]

Question

Say I want to compute the relative complement df2 - df1 between two MultiIndex dataframes. Assuming that they have the same indexing schema, based on what I saw in this answer from Andy Hayden, I could do the following:

diff_indices = df2.index.difference(df1.index)

And then either:

df2 = df2.reindex(diff_indices)

or

df2 = df2.loc[diff_indices]

What would be the difference between the two options above? More generally, what is the difference between df.reindex and df.loc?

Answer 1

Both approaches return a new Series/DataFrame, and when every requested label is already present in the index (as is the case for diff_indices here) they select the same rows.
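To make the equivalence concrete, here is a minimal sketch on a toy MultiIndex pair. The frames, the level names ("key", "num") and the column name "value" are made up for illustration and do not come from the question; the sketch also assumes a pandas version where Index.difference is available.

```python
import pandas as pd

# Hypothetical sample data: two frames sharing the same two-level index schema.
idx1 = pd.MultiIndex.from_tuples([("a", 1), ("a", 2)], names=["key", "num"])
idx2 = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["key", "num"]
)
df1 = pd.DataFrame({"value": [10, 20]}, index=idx1)
df2 = pd.DataFrame({"value": [10, 20, 30]}, index=idx2)

# Labels that are in df2 but not in df1.
diff_indices = df2.index.difference(df1.index)

# Both calls build a new object; df2 itself is not modified.
via_reindex = df2.reindex(diff_indices)
via_loc = df2.loc[diff_indices]

print(via_reindex.equals(via_loc))  # same rows either way
print(len(df2))                     # df2 still has all of its rows
```

Because neither call changes df2 in place, the result has to be assigned back (df2 = ...) if the intent is to replace the original frame.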

The reason for the seeming redundancy is that loc is syntactically limited (you can only pass a single argument to __getitem__), whereas reindex is a method and can therefore take various optional parameters (docs).
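As a rough illustration of those optional parameters, the sketch below passes fill_value to reindex while asking for a label that is not in the index. The data and names are again hypothetical. The try/except around loc reflects the behaviour of recent pandas versions, which raise KeyError when a list-like key contains missing labels; older releases behaved differently, so treat that part as an assumption about your installed version.

```python
import pandas as pd

# Hypothetical frame with a two-level index.
idx = pd.MultiIndex.from_tuples([("a", 1), ("b", 1)], names=["key", "num"])
df = pd.DataFrame({"value": [10, 30]}, index=idx)

# One label that exists and one that does not.
wanted = pd.MultiIndex.from_tuples([("b", 1), ("z", 9)], names=["key", "num"])

# reindex is a method, so it can take options such as fill_value,
# which supplies the value used for labels missing from the original index.
filled = df.reindex(wanted, fill_value=0)
print(filled)

# loc only receives the key itself, so there is nowhere to pass such an option.
try:
    df.loc[wanted]
    loc_raised = False
except KeyError:
    loc_raised = True
print(loc_raised)
```

Without fill_value, reindex would insert NaN for the missing label instead of 0.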

solution1
7 ACCEPTED 2014-04-09 18:35:31