Pandas DataFrame column (Series) has a different index than the DataFrame?

Question

Consider this small script:

import pandas as pd

aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a
bb.index = bb.index + 1
aa['b'] = bb
print(aa)
print(aa.a - aa.b)

The output is:

   a    b
0  1  NaN
1  2  1.0
2  3  2.0
0    NaN
1    0.0
2    0.0
3    NaN
dtype: float64

while I was expecting aa.a - aa.b to be

0    NaN
1    1.0
2    1.0

How is this possible? Is it a Pandas bug?

Answer 1

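aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a
bb.index = bb.index + 1
aa['b'] = bb
aa = aa.reset_index(drop=True)  # add this; reset_index returns a new DataFrame, so assign it back

Your indexes do not match: bb was relabeled to 1, 2, 3, while aa is still indexed 0, 1, 2.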
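The mismatch can be reproduced without taking a column out of the frame first. The following is a minimal sketch, assuming a hand-built Series (the names frame and shifted are illustrative and not from the question): assigning a Series as a column aligns it on the frame's index labels rather than on position.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2, 3]})            # index labels 0, 1, 2
shifted = pd.Series([1, 2, 3], index=[1, 2, 3])   # index labels 1, 2, 3

# Column assignment aligns on labels, not on position.
frame["b"] = shifted

print(frame)
```

Label 0 has no match in shifted, so that row of b becomes NaN, and label 3 is discarded because the frame has no such row.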

Answer 2

When you do aa.b - aa.a, you're subtracting two pandas.Series that have the same length but not the same index:

aa.a

1    1
2    2
3    3
Name: a, dtype: int64

Whereas:

aa.b

0    NaN
1    1.0
2    2.0
Name: b, dtype: float64

And when you do:

print(aa.b - aa.a)

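you're printing the result of aligning these two pandas.Series on their index labels (this happens regardless of the operation, addition or subtraction). That is why the indices [0,1,2] and [1,2,3] are merged into a new index from 0 to 3: [0,1,2,3]. A label that exists in only one of the two Series yields NaN.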
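The alignment described above can be checked in isolation. This is a small sketch, assuming two Series built by hand with the same labels and values as aa.a and aa.b shown earlier; it also contrasts label-based subtraction with a purely positional one on the underlying NumPy arrays.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=[1, 2, 3], name="a")
b = pd.Series([float("nan"), 1.0, 2.0], index=[0, 1, 2], name="b")

# Arithmetic aligns both operands on the union of their labels.
diff = a - b
print(diff.index.tolist())   # [0, 1, 2, 3]
print(diff)

# Subtracting the underlying arrays ignores labels and works by position.
by_position = a.to_numpy() - b.to_numpy()
print(by_position)
```

Only labels 1 and 2 exist in both Series, so only those rows hold a number in diff. The positional version has three elements and matches what the question expected.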

For instance, if you shift your bb.index by 2 instead of 1:

bb.index = bb.index + 2

this time you will get 5 rows in the resulting pandas.Series instead of 4, and so on:

bb.index = bb.index + 2
aa['b'] = bb
print(aa.a - aa.b)

0    NaN
1    NaN
2    0.0
3    NaN
4    NaN
dtype: float64

Answer 3

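Copy the column before changing its index, so that the Series held by the DataFrame is left untouched. Use this code to get what you expect: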

aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a.copy()
bb.index = bb.index + 1
aa['b'] = bb
print(aa)
print(aa.a - aa.b)
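If the goal is simply a column lagged by one row, relabeling the index may not be needed at all. Here is a sketch under that assumption, using Series.shift (which the answers here do not mention) in place of index arithmetic:

```python
import pandas as pd

aa = pd.DataFrame({"a": [1, 2, 3]})

# shift(1) moves the values down one row and keeps the index 0, 1, 2.
aa["b"] = aa["a"].shift(1)

result = aa["a"] - aa["b"]
print(result)
```

Because both columns keep the frame's own index, the subtraction lines up row by row and gives NaN, 1.0, 1.0.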

3 answers

solution1
2 2021-09-30 13:06:00

solution2
1 2021-09-30 14:05:24

solution3
0 2022-04-07 20:08:34
