Pandas DataFrame column (Series) has different index than the DataFrame?
Consider this small script:
import pandas as pd
aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a
bb.index = bb.index + 1
aa['b'] = bb
print(aa)
print(aa.a - aa.b)
The output is:
   a    b
0  1  NaN
1  2  1.0
2  3  2.0
0    NaN
1    0.0
2    0.0
3    NaN
while I was expecting aa.a - aa.b to be:
0 NaN
1 1.0
2 1.0
How is this possible? Is it a Pandas bug?
aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a
bb.index = bb.index + 1
aa['b'] = bb
aa = aa.reset_index(drop=True)  # add this; reset_index returns a new DataFrame, so assign it back
Your index does not match.
When you do aa.b - aa.a, you're subtracting two pandas.Series that have the same length but not the same index:
aa.a
1 1
2 2
3 3
Name: a, dtype: int64
Whereas:
aa.b
0 NaN
1 1.0
2 2.0
Name: b, dtype: float64
And when you do:
print(aa.b - aa.a)
you're printing the result of aligning these two pandas.Series on their index labels (regardless of the operation type: addition or subtraction). That's why the indices [0,1,2] and [1,2,3] are merged into a new index from 0 to 3: [0,1,2,3]. Any label that exists in only one of the two Series gets NaN in the result.
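The alignment described above can be reproduced with two hand-built Series, which avoids any dependence on how a particular pandas version hands out DataFrame columns. This is only a minimal sketch; the names left, right, diff and positional are made up for illustration, and the values mirror aa.b and aa.a from the answer.

```python
import numpy as np
import pandas as pd

# Same values and labels as aa.b and aa.a in the answer above.
left = pd.Series([np.nan, 1.0, 2.0], index=[0, 1, 2], name="b")
right = pd.Series([1, 2, 3], index=[1, 2, 3], name="a")

# Arithmetic aligns on labels: the result index is the union of both
# indexes, and a label missing from either side produces NaN.
diff = left - right
print(diff)

# Subtracting the underlying arrays ignores the labels and works by position.
positional = left.to_numpy() - right.to_numpy()
print(positional)
```

The first result has four rows (labels 0 to 3), while the positional version keeps three elements because no alignment takes place.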
For instance, if you shift your bb.index by 2 instead of 1:
bb.index = bb.index + 2
this time you will have 5 rows in your new pandas.Series instead of 4, and so on:
bb.index = bb.index + 2
aa['b'] = bb
print(aa.a - aa.b)
0 NaN
1 NaN
2 0.0
3 NaN
4 NaN
dtype: float64
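The "and so on" can be checked by counting the rows for several shifts. The helper aligned_rows below is hypothetical (it is not part of pandas or of the original answer); it builds the two Series by hand, with one index shifted, and returns the length of their difference.

```python
import pandas as pd

def aligned_rows(shift, n=3):
    """Rows in a - b when a's labels are 0..n-1 shifted by `shift` (illustrative helper)."""
    a = pd.Series(range(1, n + 1), index=range(shift, n + shift))
    b = pd.Series(range(1, n + 1), index=range(n))
    # The result index is the union of both label sets.
    return len(a - b)

for shift in (1, 2, 3, 4):
    print(shift, aligned_rows(shift))
```

The row count grows by one per shift step until the two label sets no longer overlap, after which it stays at 2 * n.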
Use this code to get what you expect. Calling .copy() gives bb its own Series, so reassigning bb.index no longer changes the index of aa.a:
aa = pd.DataFrame({'a': [1,2,3]})
bb = aa.a.copy()
bb.index = bb.index + 1
aa['b'] = bb
print(aa)
print(aa.a - aa.b)
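As a quick check of the copy-based version, the sketch below asserts that the column keeps the DataFrame's index and that the difference matches the output the question expected. The names diff and expected are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

aa = pd.DataFrame({'a': [1, 2, 3]})
bb = aa['a'].copy()        # independent Series, not tied to aa
bb.index = bb.index + 1    # labels become 1, 2, 3
aa['b'] = bb               # assignment aligns bb to aa's index 0, 1, 2

# The column still carries the DataFrame's own index.
assert list(aa['a'].index) == [0, 1, 2]

diff = aa['a'] - aa['b']
expected = pd.Series([np.nan, 1.0, 1.0])
pd.testing.assert_series_equal(diff, expected)
print(diff)
```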