Inconsistent results when adding a new column to a Pandas DataFrame: is it a Series or a value?

Question

So I know I can add a new column trivially in Pandas like this:

df
=====
  A
1 5
2 6
3 7

df['new_col'] = "text"

df
====
  A    new_col
1 5    text
2 6    text
3 7    text

And I can also set a new column based on an operation on an existing column.

def times_two(x):
    return x * 2

df['newer_col'] = times_two(df.A)
df
====
  A    new_col   newer_col
1 5    text      10
2 6    text      12
3 7    text      14

However, when I try to operate on a text column I get an unexpected AttributeError.

df['new_text'] = df['new_col'].upper()
AttributeError: 'Series' object has no attribute 'upper'

It is now treating the value as a Series, not the value in that "cell".

Why does this happen with text and not with numbers, and how can I update my DF with a new column based on an existing text column?

Answer 1

It's because the * operator is implemented for a Series (as mul), so it is applied element-wise, whilst upper isn't defined for a Series. You have to use str.upper, which is available through the .str accessor for a Series that holds strings:

In[53]:
df['new_text'] = df['new_col'].str.upper()
df

Out[53]: 
   A new_col new_text
1  5    text     TEXT
2  6    text     TEXT
3  7    text     TEXT
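
As a minimal sketch of the difference, assuming the same small frame as in the question (the variable names below are illustrative, not from the original post): arithmetic operators are defined on a Series and work element-wise, while string methods are only reachable through the .str accessor. Series.map with a plain Python function is shown as one alternative.

```python
import pandas as pd

# Rebuild the frame from the question (index 1..3, one numeric column).
df = pd.DataFrame({"A": [5, 6, 7]}, index=[1, 2, 3])
df["new_col"] = "text"

# `*` on a Series is element-wise; Series.mul is the method form of the same thing.
doubled = df["A"] * 2
same_as_mul = df["A"].mul(2)

# A Series has no `upper` attribute, which is why the question's call fails.
has_upper = hasattr(df["new_col"], "upper")

# The .str accessor exposes string methods and applies them per element.
upper = df["new_col"].str.upper()

# Alternative: map a plain Python function over each cell value.
upper_via_map = df["new_col"].map(str.upper)

print(doubled.tolist(), has_upper, upper.tolist())
```

The .str accessor is usually the more readable choice for text columns; map is handy when the per-cell logic is an arbitrary function.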

There is no magic here.

For df['new_col'] this is just assigning a scalar value and conforming to broadcasting rules, where the scalar is broadcast to the length of the df along the minor axis. For an explanation of that, see: What does the term "broadcasting" mean in Pandas documentation?
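
The scalar assignment can be sketched as follows, again assuming the frame from the question (the name `explicit` is illustrative only). Assigning a scalar fills every row with that value, which gives the same result as assigning a Series of matching length and index.

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 6, 7]}, index=[1, 2, 3])

# A scalar on the right-hand side is repeated for every row of the frame.
df["new_col"] = "text"

# The explicit equivalent: one value per row, sharing the frame's index.
explicit = pd.Series(["text"] * len(df), index=df.index)

# A Series on the right-hand side is matched to rows by index label instead.
df["newer_col"] = df["A"] * 2

print(df)
```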

Answer 1: 1, accepted, 2019-04-12 15:07:18
