So I know I can add a new column trivially in Pandas like this:
df
=====
A
1 5
2 6
3 7
df['new_col'] = "text"
df
====
A new_col
1 5 text
2 6 text
3 7 text
And I can also set a new column based on an operation on an existing column.
def times_two(x):
    return x * 2
df['newer_col'] = times_two(df.A)
df
====
A new_col newer_col
1 5 text 10
2 6 text 12
3 7 text 14
However, when I try to operate on a text column, I get an unexpected AttributeError.
df['new_text'] = df['new_col'].upper()
AttributeError: 'Series' object has no attribute 'upper'
It is now treating the value as a series, not the value in that "cell".
Why does this happen with text and not with numbers, and how can I update my DF with a new column based on an existing text column?
It's because the * operator is implemented for a Series as an element-wise mul operation, whilst upper is a method of individual strings and isn't defined for a Series. You have to use str.upper, which is implemented for a Series that holds strings:
In[53]:
df['new_text'] = df['new_col'].str.upper()
df
Out[53]:
A new_col new_text
1 5 text TEXT
2 6 text TEXT
3 7 text TEXT
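As a rough sketch of the difference described above (the variable names doubled, upper_vectorised and upper_mapped are only illustrative, and the frame is rebuilt from the question's sample data), arithmetic goes through the operators a Series overloads, while string methods are reached through the .str accessor or by mapping the plain Python method over each value:

```python
import pandas as pd

# Rebuild the sample frame from the question; the scalar "text" fills every row.
df = pd.DataFrame({"A": [5, 6, 7], "new_col": "text"}, index=[1, 2, 3])

# Series overloads *, so this multiplies each element.
doubled = df["A"] * 2

# String methods are exposed on a Series through the .str accessor.
upper_vectorised = df["new_col"].str.upper()

# Alternative: call the ordinary str.upper on each value in turn.
upper_mapped = df["new_col"].map(str.upper)

df["new_text"] = upper_vectorised
print(df)
```

Both approaches should give the same values here; the .str accessor is usually the more idiomatic choice and also copes with missing values.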
There is no magic here.
For df['new_col'] = "text", this is just assigning a scalar value and conforming to broadcasting rules, where the scalar is broadcast to the length of the df along the minor axis. For an explanation of that, see: What does the term "broadcasting" mean in Pandas documentation?
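To make the broadcasting point concrete, here is a minimal sketch (the column name explicit and the variable same are made up for illustration) showing that assigning a scalar gives the same values as assigning a Series that repeats the value once per row:

```python
import pandas as pd

df = pd.DataFrame({"A": [5, 6, 7]}, index=[1, 2, 3])

# Scalar assignment: pandas repeats "text" for every row of the frame.
df["new_col"] = "text"

# The equivalent written out by hand: one value per row, aligned on the index.
same = pd.Series(["text"] * len(df), index=df.index)
df["explicit"] = same

print(df)
```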