Efficiently adding a new column to a Pandas DataFrame with values processed from an existing column?
I have a string column foo in my DataFrame. I need to create a new column bar, whose values are derived from the corresponding foo values by a sequence of string-processing operations: a bunch of str.split() and str.join() calls in this particular case.
What is the most efficient way to do this?
Take a look at pandas' vectorized string methods, which are available on a string column through the .str accessor: http://pandas.pydata.org/pandas-docs/dev/text.html#text-string-methods
# Chain any vectorized string methods (via .str) on the right-hand side
df['bar'] = df['foo']
For example:
import pandas as pd

df = pd.DataFrame(['a c', 'b d'], columns=['foo'])
df['bar'] = df['foo'].str.split(' ').str.join('-')
print(df)
yields:
   foo  bar
0  a c  a-c
1  b d  b-d
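Since the question asks about efficiency, here is a minimal sketch of three ways to apply the same split/join steps, so they can be timed against each other on real data. The helper name process and the column names bar_str, bar_map and bar_list are made up for illustration. Which variant is fastest depends on the data and the pandas version, so it is worth measuring rather than assuming.

```python
import pandas as pd


def process(value):
    # Hypothetical helper: the same split/join steps applied to one string.
    return '-'.join(value.split(' '))


df = pd.DataFrame(['a c', 'b d'], columns=['foo'])

# Vectorized string accessor, as in the answer above.
df['bar_str'] = df['foo'].str.split(' ').str.join('-')

# Element-wise mapping with a plain Python function.
df['bar_map'] = df['foo'].map(process)

# List comprehension over the column values.
df['bar_list'] = [process(v) for v in df['foo']]

print(df)
```

All three columns should hold the same values, so the choice comes down to readability and measured speed (for example with timeit on a realistically sized column).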
Pandas can do this for you. A simple example might look like:
import pandas as pd

foo = ["this", "is an", "example!"]
df = pd.DataFrame({'foo': foo})
df['upper_bar'] = df.foo.str.upper()
df['lower_bar'] = df.foo.str.lower()
# None of the values contains '_', so each one becomes a one-element list
df['split_bar'] = df.foo.str.split('_')
print(df)
which will give you:
        foo upper_bar lower_bar   split_bar
0      this      THIS      this      [this]
1     is an     IS AN     is an     [is an]
2  example!  EXAMPLE!  example!  [example!]
See the link above from Alex.
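A column produced by str.split holds Python lists, and the .str accessor can keep working on those lists. The sketch below is an assumption-laden illustration rather than part of the original answer: it splits on a space instead of '_', and the column names first_word and n_words are invented here.

```python
import pandas as pd

df = pd.DataFrame({'foo': ["this", "is an", "example!"]})

# Split on a space (an assumption; the example above splits on '_').
parts = df['foo'].str.split(' ')

# .str[0] picks the first element of each list.
df['first_word'] = parts.str[0]

# .str.len() gives the number of elements in each list.
df['n_words'] = parts.str.len()

print(df)
```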