How to efficiently add a new column to a Pandas DataFrame with values processed from an existing column?

Question

I have a string column foo in my DataFrame. I need to create a new column bar, whose values are derived from the corresponding foo values by a sequence of string-processing operations (a bunch of str.split() and str.join() calls in this particular case).

What is the most efficient way to do this?

Answer 1

Take a look at the vectorized string methods that pandas exposes through the .str accessor of a column (a Series): http://pandas.pydata.org/pandas-docs/dev/text.html#text-string-methods

# Start from the source column and chain whatever vectorized
# string methods you need (via .str) on the right-hand side
df['bar'] = df['foo']

For example:

import pandas as pd

df = pd.DataFrame(['a c', 'b d'], columns=['foo'])
# split each value on a space, then join the pieces with a hyphen
df['bar'] = df['foo'].str.split(' ').str.join('-')
print(df)

yields

   foo  bar
0  a c  a-c
1  b d  b-d
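Longer sequences of splits and joins can be chained the same way. Here is a minimal sketch, assuming a made-up transformation (keep the first two whitespace-separated tokens and join them with an underscore); the sample data and the rule itself are only illustrative, not taken from the question:

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({'foo': ['a b c', 'd e f']})

# split on whitespace -> keep the first two tokens of each list -> re-join
df['bar'] = df['foo'].str.split().str[:2].str.join('_')
print(df)
```

Each step goes through the .str accessor, so the whole chain operates on the column at once instead of on one row at a time in your own loop.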

Answer 2

Pandas can do this for you. A simple example might look like:

import pandas as pd

foo = ["this", "is an", "example!"]

df = pd.DataFrame({'foo': foo})
df['upper_bar'] = df.foo.str.upper()
df['lower_bar'] = df.foo.str.lower()
# none of the strings contains '_', so each value becomes a one-element list
df['split_bar'] = df.foo.str.split('_')
print(df)

which will give you

        foo  upper_bar  lower_bar   split_bar
0      this       THIS       this      [this]
1     is an      IS AN      is an     [is an]
2  example!   EXAMPLE!   example!  [example!]

See the link above from Alex.
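If some step of your processing has no matching .str method, one fallback is to put the logic in a plain Python function and apply it with Series.map. This is only a sketch: the process function below is a hypothetical placeholder (it reverses the space-separated tokens and joins them with a hyphen), and it will generally run slower than the built-in string methods, so it is worth trying those first.

```python
import pandas as pd

def process(value):
    # Hypothetical transformation, stands in for your own split/join logic:
    # reverse the space-separated tokens and join them with a hyphen
    return '-'.join(reversed(value.split(' ')))

df = pd.DataFrame({'foo': ['a c', 'b d']})
# map calls process() once per element of the column
df['bar'] = df['foo'].map(process)
print(df)
```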

Answer 1
1 accepted 2015-03-09 17:54:26

Answer 2
1 2015-03-09 17:59:11
