Efficiently add a new column to a Pandas DataFrame with values processed from an existing column?

Question

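My DataFrame has a string column foo. I need to create a new column bar whose values are derived from the corresponding foo values through a sequence of string-processing operations; in this particular case, a bunch of str.split() and str.join() calls.

What is the most efficient way to do this?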

Answer 1

Take a look at the vectorized string methods that pandas exposes through the .str accessor. http://pandas.pydata.org/pandas-docs/dev/text.html#text-string-methods

# Start from the existing column and chain whatever vectorized
# .str methods you need on the right-hand side
df['bar'] = df['foo']

For example:

import pandas as pd

df = pd.DataFrame(['a c', 'b d'], columns=['foo'])
# Split each string on spaces, then join the pieces with '-'
df['bar'] = df['foo'].str.split(' ').str.join('-')
print(df)

Output:

   foo  bar
0  a c  a-c
1  b d  b-d
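When the processing involves several steps, there are at least two ways to write it. The sketch below is only an illustration under assumed requirements: the helper reverse_and_join and the "reverse the pieces" step are hypothetical and not part of the original question. It shows the same transformation written once as a chain of .str methods and once as a plain Python function used in a list comprehension. Which of the two is faster can depend on the data, so it is worth timing both on a realistic sample.

```python
import pandas as pd


def reverse_and_join(value):
    # Hypothetical helper: split on spaces, reverse the pieces, join with '-'
    parts = value.split(' ')
    return '-'.join(reversed(parts))


df = pd.DataFrame(['a c', 'b d'], columns=['foo'])

# Option 1: chained vectorized string methods.
# .str[::-1] slices each list produced by split(), reversing its order.
df['bar_str'] = df['foo'].str.split(' ').str[::-1].str.join('-')

# Option 2: an ordinary function applied to each value in a list comprehension
df['bar_listcomp'] = [reverse_and_join(v) for v in df['foo']]

print(df)
```

Both columns hold the same values, so the choice comes down to readability and measured speed.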

Answer 2

Pandas can do this for you. A simple example might look like this:

import pandas as pd

foo = ["this", "is an", "example!"]

df = pd.DataFrame({'foo': foo})
df['upper_bar'] = df.foo.str.upper()
df['lower_bar'] = df.foo.str.lower()
# None of the strings contains '_', so each list holds the whole string
df['split_bar'] = df.foo.str.split('_')
print(df)

This gives you:

        foo  upper_bar  lower_bar   split_bar
0      this       THIS       this      [this]
1     is an      IS AN      is an     [is an]
2  example!   EXAMPLE!   example!  [example!]

See the link from Alex above.
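Because none of the sample strings contains an underscore, split('_') above leaves each string whole. As a small illustrative variation (the column names words and n_words are made up for this sketch), calling str.split() with no argument splits on whitespace instead, and .str.len() then counts the pieces in each resulting list:

```python
import pandas as pd

df = pd.DataFrame({'foo': ["this", "is an", "example!"]})

# With no argument, str.split() splits on runs of whitespace
df['words'] = df['foo'].str.split()

# .str.len() also works on the lists, giving the number of pieces per row
df['n_words'] = df['words'].str.len()

print(df)
```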

2 answers

Answer 1
1 (accepted) 2015-03-09 17:54:26

Answer 2
1 2015-03-09 17:59:11
