Python pandas: remove everything after a delimiter in a string

Question

I have a DataFrame containing values such as:

"vendor a::ProductA"
"vendor b::ProductA"
"vendor a::Productb"

I need to remove everything from the two colons (::) onward, including the colons themselves, so that I end up with:

"vendor a"
"vendor b"
"vendor a"

I tried str.trim (which doesn't seem to exist) and str.split without success. What is the easiest way to accomplish this?

Answer 1

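You can use pandas.Series.str.split just as you would use the ordinary split method. Split on the string '::' and index into the list that split returns:

>>> import pandas as pd
>>> df = pd.DataFrame({'text': ["vendor a::ProductA", "vendor b::ProductA", "vendor a::Productb"]})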
>>> df
                 text
0  vendor a::ProductA
1  vendor b::ProductA
2  vendor a::Productb
>>> df['text_new'] = df['text'].str.split('::').str[0]
>>> df
                 text  text_new
0  vendor a::ProductA  vendor a
1  vendor b::ProductA  vendor b
2  vendor a::Productb  vendor a

Here is a non-pandas solution:

>>> df['text_new1'] = [x.split('::')[0] for x in df['text']]
>>> df
                 text  text_new text_new1
0  vendor a::ProductA  vendor a  vendor a
1  vendor b::ProductA  vendor b  vendor b
2  vendor a::Productb  vendor a  vendor a
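
One caveat with the list comprehension: it calls split on every element, so a missing value in the column would raise an AttributeError. A minimal sketch of a guarded version follows; the None entry is made-up sample data added only to show the behaviour.

```python
import pandas as pd

# Hypothetical sample data: the None row is an assumption, not from the question
df = pd.DataFrame({"text": ["vendor a::ProductA", None, "vendor b::ProductA"]})

# Split only real strings; pass anything else (None, NaN) through unchanged
df["text_new1"] = [
    x.split("::")[0] if isinstance(x, str) else x
    for x in df["text"]
]

print(df)
```

The .str accessor version needs no such guard, because pandas string methods skip missing values on their own.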

Edit: here is a step-by-step explanation of what happens in the pandas version above:

# Select the pandas.Series object you want
>>> df['text']
0    vendor a::ProductA
1    vendor b::ProductA
2    vendor a::Productb
Name: text, dtype: object

# using pandas.Series.str allows us to implement "normal" string methods 
# (like split) on a Series
>>> df['text'].str
<pandas.core.strings.StringMethods object at 0x110af4e48>

# Now we can use the split method to split on our '::' string. You'll see that
# a Series of lists is returned (just like what you'd see outside of pandas)
>>> df['text'].str.split('::')
0    [vendor a, ProductA]
1    [vendor b, ProductA]
2    [vendor a, Productb]
Name: text, dtype: object

# using the pandas.Series.str method, again, we will be able to index through
# the lists returned in the previous step
>>> df['text'].str.split('::').str
<pandas.core.strings.StringMethods object at 0x110b254a8>

# now we can grab the first item in each list above for our desired output
>>> df['text'].str.split('::').str[0]
0    vendor a
1    vendor b
2    vendor a
Name: text, dtype: object

I recommend looking at the pandas.Series.str documentation or, better yet, the "Working with text data" guide in the pandas documentation.
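
As a script rather than a REPL session, the same approach might look like the sketch below. The n=1 argument and the extra sample rows are additions for illustration: n=1 stops after the first '::', so a value with several delimiters is split only once, and a value with no delimiter comes back unchanged.

```python
import pandas as pd

# Sample rows; the last two are hypothetical edge cases, not from the question
df = pd.DataFrame({
    "text": [
        "vendor a::ProductA",
        "vendor b::ProductA",
        "vendor c::ProductC::extra",
        "no delimiter",
    ]
})

# Split at most once on '::' and keep the first piece of each resulting list
df["text_new"] = df["text"].str.split("::", n=1).str[0]

print(df["text_new"].tolist())
# ['vendor a', 'vendor b', 'vendor c', 'no delimiter']
```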

Answer 2

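If the values are in a specific column (named column) of a DataFrame (named dataframe), you can also use a regular-expression replace:

dataframe["new_column"] = dataframe.column.str.replace(r"::.*", "", regex=True)

This gives you the following result:

               column new_column
0  vendor a::ProductA   vendor a
1  vendor b::ProductA   vendor b
2  vendor a::Productb   vendor a

With this approach you don't need to specify any position, because the pattern removes the '::' and everything after it.

I hope this helps. Good luck!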
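
For reference, a self-contained sketch of the regex replacement. Passing regex=True explicitly matters on pandas 2.0 and later, where str.replace treats the pattern as a literal string by default. The last two sample rows are hypothetical, added only to show the edge cases.

```python
import pandas as pd

dataframe = pd.DataFrame({
    "column": [
        "vendor a::ProductA",
        "vendor c::ProductC::extra",  # hypothetical: more than one delimiter
        "no delimiter",               # hypothetical: no delimiter at all
    ]
})

# '::.*' matches the first '::' and everything after it on the same line
dataframe["new_column"] = dataframe["column"].str.replace(r"::.*", "", regex=True)

print(dataframe["new_column"].tolist())
# ['vendor a', 'vendor c', 'no delimiter']
```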

Answer 3

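You can use str.replace(":", " ") to replace each ":" with a space. To split, you need to specify the character to split on: str.split(" ")

The trim function is called strip in Python: str.strip()

Also, you can use str[:8] to get just "vendor x" from the string, as long as the part you want always has that fixed length.

Good luck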
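
For a single plain Python string (no pandas), the strip and split methods mentioned above can be combined as in this rough sketch. str.partition is a swapped-in alternative that this answer does not mention; it splits on the first occurrence of the separator only. The padded input string is made up for the example.

```python
# Hypothetical input with stray whitespace around it
raw = "  vendor a::ProductA  "

# strip() trims surrounding whitespace; split('::')[0] keeps the text before '::'
vendor_from_split = raw.strip().split("::")[0]

# partition() returns (before, separator, after) for the first '::' found
vendor, sep, rest = raw.strip().partition("::")

print(vendor_from_split)  # vendor a
print(vendor, rest)       # vendor a ProductA
```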

Answer 4

Alternatively, you can use extract, which returns the part of the string matched by the parentheses (the capture group):

In [3]: df.assign(result=df['column'].str.extract('(.*)::'))
Out[3]: 
               column    result
0  vendor a::ProductA  vendor a
1  vendor b::ProductA  vendor b
2  vendor a::Productb  vendor a
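
Two details of extract are worth seeing in a small sketch. The pattern here is changed to the non-greedy '(.*?)::', which is an adjustment to the one above: the greedy '(.*)::' captures up to the last '::' when a value contains several. Rows with no '::' do not match at all and come back as missing values. expand=False is used so the result is a Series, and the last two sample rows are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "column": [
        "vendor a::ProductA",
        "vendor c::ProductC::extra",  # hypothetical: more than one delimiter
        "no delimiter",               # hypothetical: pattern does not match
    ]
})

# Non-greedy capture: take as little as possible before the first '::'
result = df["column"].str.extract(r"(.*?)::", expand=False)

print(result)
```

If unmatched rows should keep their original text instead of becoming missing, result.fillna(df["column"]) is one way to restore them.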


Solution 1
102 Accepted 2016-11-20 15:03:43

Solution 2
8 2020-02-02 06:22:32

Solution 3
4 2016-11-20 15:02:31

Solution 4
0 2022-06-26 16:11:43

Solution 5
-5 2016-11-20 15:06:32
