I have a pandas DataFrame that was originally a CSV file. I would like to remove a specific word from all the rows in the column: text

Here is the DataFrame:

id          text                                             spam
4016        Subject: re : vacation vince : i just found ...  0
4017        Subject: re : receipts from visit jim , than...  0
4018        Subject: re : enron case study update wow ! a... 0
4019        Subject: re : interest david , please , call...  0
4020        Subject: news : aurora 5 . 2 update aurora ve... 0

I would like to remove the word "Subject:" from the "text" column in every row, so that it becomes:

id          text                                    spam
4016        re : vacation vince : i just found ...  0
4017        re : receipts from visit jim , than...  0
4018        re : enron case study update wow ! a... 0
4019        re : interest david , please , call...  0
4020        news : aurora 5 . 2 update aurora ve... 0

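I think you need replace with regex=True. In the pattern, ^ matches the start of each string and \s+ matches one or more whitespace characters: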

# Raw string so that \s reaches the regex engine unchanged
df['text'] = df['text'].replace(r'^Subject:\s+', '', regex=True)
print(df)
     id                                     text  spam
0  4016   re : vacation vince : i just found ...     0
1  4017   re : receipts from visit jim , than...     0
2  4018  re : enron case study update wow ! a...     0
3  4019   re : interest david , please , call...     0
4  4020  news : aurora 5 . 2 update aurora ve...     0
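
For anyone who wants to try this without the original CSV, here is a minimal self-contained sketch. The two rows are copied from the question, including the truncated "..." endings, and it uses Series.str.replace, which behaves the same as replace(..., regex=True) for this pattern. Treat it as an illustration rather than the only way to do it.

```python
import pandas as pd

# Two sample rows taken from the question; the text is truncated there,
# so the trailing "..." is kept exactly as shown.
df = pd.DataFrame({
    "id": [4016, 4020],
    "text": [
        "Subject: re : vacation vince : i just found ...",
        "Subject: news : aurora 5 . 2 update aurora ve...",
    ],
    "spam": [0, 0],
})

# Anchored pattern: only a leading "Subject:" and the whitespace after it
# are removed; a "Subject:" later in the string is left alone.
df["text"] = df["text"].str.replace(r"^Subject:\s+", "", regex=True)
print(df["text"].tolist())
```

Because the pattern is anchored with ^, rows that do not start with "Subject:" pass through unchanged.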

But if you need to remove the first 9 characters, including the whitespace:

df['text'] = df['text'].str[9:]

Try this:

df.text = df.text.apply(lambda row: row[9:])

Every row's "text" value is changed: the first 9 characters, "Subject: ", are removed.
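
One caveat with slicing: it drops the first 9 characters of every row, whether or not they are "Subject: ". The sketch below compares plain slicing with a guarded version that only strips rows that really start with the prefix. The second sample row and the names prefix, sliced and guarded are made up for illustration and are not from the question.

```python
import pandas as pd

# First row is from the question (shortened); the second is a hypothetical
# row without the prefix, added only to show the difference.
s = pd.Series(["Subject: re : vacation", "fw : no prefix here"])

prefix = "Subject: "  # 9 characters, including the trailing space

# Plain slicing: removes 9 characters from every row, prefix or not.
sliced = s.str[len(prefix):]

# Guarded slicing: keep the original value where the row does NOT start
# with the prefix, otherwise use the sliced value.
guarded = s.where(~s.str.startswith(prefix), s.str[len(prefix):])

print(sliced.tolist())   # ['re : vacation', 'refix here']
print(guarded.tolist())  # ['re : vacation', 'fw : no prefix here']
```

If every row is known to start with "Subject: ", plain slicing is fine and probably the fastest option; otherwise the anchored regex or the guarded version is safer.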

