I have a pandas DataFrame that was originally a CSV file. I would like to remove a specific word from every row of the column: text
This is the pandas DataFrame:
id    text                                                spam
4016  Subject: re : vacation vince : i just found ...     0
4017  Subject: re : receipts from visit jim , than...     0
4018  Subject: re : enron case study update wow ! a...    0
4019  Subject: re : interest david , please , call...     0
4020  Subject: news : aurora 5 . 2 update aurora ve...    0
I want to remove the word "Subject:" from the "text" column in every row, so that it becomes:
id    text                                       spam
4016  re : vacation vince : i just found ...     0
4017  re : receipts from visit jim , than...     0
4018  re : enron case study update wow ! a...    0
4019  re : interest david , please , call...     0
4020  news : aurora 5 . 2 update aurora ve...    0
I think you need replace, where ^ matches the start of each string and \s+ matches one or more whitespace characters:
df['text'] = df['text'].replace(r'^Subject:\s+', '', regex=True)
print(df)
id text spam
0 4016 re : vacation vince : i just found ... 0
1 4017 re : receipts from visit jim , than... 0
2 4018 re : enron case study update wow ! a... 0
3 4019 re : interest david , please , call... 0
4 4020 news : aurora 5 . 2 update aurora ve... 0
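For anyone who wants to try the regex approach without the original CSV, here is a minimal, self-contained sketch. The sample rows are made up for illustration (modelled on the question's data), and the third row is a hypothetical case added only to show that the ^ anchor leaves a "Subject:" that is not at the start of the string untouched.

```python
import pandas as pd

# Hypothetical sample data modelled on the rows in the question.
df = pd.DataFrame({
    "id": [4016, 4020, 9999],
    "text": [
        "Subject: re : vacation vince",
        "Subject:   news : aurora 5 . 2 update",  # several spaces after the colon
        "fwd : Subject: kept as is",              # "Subject:" not at the start
    ],
    "spam": [0, 0, 0],
})

# ^ anchors the match at the start of each string;
# \s+ consumes one or more whitespace characters after the colon.
df["text"] = df["text"].replace(r"^Subject:\s+", "", regex=True)

print(df["text"].tolist())
```

The raw-string prefix (r'...') keeps Python from treating \s as a string escape sequence.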
But if you need to remove the first 9 characters, including whitespace:
df['text'] = df['text'].str[9:]
Try this:
df.text = df.text.apply(lambda row: row[9:])
Each row in the "text" column will be changed, with the first 9 characters ("Subject: ") removed.
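One caveat with slicing is that it removes the first 9 characters of every row, whether or not the row actually starts with "Subject: ". The sketch below compares the plain slice with a guarded variant; the sample rows, the PREFIX constant, and the variable names are assumptions made for illustration, not part of the original answers.

```python
import pandas as pd

PREFIX = "Subject: "  # 9 characters: "Subject:" (8) plus one space

# Hypothetical sample data: the second row has no prefix.
df = pd.DataFrame({"text": ["Subject: re : vacation", "no prefix here"]})

# Unconditional slice: drops the first 9 characters of every row.
sliced = df["text"].str[len(PREFIX):]

# Guarded slice: only rows that really start with the prefix are shortened.
mask = df["text"].str.startswith(PREFIX)
guarded = df["text"].where(~mask, df["text"].str[len(PREFIX):])

print(sliced.tolist())   # second row loses its first 9 characters too
print(guarded.tolist())  # second row is left unchanged
```

If every row is known to start with the same prefix, the plain slice is the simplest option; the guarded form may be safer when the data is less uniform.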