Replace multiple words in a dataframe
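I want to replace words as described here, but in a column of a dataframe. I also want to keep the original column and the other columns in the dataframe.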
a = ["isn't", "can't"]
b = ["is not", "cannot"]
for line in df['text']:
    for a1, b1 in zip(a, b):
        line = line.replace(a1, b1)
    df['text1'].write(line)
TypeError: expected str, bytes or os.PathLike object, not Series
Input dataframe
ID text
1 isn't bad
2 can't play
Desired output
ID text text1
1 isn't bad is not bad
2 can't play cannot play
Please help. Thank you.
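If you have two lists a and b, the best approach is to call .replace on the column and pass regex=True (without regex=True, .replace only matches whole cell values, not substrings):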
a = ["isn't", "can't"]
b = ["is not", "cannot"]
# df=pd.read_clipboard('\s\s+')
df['text1'] = df['text'].replace(a,b,regex=True)
df
Out[68]:
ID text text1
0 1 isn't bad is not bad
1 2 can't play cannot play
Note that a and b should have the same length. This technique is fine for a small list, but for a larger list you may want to build a dictionary.
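For a larger set of replacements, the dictionary idea could look roughly like the sketch below. The name replacements is hypothetical, and escaping the keys with re.escape is an added precaution (an assumption, not part of the original answer) so that characters such as "." or "?" in a key are matched literally.

```python
import re
import pandas as pd

df = pd.DataFrame({"ID": [1, 2], "text": ["isn't bad", "can't play"]})

# Hypothetical lookup table; extend it with as many pairs as needed.
replacements = {"isn't": "is not", "can't": "cannot"}

# Escape the keys so they are treated as literal text when regex=True.
patterns = {re.escape(old): new for old, new in replacements.items()}

# regex=True makes replace work on substrings instead of whole cell values.
df["text1"] = df["text"].replace(patterns, regex=True)
print(df)
```

The original text column is left untouched, and the result goes into the new text1 column.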
You can achieve this by combining the apply method on a dataframe column with a lambda function, as follows:
import pandas as pd
a = ["isn't", "can't"]
b = ['is not', 'cannot']
df = pd.DataFrame({'id': [1,2], 'text': ["isn't bad", "can't play"]})
df['a'], df['b'] = a,b
print(df.head())
The dataframe looks like this:
id text a b
0 1 isn't bad isn't is not
1 2 can't play can't cannot
You can now apply the replacement across this dataframe like so:
df['vals'] = pd.Series(map(lambda x,y,z: x.replace(y, z), list(df.text), list(df.a), list(df.b)))
print(df.head())
Final output:
id text a b vals
0 1 isn't bad isn't is not is not bad
1 2 can't play can't cannot cannot play
You can use the vals column for your analysis, or extract only the columns you need.
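Note that the snippet above relies on the built-in map and pairs each row with a single replacement, so it only works when row i needs exactly pair i. A minimal sketch that uses Series.apply and tries every pair on every row might look like this; replace_all is a hypothetical helper name, not something from the original answer.

```python
import pandas as pd

a = ["isn't", "can't"]
b = ["is not", "cannot"]
df = pd.DataFrame({"id": [1, 2], "text": ["isn't bad", "can't play"]})

def replace_all(text, old_words=a, new_words=b):
    """Apply every (old, new) pair to one string (hypothetical helper)."""
    for old, new in zip(old_words, new_words):
        text = text.replace(old, new)
    return text

# Series.apply calls replace_all once per cell of the text column.
df["vals"] = df["text"].apply(replace_all)
print(df)
```

This version does not need the helper columns a and b in the dataframe, and the two lists no longer have to match the number of rows.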
Well, you could use a lookup table to change the words:
import pandas as pd
data = {
    'text': ["isn't bad", "can't play"]
}
table = {
    "isn't": "is not",
    "can't": "cannot"
}
df = pd.DataFrame(data)
revised_text = []
for text in df['text']:
    # Swap each word found in the lookup table; keep every other word as-is
    words = [table.get(word, word) for word in text.split()]
    # Append exactly one result per row so the new column lines up with df
    revised_text.append(' '.join(words))
df['text1'] = revised_text
print(df)
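The same lookup table can also be applied without an explicit Python loop. The following is only a sketch of one possible approach: it joins the keys into a single regular expression and lets Series.str.replace look up each match through a callable. The variable name pattern and the third sample row are illustrative assumptions.

```python
import re
import pandas as pd

table = {"isn't": "is not", "can't": "cannot"}
df = pd.DataFrame(
    {"text": ["isn't bad", "can't play", "isn't it fun, can't wait"]}
)

# Longest keys first, so a key that is a prefix of another cannot win early.
keys = sorted(table, key=len, reverse=True)
pattern = "|".join(re.escape(key) for key in keys)

# The callable receives each regex match and returns its replacement.
df["text1"] = df["text"].str.replace(
    pattern, lambda m: table[m.group(0)], regex=True
)
print(df)
```

Because the pattern matches substrings, this also handles words followed by punctuation and several replacements within one row.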
Here is one option.
df['text1'] = df['text']
for i in range(len(a)):
    # regex=False treats a[i] as literal text on every pandas version
    df['text1'] = df['text1'].str.replace(a[i], b[i], regex=False)
Here is another approach that does not involve iteration.
replacedict = {"isn't": "is not",
               "can't": "cannot"}
text = df['text']
df = (df.assign(text=df['text'].str.split(' '))
        .explode('text')
        .replace(replacedict)
        .groupby('id')
        .agg({'text': lambda x: ' '.join(x)})
        .reset_index())
df['text1'] = df['text']
df['text'] = text