
Python: Concatenate string separated by comma in pandas series

After applying the TextBlob spelling corrector, the sentence in each row comes back as individual characters separated by commas.

from textblob import TextBlob
sentences = df['sentence'].tolist()  # renamed from `list` to avoid shadowing the built-in

def TBSpellCorrector(sentence):
    b = TextBlob(sentence)
    return b.correct()

df['corrected_sentence']=df['sentence'].apply(TBSpellCorrector)

Result:

    sentence         corrected_sentence
132 on fre     (o, n,, f, i, r, e)             
35  beautful    (b, e, a, u, t, i, f, u, l)    

I need to join the comma-separated characters back into a single sentence.

Expected Output
    sentence    corrected_sentence             corrected_sentence2
132 on fre      (o, n,, f, i, r, e)            on fire
35  beautful    (b, e, a, u, t, i, f, u, l)    beautiful

If corrected_sentence is in the form of a list, you can combine its elements with join:

>>> sent = ['o','n',' ','f','i','r','e']
>>> ''.join(sent)
'on fire'
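
The same join can be applied row by row to a whole column. The following is only a sketch: it assumes each cell of corrected_sentence really holds a plain tuple or list of single-character strings, and the DataFrame here is hand-built to mimic the output shown in the question.

```python
import pandas as pd

# Hand-built data mimicking the question's column (an assumption):
# each cell is a sequence of single-character strings.
df = pd.DataFrame(
    {
        "corrected_sentence": [
            ("o", "n", " ", "f", "i", "r", "e"),
            ("b", "e", "a", "u", "t", "i", "f", "u", "l"),
        ]
    },
    index=[132, 35],
)

# Join the characters of every row into one string.
df["corrected_sentence2"] = df["corrected_sentence"].map(lambda chars: "".join(chars))

print(df["corrected_sentence2"].tolist())
```

If the cells are TextBlob objects rather than real tuples, converting them with str is the more direct route.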

The .correct() method returns a textblob.blob.TextBlob object. You just need to convert it to a string, or access its .string attribute:

from textblob import TextBlob
import pandas as pd

def TBSpellCorrector(sentence):
    return TextBlob(sentence).correct().string # <<< See here

df = pd.DataFrame({'sentence':['on fre','beautful']})
df['sentence'].apply(TBSpellCorrector)
# 0       on are
# 1    beautiful
# Name: sentence, dtype: object
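
The other option mentioned above, converting the result with str, might be sketched as follows. To keep the example runnable without textblob installed, it uses a hypothetical stand-in class (FakeBlob) and a hypothetical lookup-table corrector (fake_corrector); neither is part of textblob, and add_corrected_column is a helper name invented here for illustration.

```python
import pandas as pd


class FakeBlob:
    """Hypothetical stand-in for a blob-like result; not part of textblob."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text


def fake_corrector(sentence):
    # Hypothetical lookup table standing in for TextBlob(sentence).correct().
    fixes = {"beautful": "beautiful"}
    return FakeBlob(fixes.get(sentence, sentence))


def add_corrected_column(df, corrector, source="sentence", target="corrected_sentence"):
    """Return a copy of df whose target column stores the corrector's results as plain str."""
    out = df.copy()
    out[target] = out[source].map(lambda s: str(corrector(s)))
    return out


df = pd.DataFrame({"sentence": ["on fre", "beautful"]})
result = add_corrected_column(df, fake_corrector)

# With textblob installed, the corrector could instead be:
#   lambda s: TextBlob(s).correct()
print(result)
```

Storing plain strings rather than blob objects keeps the column displaying as ordinary text and lets the usual pandas string methods work on it.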
