
Python: Concatenate string separated by comma in pandas series

After I apply a TextBlob spell corrector, the sentence in each row is displayed as individual characters separated by commas.

from textblob import TextBlob
list = df['sentence'].tolist()

def TBSpellCorrector(sentence):
    b = TextBlob(sentence)
    return b.correct()

df['corrected_sentence']=df['sentence'].apply(TBSpellCorrector)

Result:

    sentence    corrected_sentence
132 on fre      (o, n,, f, i, r, e)
35  beautful    (b, e, a, u, t, i, f, u, l)

I need to concatenate the comma-separated characters back into a single sentence.

Expected output:
    sentence    corrected_sentence             corrected_sentence2
132 on fre      (o, n,, f, i, r, e)            on fire
35  beautful    (b, e, a, u, t, i, f, u, l)    beautiful

If corrected_sentence is in list form, you can concatenate its elements with join:

>>> sent = ['o','n',' ','f','i','r','e']
>>> ''.join(sent)
'on fire'
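
To apply the same idea to every row rather than a single list, here is a minimal sketch in plain Python. The `rows` data is a hypothetical stand-in for the character sequences shown in corrected_sentence, and it assumes each row is an iterable of single-character strings:

```python
# Hypothetical rows standing in for the character sequences
# shown in the corrected_sentence column.
rows = [
    ("o", "n", " ", "f", "i", "r", "e"),
    ("b", "e", "a", "u", "t", "i", "f", "u", "l"),
]

# "".join concatenates the characters of one row with no separator,
# so the space character inside the first row is preserved.
joined = ["".join(chars) for chars in rows]

print(joined)  # ['on fire', 'beautiful']
```

With pandas, the same callable could presumably be passed to `Series.apply`, for example `df['corrected_sentence'].apply(''.join)`, under the same assumption about what each cell contains.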

The .correct() method returns a textblob.blob.TextBlob object. You just need to cast it to a string, or access its .string property:

from textblob import TextBlob
import pandas as pd

def TBSpellCorrector(sentence):
    return TextBlob(sentence).correct().string # <<< See here

df = pd.DataFrame({'sentence':['on fre','beautful']})
df['sentence'].apply(TBSpellCorrector)
# 0       on are
# 1    beautiful
# Name: sentence, dtype: object
