spaCy stemming on a pandas df column not working

Question

How do I apply stemming to a pandas DataFrame column?

I am using this function for stemming, and it works fine on a single string:

xx='kenichan dived times ball managed save 50 rest'

def make_to_base(x):
    x_list = []
    doc = nlp(x)
    for token in doc:
        lemma=str(token.lemma_)
        if lemma=='-PRON-' or lemma=='be':
            lemma=token.text
        x_list.append(lemma)
    print(" ".join(x_list))    
make_to_base(xx)

But when I apply this function to my pandas DataFrame column, it neither works nor raises any error:

x = list(df['text']) #my df column
x = str(x)#converting into string otherwise it is giving error
make_to_base(x)

I have tried several other things, but nothing worked. For example:

df["texts"] =  df.text.apply(lambda x: make_to_base(x))

make_to_base(df['text'])

My dataset looks like this:

df['text'].head()
Out[17]: 
0    Hope you are having a good week. Just checking in
1                              K..give back my thanks.
2          Am also doing in cbe only. But have to pay.
3    complimentary 4 STAR Ibiza Holiday or £10,000 ...
4    okmail: Dear Dave this is your final notice to...
Name: text, dtype: object

Answer 1

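Your make_to_base function only prints the result, so it implicitly returns None. You need to actually return the value you build inside it: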

def make_to_base(x):
    x_list = []
    for token in nlp(x):
        lemma=str(token.lemma_)
        if lemma=='-PRON-' or lemma=='be':
            lemma=token.text
        x_list.append(lemma)
    return " ".join(x_list)

Then apply it to the column:

df['texts'] =  df['text'].apply(lambda x: make_to_base(x))
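
To see why the original version appeared to do nothing, here is a minimal sketch that compares a printing function with a returning one under Series.apply. So that it runs without a spaCy model, it uses a hypothetical dictionary lookup (TOY_LEMMAS) as a stand-in for nlp(x) and token.lemma_. That lookup and the name make_to_base_print are assumptions for illustration, not part of spaCy or of the code above.

```python
import pandas as pd

# Hypothetical stand-in for spaCy's lemmatizer, so the example runs without a model.
TOY_LEMMAS = {"dived": "dive", "times": "time", "managed": "manage"}

def make_to_base_print(text):
    # Mirrors the question's version: prints the result and implicitly returns None.
    print(" ".join(TOY_LEMMAS.get(tok, tok) for tok in text.split()))

def make_to_base(text):
    # Mirrors the answer's version: returns the joined string.
    return " ".join(TOY_LEMMAS.get(tok, tok) for tok in text.split())

df = pd.DataFrame({"text": ["kenichan dived times ball", "managed save 50 rest"]})

# apply() stores whatever the function returns for each row.
printed = df["text"].apply(make_to_base_print)  # a Series holding only None
returned = df["text"].apply(make_to_base)       # a Series of processed strings

df["texts"] = returned
```

With the printing version, the new column is filled with None even though the text shows up on the console. Note also that str(list(df['text'])) collapses the whole column into one long string, so the function is called once on that string instead of once per row.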
