Vectorized alternative to iterrows: Semantic Analysis

Question

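Hi, I'm currently doing a semantic analysis of tweets and would like to improve my code's running time with NumPy vectorization.

I've been trying to improve my code for a while, without success. Can I move the body of the loop into a function and apply it with numpy.vectorize?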

ss = SentimentIntensityAnalyzer()

# Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
for index, row in tw_list["full_text"].items():
    score = ss.polarity_scores(row)
    neg = score["neg"]
    neu = score["neu"]
    pos = score["pos"]
    comp = score["compound"]
    if neg > pos:
        tw_list.loc[index, "sentiment"] = "negative"
    elif pos > neg:
        tw_list.loc[index, "sentiment"] = "positive"
    else:
        tw_list.loc[index, "sentiment"] = "neutral"
    # Store the raw scores for every row, not only for neutral tweets.
    tw_list.loc[index, "neg"] = neg
    tw_list.loc[index, "neu"] = neu
    tw_list.loc[index, "pos"] = pos
    tw_list.loc[index, "compound"] = comp

Answer 1

Instead of iterating over the rows of the dataframe, you can use the apply function.

def get_sentiments(text):
    score = ss.polarity_scores(text)
    neg = score["neg"]
    neu = score["neu"]
    pos = score["pos"]
    comp = score["compound"]
    if neg > pos:
        sentiment = "negative"
    elif pos > neg:
        sentiment = "positive"
    else:
        sentiment = "neutral"
    return sentiment,neg,neu,pos,comp
    
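# Series.apply does not take result_type (that is a DataFrame.apply option),
# so build the new columns from the returned tuples (assumes `import pandas as pd`).
cols = ["sentiment", "neg", "neu", "pos", "compound"]
scores = tw_list["full_text"].apply(get_sentiments).tolist()
tw_list[cols] = pd.DataFrame(scores, index=tw_list.index, columns=cols)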

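This should improve performance, mostly because it avoids the repeated row-by-row .loc assignments. Note that polarity_scores still runs once per tweet, and numpy.vectorize would not change that, since it is essentially a Python loop as well.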
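
As a rough sketch of the same idea, the scores can be collected in one pass and the label comparison done with np.select, which is the only step here that really vectorizes. StubAnalyzer and add_sentiment_columns are hypothetical names made up for this example; the stub only imitates the shape of the polarity_scores output so the snippet runs without a lexicon, and you would pass your own SentimentIntensityAnalyzer instance instead.

```python
import numpy as np
import pandas as pd


class StubAnalyzer:
    """Hypothetical stand-in for SentimentIntensityAnalyzer (not a real scorer)."""

    def polarity_scores(self, text):
        words = text.lower().split()
        n = max(len(words), 1)
        pos = sum(w in {"good", "great"} for w in words) / n
        neg = sum(w in {"bad", "awful"} for w in words) / n
        return {"neg": neg, "neu": 1.0 - pos - neg, "pos": pos, "compound": pos - neg}


def add_sentiment_columns(df, analyzer, text_col="full_text"):
    # One pass over the texts; scoring is still one Python call per row.
    scores = pd.DataFrame(
        [analyzer.polarity_scores(t) for t in df[text_col]],
        index=df.index,
    )
    out = df.join(scores[["neg", "neu", "pos", "compound"]])
    # The label comparison is done on whole columns at once.
    out["sentiment"] = np.select(
        [out["neg"] > out["pos"], out["pos"] > out["neg"]],
        ["negative", "positive"],
        default="neutral",
    )
    return out


tw_list = pd.DataFrame(
    {"full_text": ["great good day", "awful bad service", "plain text"]}
)
result = add_sentiment_columns(tw_list, StubAnalyzer())
```

The function returns a new dataframe rather than modifying tw_list in place, so calling it again on the result would fail on the overlapping column names.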

Solution 1
0 2021-05-20 14:35:23
