How do I iterate over a DataFrame and classify text as positive or negative?

Question

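I currently have a pandas DataFrame containing tokenized tweets.

I need to go through each tweet and determine whether it is positive or negative, so that I can add a subsequent column containing the word positive or negative.

Sample data: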

tokenized_tweets = ['football, was, good, we, played, well', 'We, were, unlucky, today, bad, luck', 'terrible, performance, bad, game']

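I need to be able to run a loop over tokenized_tweets and work out whether each tweet is positive or negative.

For this example, the positive and negative words are as follows: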

Positive_words = ['good', 'great']
Negative_words = ['terrible', 'bad']

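The desired output is a DataFrame containing the tweet, the number of positive words in each tweet, the number of negative words in each tweet, and whether the tweet is positive, negative, or neutral.

A tweet should be classified as positive, negative, or neutral according to whether it contains more positive or more negative words.

Desired output: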

Tokenized tweet                          positive words   negative words   overall
football, was, good, we, played, well    1                0                positive
We, were, unlucky, today, bad, luck      0                1                negative
terrible, performance, bad, game         0                2                negative

Answer 1

import pandas as pd
import numpy as np

df = pd.DataFrame({'tokenized_tweets': ['football, was, good, we, played, well', 'We, were, unlucky, today, bad, luck', 'terrible, performance, bad, game']})

Positive_words = ['good', 'great']
Negative_words = ['terrible', 'bad']

# str.count treats the joined string as a regex ('good|great'),
# so it counts every occurrence of any listed word in each tweet.
# Note that it also matches substrings and is case-sensitive.
df['positive words'] = df['tokenized_tweets'].str.count('|'.join(Positive_words))
df['negative words'] = df['tokenized_tweets'].str.count('|'.join(Negative_words))

# One boolean Series per label, in the same order as choices
conditions = [
    (df['positive words'] > df['negative words']),
    (df['negative words'] > df['positive words']),
    (df['negative words'] == df['positive words'])
]

choices = [
    'positive',
    'negative',
    'neutral'
]

# np.select picks, for each row, the choice of the first condition that is True
df['overall'] = np.select(conditions, choices, default = '')

df

Out:

    tokenized_tweets                        positive words   negative words   overall
0   football, was, good, we, played, well   1                0                positive
1   We, were, unlucky, today, bad, luck     0                1                negative
2   terrible, performance, bad, game        0                2                negative
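
One caveat with counting via a plain 'good|great' pattern is that it matches substrings (for example 'bad' inside 'badge') and is case-sensitive. If that matters for your data, a possible variation is sketched below. It is only a sketch under some assumptions: the helper name whole_word_pattern, the lowercase variable names, and the extra fourth tweet are made up for illustration and are not part of the question or the answer above.

```python
import re

import numpy as np
import pandas as pd

df = pd.DataFrame({'tokenized_tweets': [
    'football, was, good, we, played, well',
    'We, were, unlucky, today, bad, luck',
    'terrible, performance, bad, game',
    'Goodness, what a badge',  # hypothetical tweet: contains no whole sentiment word
]})

positive_words = ['good', 'great']
negative_words = ['terrible', 'bad']


def whole_word_pattern(words):
    """Hypothetical helper: build a regex that matches any of the words as a whole word."""
    # re.escape protects against regex metacharacters in the word list;
    # \b on both sides stops 'bad' from matching inside 'badge'.
    return r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'


# flags=re.IGNORECASE makes 'Good' and 'good' count the same
df['positive words'] = df['tokenized_tweets'].str.count(
    whole_word_pattern(positive_words), flags=re.IGNORECASE)
df['negative words'] = df['tokenized_tweets'].str.count(
    whole_word_pattern(negative_words), flags=re.IGNORECASE)

# Sign of the difference decides the label; ties fall through to the default
diff = df['positive words'] - df['negative words']
df['overall'] = np.select([diff > 0, diff < 0],
                          ['positive', 'negative'],
                          default='neutral')

print(df)
```

With the three tweets from the question this gives the same counts and labels as the output above; the added fourth tweet gets 0 and 0 and is labelled neutral, whereas the plain substring pattern would count the 'bad' in 'badge' as a negative word.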

Solution 1
0 2018-04-18 20:41:20