Plotting sentiment analysis over time in Python

I am trying to plot the results of my sentiment analysis over time. The data consists of comments from a forum. A sample of it looks something like this:

Timestamp            Sentiment
2021-01-28 21:37:41  Positive
2021-01-28 21:32:10  Negative
2021-01-29 21:30:35  Positive
2021-01-29 21:28:57  Neutral
2021-01-29 21:26:56  Negative

I would like to plot a line graph with only the date from the timestamp on the x-axis, and a separate line for the value counts of each category in the "Sentiment" column. That makes 3 lines in total, one for each sentiment (positive, negative and neutral), with the y-axis representing the count. I think I need to use groupby() somehow, but I cannot figure out how.

My solution is a bit convoluted, and you will probably want to refine the graph afterwards to fit what you need (for example, a stacked bar chart).

First, let's extract the date from the timestamp column of your dataframe.

import pandas as pd
import matplotlib.pyplot as plt

# Sample data matching the question
example = {'Timestamp':
           ['2021-01-28 21:37:41', '2021-01-28 21:32:10', '2021-01-29 21:30:35',
            '2021-01-29 21:28:57', '2021-01-29 21:26:56'],
           'Sentiment':
           ['Positive', 'Negative', 'Positive', 'Neutral', 'Negative']}
df = pd.DataFrame(example)

# Parse the strings as datetimes, then keep only the calendar date
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df['Date'] = df['Timestamp'].dt.date
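
As a side note, .dt.date stores plain Python date objects, so the new column is no longer a pandas datetime column. If you would rather keep a datetime dtype (for example to resample later), one possible alternative is .dt.normalize(), which sets the time part to midnight. This is only a sketch on a shortened copy of the sample data, and the variable names as_objects and as_datetimes are made up for illustration:

```python
import pandas as pd

# Shortened copy of the sample data from the question
df = pd.DataFrame({
    "Timestamp": ["2021-01-28 21:37:41", "2021-01-28 21:32:10", "2021-01-29 21:30:35"],
    "Sentiment": ["Positive", "Negative", "Positive"],
})
df["Timestamp"] = pd.to_datetime(df["Timestamp"])

# .dt.date returns Python datetime.date objects
as_objects = df["Timestamp"].dt.date

# .dt.normalize() keeps a datetime column but resets the time to 00:00:00
as_datetimes = df["Timestamp"].dt.normalize()

print(as_objects.dtype)
print(as_datetimes.dtype)
```

Either column works as a groupby key for the next step.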

Then, let's group by the date and count how often each sentiment occurs on each day.

grouped = df.groupby(by='Date')['Sentiment'].value_counts()

Output:

Date        Sentiment
2021-01-28  Negative     1
            Positive     1
2021-01-29  Negative     1
            Neutral      1
            Positive     1
Name: Sentiment, dtype: int64

This is a Series with a MultiIndex. To get it into a more convenient format, we can unstack the second index level so that each sentiment becomes its own column.

unstacked = grouped.unstack(level=1)
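
One thing to watch for: a sentiment that never occurs on a given day (Neutral on 2021-01-28 in the sample) ends up as NaN after unstacking. A rough sketch of two ways around that, assuming you want those gaps to count as zero: pass fill_value=0 to unstack, or build the table in one step with pd.crosstab. The names with_gaps, filled and counts are just for illustration, and the dates are kept as strings here to keep the example short:

```python
import pandas as pd

# Same sample as above, with the date already extracted (kept as strings here)
df = pd.DataFrame({
    "Date": ["2021-01-28", "2021-01-28", "2021-01-29", "2021-01-29", "2021-01-29"],
    "Sentiment": ["Positive", "Negative", "Positive", "Neutral", "Negative"],
})

grouped = df.groupby("Date")["Sentiment"].value_counts()

# Plain unstack: missing date/sentiment combinations become NaN
with_gaps = grouped.unstack(level=1)

# fill_value=0 replaces those gaps with zero counts
filled = grouped.unstack(level=1, fill_value=0)

# crosstab builds the same date-by-sentiment count table directly
counts = pd.crosstab(df["Date"], df["Sentiment"])

print(filled)
```

Zeros instead of NaN matter mostly for line plots, where a NaN would leave a gap in the line.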

Then we can plot the object directly with unstacked.plot.bar(). This is the result:

[image: output]
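
Since the question asked for a line graph with one line per sentiment, here is a minimal sketch of that variant. It assumes missing combinations should count as zero, and the axis labels, the marker style and the output file name sentiment_over_time.png are my own choices, not something from the question:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook or GUI session
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Timestamp": ["2021-01-28 21:37:41", "2021-01-28 21:32:10", "2021-01-29 21:30:35",
                  "2021-01-29 21:28:57", "2021-01-29 21:26:56"],
    "Sentiment": ["Positive", "Negative", "Positive", "Neutral", "Negative"],
})
df["Date"] = pd.to_datetime(df["Timestamp"]).dt.date

# One row per date, one column per sentiment, zeros where a sentiment is absent
counts = df.groupby("Date")["Sentiment"].value_counts().unstack(level=1, fill_value=0)

# DataFrame.plot() draws one line per column by default
ax = counts.plot(marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Number of comments")
ax.legend(title="Sentiment")

plt.tight_layout()
plt.savefig("sentiment_over_time.png")  # hypothetical file name; use plt.show() interactively
```

With only two dates in the sample each line is a single segment; on the real data you get one point per day for each sentiment.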
