In Python Pandas, I have a data frame with columns and records in the following format:
text          source  senti
-------------------------------
great food    site1    0.6
awful staff   site4   -0.4
good chef     site8    0.4
average food  site6    0.05
bad food      site2   -0.8
The text column is essentially a description or opinion of something. I want to summarize the sentiment across the data set, with output like this:
sentiment  count
----------------
positive   2
neutral    1
negative   2
That is, a count of 'senti' values grouped as positive, negative or neutral.
A value is counted in a group when it meets the following conditions:
positive: senti > 0.1
neutral: -0.1 <= senti <= 0.1
negative: senti < -0.1
Big thanks in advance
I'd use pd.cut + groupby:
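import numpy as np
import pandas as pd

# Bins are (-inf, -0.1], (-0.1, 0.1], (0.1, inf); labels follow the same ascending order.
cut = pd.cut(
    df.senti,
    [-np.inf, -.1, .1, np.inf],
    labels=['negative', 'neutral', 'positive']
)
df.groupby(cut).senti.count().reset_index(name='count')

      senti  count
0  negative      2
1   neutral      1
2  positive      2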
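For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question; the variable names `labels` and `counts`, the `observed=False` argument, and the rename to a `sentiment` column are my own choices rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "text": ["great food", "awful staff", "good chef", "average food", "bad food"],
    "source": ["site1", "site4", "site8", "site6", "site2"],
    "senti": [0.6, -0.4, 0.4, 0.05, -0.8],
})

# pd.cut bins are right-inclusive by default:
# (-inf, -0.1], (-0.1, 0.1], (0.1, inf).
# Labels are listed in the same ascending order as the bins.
labels = pd.cut(
    df["senti"],
    bins=[-np.inf, -0.1, 0.1, np.inf],
    labels=["negative", "neutral", "positive"],
)

# observed=False keeps every category in the result, even an empty one.
counts = (
    df.groupby(labels, observed=False)["senti"]
    .count()
    .reset_index(name="count")
    .rename(columns={"senti": "sentiment"})
)
print(counts)
```

Note that with the default right-inclusive bins a value of exactly -0.1 lands in the negative group and exactly 0.1 lands in neutral; pass `right=False` to `pd.cut` if the edges should fall the other way.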
As pointed out by @root, pd.value_counts gives the same solution on the cut variable.
pd.value_counts(cut, sort=False).rename_axis('senti').reset_index(name='count')
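Recent pandas releases deprecate the top-level `pd.value_counts` function in favor of the `Series.value_counts` method, so the same idea can be sketched with the method form. This is only an illustration: the `senti` series is rebuilt from the question's sample values, and the `sentiment` column name and `result` variable are my own choices.

```python
import numpy as np
import pandas as pd

# Sentiment scores copied from the question's sample rows.
senti = pd.Series([0.6, -0.4, 0.4, 0.05, -0.8], name="senti")

cut = pd.cut(
    senti,
    [-np.inf, -0.1, 0.1, np.inf],
    labels=["negative", "neutral", "positive"],
)

# sort=False skips the default sort by frequency.
result = (
    cut.value_counts(sort=False)
    .rename_axis("sentiment")
    .reset_index(name="count")
)
print(result)
```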
Another version, using apply to map values to groups:
df.groupby(df['senti'].apply(lambda x: 'negative' if x < -0.1 else 'positive' if x > 0.1 else 'neutral'))['senti'].count()
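The one-liner above can be easier to read with the mapping pulled out into a named function. The following is a sketch under my own assumptions: `to_sentiment` and its `threshold` parameter are hypothetical names, and the `reindex` step is added only to force the positive, neutral, negative row order shown in the question.

```python
import pandas as pd

# Sentiment scores copied from the question's sample rows.
df = pd.DataFrame({"senti": [0.6, -0.4, 0.4, 0.05, -0.8]})


def to_sentiment(x, threshold=0.1):
    """Map a score to a label using the same strict comparisons as the lambda."""
    if x < -threshold:
        return "negative"
    if x > threshold:
        return "positive"
    return "neutral"


groups = df["senti"].apply(to_sentiment)

counts = (
    df.groupby(groups)["senti"]
    .count()
    # Fix the row order and show 0 for any label with no rows.
    .reindex(["positive", "neutral", "negative"], fill_value=0)
    .rename_axis("sentiment")
    .reset_index(name="count")
)
print(counts)
```

Because the comparisons are strict, a score of exactly -0.1 or 0.1 is counted as neutral here, whereas the pd.cut answer with its default right-inclusive bins puts exactly -0.1 in the negative group.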