
Sentiment analysis using Python TextBlob on Excel file data

[image] I need to run sentiment analysis on two columns of an Excel file and then write the resulting polarity values into two other columns that already exist in the same input file. So far I have only managed to calculate the polarity of a single text sentence. I need suggestions for calculating the polarity of every value in a column of the Excel file. I am using pandas for the Excel processing.

from textblob import TextBlob
import pandas as pd

Input_file = 'filepath'
df = pd.read_excel(Input_file, sheet_name='Sheet1')
# Columns are read from the DataFrame (df), not from the pandas module (pd)
col1 = df['video_title'].tolist()
# col2 = df['description'].tolist()
# TextBlob expects a single string, so this scores only the first title
blob = TextBlob(col1[0])
# blob1 = TextBlob(col2[0])
polarity_score = blob.sentiment.polarity
polarity_rounded = round(polarity_score, 6)
print(polarity_rounded)

As shown in the image above, I need to replace the 'None' values in the 'title_sentiment' column with the calculated polarity values. Likewise, I need to update the 'description_sentiment' column with its calculated polarity values.

Desired output: [image]

Let's black-box your sentiment analysis and reduce your problem to:

I have a dataframe with a text column that I want to apply a function to, and then store the result as a new numeric column in the correct row.

Stealing this person's example dataframe with a text column to get started:

In [1]: import pandas as pd 
    ...:  
    ...: df = pd.DataFrame({ 
    ...:     'title': ['foo','bar','baz','baz','foo','bar'], 
    ...:     'contents':[ 
    ...:         'Lorem ipsum dolor sit amet.', 
    ...:         'Lorem ipsum dolor sit amet.', 
    ...:         'Lorem ipsum dolor sit amet.', 
    ...:         'Consectetur adipiscing elit.', 
    ...:         'Lorem ipsum dolor sit amet.', 
    ...:         'Lorem ipsum dolor sit amet.' 
    ...:     ], 
    ...:     'year':[2010,2011,2000,2005,2010,2011] 
    ...: }) 
    ...:  
    ...: df                                                                                                                                                   
Out[1]: 
  title                      contents  year
0   foo   Lorem ipsum dolor sit amet.  2010
1   bar   Lorem ipsum dolor sit amet.  2011
2   baz   Lorem ipsum dolor sit amet.  2000
3   baz  Consectetur adipiscing elit.  2005
4   foo   Lorem ipsum dolor sit amet.  2010
5   bar   Lorem ipsum dolor sit amet.  2011

Now we want to define a function to apply to "contents" and store the result in a new column. For this, we can use pd.Series.apply():

In [2]: def sentiment_function(text): 
    ...:     # Put all your fancy sentiment stuff here; I will just use `len` as a dummy function. 
    ...:     return len(text) 
    ...:      
    ...: df['sentiment_score'] = df['contents'].apply(sentiment_function) 
    ...: df                                                                                                                                                   
Out[2]: 
  title                      contents  year  sentiment_score
0   foo   Lorem ipsum dolor sit amet.  2010               27
1   bar   Lorem ipsum dolor sit amet.  2011               27
2   baz   Lorem ipsum dolor sit amet.  2000               27
3   baz  Consectetur adipiscing elit.  2005               28
4   foo   Lorem ipsum dolor sit amet.  2010               27
5   bar   Lorem ipsum dolor sit amet.  2011               27

You can do this for both of your columns, title_sentiment and description_sentiment.
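To tie this back to the question, here is a minimal sketch of the same apply() pattern with TextBlob as the scoring function. The column names (video_title, description, title_sentiment, description_sentiment) and the sheet name come from the question. The helper names polarity, add_sentiment_columns, COLUMN_MAP and update_excel are made up for illustration, and the handling of empty cells is an assumption about what the sheet contains.

```python
import pandas as pd


def polarity(text):
    """Return TextBlob polarity (between -1.0 and 1.0) rounded to 6 places."""
    # Assumption: empty Excel cells arrive as NaN (a float), so anything that
    # is not a non-empty string is given NaN instead of a score.
    if not isinstance(text, str) or not text.strip():
        return float("nan")
    # Imported here so add_sentiment_columns can also be used with another scorer.
    from textblob import TextBlob
    return round(TextBlob(text).sentiment.polarity, 6)


def add_sentiment_columns(df, column_map, scorer=polarity):
    """Return a copy of df where each target column holds scorer(source cell)."""
    result = df.copy()
    for source_col, target_col in column_map.items():
        # apply() calls the scorer once per row and keeps the row alignment
        result[target_col] = result[source_col].apply(scorer)
    return result


# Source column -> column that receives the polarity (names from the question)
COLUMN_MAP = {
    "video_title": "title_sentiment",
    "description": "description_sentiment",
}


def update_excel(input_path, output_path, sheet_name="Sheet1"):
    """Hypothetical round trip: read the sheet, score both columns, write it out."""
    df = pd.read_excel(input_path, sheet_name=sheet_name)
    scored = add_sentiment_columns(df, COLUMN_MAP)
    scored.to_excel(output_path, sheet_name=sheet_name, index=False)
    return scored
```

Calling update_excel('filepath', 'filepath') would overwrite the input file in place; writing to a separate output path first is the safer choice while checking the results.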


 