I have the following DataFrame:
Date Label Top1 \
0 2008-08-08 0 b"Georgia 'downs two Russian warplanes' as cou...
1 2008-08-11 1 b'Why wont America and Nato help us? If they w...
2 2008-08-12 0 b'Remember that adorable 9-year-old who sang a...
3 2008-08-13 0 b' U.S. refuses Israel weapons to attack Iran:...
4 2008-08-14 1 b'All the experts admit that we should legalis...
Top2 \
0 b'BREAKING: Musharraf to be impeached.'
1 b'Bush puts foot down on Georgian conflict'
2 b"Russia 'ends Georgia operation'"
3 b"When the president ordered to attack Tskhinv...
4 b'War in South Osetia - 89 pictures made by a ...
Top3 \
0 b'Russia Today: Columns of troops roll into So...
1 b"Jewish Georgian minister: Thanks to Israeli ...
2 b'"If we had no sexual harassment we would hav...
3 b' Israel clears troops who killed Reuters cam...
4 b'Swedish wrestler Ara Abrahamian throws away ...
Top4 \
0 b'Russian tanks are moving towards the capital...
1 b'Georgian army flees in disarray as Russians ...
2 b"Al-Qa'eda is losing support in Iraq because ...
3 b'Britain\'s policy of being tough on drugs is...
4 b'Russia exaggerated the death toll in South O...
Top5 \
0 b"Afghan children raped with 'impunity,' U.N. ...
1 b"Olympic opening ceremony fireworks 'faked'"
2 b'Ceasefire in Georgia: Putin Outmaneuvers the...
3 b'Body of 14 year old found in trunk; Latest (...
4 b'Missile That Killed 9 Inside Pakistan May Ha...
Top25 VIX Open VIX High \
0 b"No Help for Mexico's Kidnapping Surge" 21.15 21.69
1 b"So this is what it's come to: trading sex fo... 20.66 20.96
2 b"BBC NEWS | Asia-Pacific | Extinction 'by man... 20.64 21.51
3 b'2006: Nobel laureate Aleksander Solzhenitsyn... 21.57 22.11
4 b'Philippines : Peace Advocate say Muslims nee... 22.30 22.30
The columns Top1 through Top25 are news articles. I want to perform sentiment analysis on every date's articles and compute the mean of those scores. Is there an efficient way to check whether a column name contains the word Top, calculate the score for it, and create a column holding the mean for every date?
What I tried so far:
def scorer(row, col):
    date_scores = []
    if col.contains('Top'):
        date_scores.append(get_sentiment_score(row[col]))
    else:
        pass
    sentiment_daily_mean = np.mean()
    return sentiment_daily_mean

df['date_score'] = df.apply(lambda x: scorer(x), args=list(df.columns))
But this won't work, since I'm passing all the columns to the function at once.
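You need to pass the rows to apply by setting axis=1, and loop over the column labels inside the function. Try this:
import numpy as np

def scorer(row):
    date_scores = []
    # row.index holds the column labels; iterating over row itself yields the cell values
    for col in row.index:
        if 'Top' in col:
            date_scores.append(get_sentiment_score(row[col]))
    # date_scores is a plain list, so use np.mean instead of a .mean() method
    sentiment_daily_mean = np.mean(date_scores)
    return sentiment_daily_mean

df['date_score'] = df.apply(scorer, axis=1)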
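As a variation on the same idea, the Top columns can be selected once by name and scored as a block, which avoids testing every label for every row. The sketch below is only illustrative: the `get_sentiment_score` defined here is a hypothetical stand-in for your own scorer, and the two-row DataFrame is made-up sample data, not the real dataset.

```python
import pandas as pd

def get_sentiment_score(text):
    # Hypothetical stand-in for the real scorer: +1 per "good", -1 per "bad".
    words = str(text).lower().split()
    return words.count("good") - words.count("bad")

# Made-up sample data with the same column naming pattern as the question.
df = pd.DataFrame({
    "Date": ["2008-08-08", "2008-08-11"],
    "Label": [0, 1],
    "Top1": ["good news today", "bad bad outcome"],
    "Top2": ["bad weather", "good recovery"],
    "VIX Open": [21.15, 20.66],
})

# Pick the headline columns by their name prefix.
top_cols = [c for c in df.columns if c.startswith("Top")]

# Score every cell in those columns, then average across columns per row.
scores = df[top_cols].apply(lambda s: s.map(get_sentiment_score)).astype(float)
df["date_score"] = scores.mean(axis=1)

print(df[["Date", "date_score"]])
```

Each row already corresponds to one date here, so a row-wise mean is enough. If the same date could appear on several rows, you would need a groupby on Date afterwards.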