Count occurrences of strings in a dataframe
With R, I can easily create a data frame containing the frequencies of certain string patterns in a list of strings.
library(stringr)
library(tm)
library(dplyr)
text = c('i am so hhappy happy now','you look ssad','sad day today','noway')
dat = sapply(c('happy', 'sad'), function(i) str_count(text, i))
dat = data.frame(dat)
dat = dat %>% mutate(Sentiment = (happy)-(sad))
As a result, I get a data frame like this:
  happy sad Sentiment
1     2   0         2
2     0   1        -1
3     0   1        -1
4     0   0         0
In Python, I can work out the rest of the code, except for the equivalent of sapply():
import pandas as pd
text = ['i am so hhappy happy now','you look ssad','sad day today','noway']
????
dat = pd.DataFrame(dat)
dat['Sentiment'] = dat.apply(lambda c: c.happy - c.sad, axis=1)  # axis=1 applies the lambda row by row
What should ???? be?
You can use pd.Series.str.count, which counts the non-overlapping matches of a pattern in each string of the Series:
import pandas as pd
text = ['i am so hhappy happy now','you look ssad','sad day today','noway']
df = pd.DataFrame({'text' : text})
df['happy'] = df.text.str.count('happy')
df['sad'] = df.text.str.count('sad')
df['Sentiment'] = df.happy - df.sad
df
                       text  happy  sad  Sentiment
0  i am so hhappy happy now      2    0          2
1             you look ssad      0    1         -1
2             sad day today      0    1         -1
3                     noway      0    0          0
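To mirror the R sapply() call more directly, one option is to build one column per pattern with a dict comprehension. This is only a sketch, not the only way to do it: the names `patterns` and `s` are introduced here for illustration, and `re.escape` is added on the assumption that the patterns should be matched literally, since str.count interprets its argument as a regular expression.

```python
import re
import pandas as pd

text = ['i am so hhappy happy now', 'you look ssad', 'sad day today', 'noway']
patterns = ['happy', 'sad']  # hypothetical name; same patterns as in the question

s = pd.Series(text)

# One column per pattern, analogous to sapply(c('happy', 'sad'), ...).
# re.escape keeps each pattern literal, because str.count expects a regex.
dat = pd.DataFrame({p: s.str.count(re.escape(p)) for p in patterns})

# Vectorized column arithmetic; no apply() is needed here.
dat['Sentiment'] = dat['happy'] - dat['sad']
print(dat)
```

As in the R version, substrings are counted, so 'hhappy' contributes one match for 'happy'.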