I have a data frame of stock information, to which I have added a sentiment classification (positive, negative, or not determined) based on each company's name. The head of the data is:
head(companyReturnsNameScore)
#----------
PERMNO date EXCHCD SICCD TICKER PRC VOL RET SHROUT companyNameSentiment companyName
1 85814 19980831 3 5960 CTAC 6.1875 27989 -0.489691 6431 Not Determined 1 800 CONTACTS INC
2 85814 20021231 3 5960 CTAC 27.5700 97498 1.177725 11388 Not Determined 1 800 CONTACTS INC
3 85814 19990129 3 5960 CTAC 14.7500 5658 -0.180556 6275 Not Determined 1 800 CONTACTS INC
4 85814 20021031 3 5960 CTAC 9.0300 20192 -0.097000 11382 Not Determined 1 800 CONTACTS INC
5 85814 20021129 3 5960 CTAC 12.6600 15474 0.401993 12082 Not Determined 1 800 CONTACTS INC
6 85814 20070731 3 5961 CTAC 23.2400 5574 -0.009378 13619 Not Determined 1 800 CONTACTS INC
marketCap marketCapDeclile
1 39791.81 2
2 313967.16 6
3 92556.25 4
4 102779.46 4
5 152958.12 5
6 316505.56 6
I am trying to run a statistical analysis by decile rank of market cap (the marketCapDeclile column), and within each decile rank I want a further by-group analysis for each sentiment level. That is, for each decile rank, I want to see summary output for each of the factor levels "positive, negative, not determined." When I enter what I think is the correct command for a list of factors,
by( companyReturnsNameScore$RET, c(companyReturnsNameScore$marketCapDeclile,
companyReturnsNameScore$companyNameSentiment), summary)
I unfortunately get the following error:
Error in tapply(seq_len(1785812L), list(`c(companyReturnsNameScore$marketCapDeclile, companyReturnsNameScore$companyNameSentiment)` = c(2L,
: arguments must have same length
I have 10 factor levels for the market cap decile and three for the sentiment classification, so I want 30 analyses in total. The problem is that I cannot get this factor-within-factor analysis to run.
What am I doing incorrectly? How can I perform a factor within factor analysis?
Your second argument concatenates two vectors, so the result is twice as long as the first argument:
length( c( factor(1:5), factor(6:10) ) )
[1] 10
You have (at least) two choices: either use a list (the help page ?by says that INDICES should be a factor or a list of factors), or use the interaction function, which returns a single factor the length of the longest input:
# 1
by( companyReturnsNameScore$RET,
list( companyReturnsNameScore$marketCapDeclile,
companyReturnsNameScore$companyNameSentiment),
summary)
# 2
by( companyReturnsNameScore$RET,
interaction( companyReturnsNameScore$marketCapDeclile,
companyReturnsNameScore$companyNameSentiment),
summary)
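To see the difference on something small, here is a minimal sketch using made-up data. The data frame toy and its columns ret, decile and sentiment are hypothetical stand-ins for the original columns, with 2 deciles instead of 10 to keep the output short:

```r
# Toy data: hypothetical names and random values, not the original data set
set.seed(1)
toy <- data.frame(
  ret       = rnorm(12),
  decile    = factor(rep(1:2, each = 6)),
  sentiment = factor(rep(c("Negative", "Not Determined", "Positive"), times = 4))
)

# c() joins the two grouping vectors end to end,
# so the result has 24 elements while ret has only 12
length(c(toy$decile, toy$sentiment))

# A list keeps the two factors separate;
# by() then returns a 2 x 3 array with one summary per cell
res_list <- by(toy$ret,
               list(decile = toy$decile, sentiment = toy$sentiment),
               summary)
dim(res_list)

# interaction() builds one combined factor with 2 * 3 = 6 levels,
# so by() returns a flat set of 6 summaries
res_int <- by(toy$ret,
              interaction(toy$decile, toy$sentiment),
              summary)
length(res_int)
```

The list form keeps the result indexed by both factors, which is convenient if you want to pull out one decile or one sentiment level later; the interaction form gives a single flat set of groups labelled like 1.Negative.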