
Unused Argument error in R using tm for word frequency matrix?

I'm new to programming and R. I'm trying to use the wordfish function in the Austin package. I created a term document matrix from a corpus but cannot successfully use the wordfish command:

    library(tm)
    library(austin)
    text.corpus.format<-VCorpus(DirSource("MyDirectory"))

    #create Word Frequency Matrix
    wordfreqmatrix<-TermDocumentMatrix(text.corpus.format)
    wcdata<-as.matrix(wordfreqmatrix) # CONVERT WORD COUNT MATRIX FOR USE WITH WORDFISH
    wcdata<-t(wcdata) # TRANSPOSE TERM DOC MATRIX
    as.matrix(as.data.frame(wcdata)) # ASSIGN DOC TITLES TO MATRIX
    rownames(wcdata)<-lapply(text.corpus.format,Author)

    #problematic command results:
    wordfish(input=wcdata,dir=c(221,223))
    Error in wordfish(input = wcdata, dir = c(221, 223)) : 
    unused argument (input = wcdata)

The correct usage for the wordfish function is wordfish(wfm,dir=c(1,10)). I thought I defined wcdata as a word frequency matrix, but I must have done something wrong. Any insight is greatly appreciated!

The problem is that the different implementations of wordfish do not share the same argument names. As listed at http://www.wordfish.org/software.html, there is the "original" version and a version implemented in the AUSTIN package. The "original" version has a parameter named input=, whereas the AUSTIN implementation uses a parameter named wfm=.

If you had not named the argument and had simply passed the matrix as the first thing in the call, it would have worked as well, because R matches unnamed arguments by position. Once you name an argument, though, R matches it by name instead, and a name that matches none of the function's parameters produces the "unused argument" error.

So either take off the name, or use the correct name for the AUSTIN package (wfm=).
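The matching behaviour can be reproduced in base R without installing anything. In the sketch below, fake_wordfish is a hypothetical stand-in that only borrows the wfm and dir argument names from the usage quoted in the question; it fits no model and is not the AUSTIN function.

```r
# Hypothetical stand-in: same leading argument names as the usage quoted in
# the question, but it only reports what it received and fits no model.
fake_wordfish <- function(wfm, dir = c(1, 10)) {
  list(n_docs = ncol(wfm), dir = dir)
}

# A tiny made-up count matrix: 3 words in rows, 2 documents in columns.
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(c("a", "b", "c"), c("D1", "D2")))

# Unnamed first argument: matched to `wfm` by position.
by_position <- fake_wordfish(counts, dir = c(1, 2))

# Correct name: matched to `wfm` by name, same result.
by_name <- fake_wordfish(wfm = counts, dir = c(1, 2))

# Wrong name: `input` matches no parameter, so the call fails.
# tryCatch() captures the error message instead of stopping the script.
wrong_name <- tryCatch(
  fake_wordfish(input = counts, dir = c(1, 2)),
  error = function(e) conditionMessage(e)
)
print(wrong_name)
```

The first two calls return identical results, while the third reports an unused argument, which is the same kind of failure as in the question.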

Also, the package looks for particular names on the object passed in. You can make sure you are passing a wfm object by running your data through the wfm function. I'm not sure what the 'dir' parameter is for, but I had to set it as well to get this minimal example running.

    library(tm)
    library(austin)

    # two small named documents
    docs <- c(D1 = "look at all the words in the document", 
        D2 = "i hope this document has more words than the other document")
    text.corpus.format <- Corpus(VectorSource(docs))

    # terms in rows, documents in columns
    wordfreqmatrix <- TermDocumentMatrix(text.corpus.format)
    # convert the plain matrix to a wfm object
    wcdata <- wfm(as.matrix(wordfreqmatrix))

    wordfish(wcdata, dir=c(1,2))

    # Call:
    #   wordfish(wfm = wfm(wcdata), dir = c(1, 2))
    # 
    # Document Positions:
    #   Estimate Std. Error    Lower    Upper
    # 1  -1.0378     0.4832 -1.98476 -0.09078
    # 2   0.8763     0.4322  0.02917  1.72351
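To make the "particular names" point more concrete, here is a rough base R sketch of a word count matrix for the same two documents, with words in rows and documents in columns. The dimension labels "words" and "docs" are an assumption about what wfm() attaches, so check names(dimnames(...)) on a real wfm object before relying on them. The counts will also differ from the tm result, since this sketch only splits on spaces and applies none of tm's filtering.

```r
# The same two toy documents as in the example above.
docs <- c(D1 = "look at all the words in the document",
          D2 = "i hope this document has more words than the other document")

# Split each document on spaces and count words over a shared vocabulary,
# so every document gets a count (possibly 0) for every word.
tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))
counts <- sapply(tokens, function(tok) table(factor(tok, levels = vocab)))

# Assumption: the labels "words" and "docs" mirror what austin's wfm()
# produces; they are set by hand here purely for illustration.
dimnames(counts) <- list(words = vocab, docs = names(docs))

# Counts of one word across both documents.
print(counts["document", ])
```

Inspecting names(dimnames(counts)) on such a matrix is a quick way to see whether the rows and columns are labelled the way a function expects.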
