Passing multiple arguments as a list in R

I want to pass a set of arguments to another command in R as a list, so that I do not have to repeat the same arguments every time.

This is the code that I have to run six times, once for the full_text column of each of the data frames t1 to t6.

library(quanteda)

t1t <- tokens(t1$full_text, what = 'word', remove_numbers = TRUE,
              remove_punct = TRUE,
              remove_symbols = TRUE,
              remove_separators = TRUE,
              remove_twitter = TRUE,
              remove_hyphens = TRUE,
              remove_url = TRUE)
t1t <- tokens_tolower(t1t)
t1t <- tokens_select(t1t, stopwords(), selection = "remove")
t1t <- unlist(t1t)
t1t <- unique(t1t)
t1t <- as.data.frame(t1t)
t1t <- as.data.frame.matrix(t1t)

Is there a way to specify these arguments only once?

As the error message says, tokens expects a character vector, a corpus, or a tokens object as input. You are passing a data frame to it. Pass the relevant text column instead.

tokens also works on vectors, so you can combine several columns into one character vector and pass them together.

library(quanteda)

tokens(c(t1$colname, t2$colname, t3$colname), what = "word", remove_numbers = TRUE,
       remove_punct = TRUE, remove_symbols = TRUE, remove_separators = TRUE,
       remove_twitter = TRUE, remove_hyphens = TRUE, remove_url = TRUE)
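The long argument list above can also be stored once in a named list and spliced into each call with base R's do.call(), which is one way to read "passing multiple arguments as a list". The sketch below is only an illustration: it uses gsub as a stand-in so that it runs without quanteda, and the names shared_args and clean_text are hypothetical.

```r
# Store the shared options once in a named list.
# gsub is only a stand-in here for any function with many named arguments.
shared_args <- list(pattern = "[[:punct:]]", replacement = "")

# Hypothetical helper: combine the varying argument (x) with the shared ones
# and let do.call() turn the list into an ordinary function call.
clean_text <- function(x, args = shared_args) {
  do.call(gsub, c(args, list(x = x)))
}

texts <- list(t1 = "Hello, world!", t2 = "up and down; left and right!")
cleaned <- lapply(texts, clean_text)

# Assumption, not run here: with quanteda loaded, the same pattern would be
#   tok_args <- list(what = "word", remove_numbers = TRUE, remove_punct = TRUE)
#   do.call(tokens, c(list(x = t1$full_text), tok_args))
```

Keeping the options in one list means a change to them is made in a single place, whichever data frame the call is applied to.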

Based on the update, and taking an example from the help page ?tokens:

t1 <- data.frame(full_text = "#textanalysis is MY <3 4U @myhandle gr8 #stuff :-)",
                 stringsAsFactors = FALSE)
t2 <- data.frame(full_text = c("This is $10 in 999 different ways,\n up and down; left and right!",
                               "@kenbenoit working: on #quanteda 2day\t4ever, http://textasdata.com?page=123."),
                 stringsAsFactors = FALSE)

We can wrap the steps in a function so that they can be applied to every data frame.

complete_function <- function(x) {
  # x: character vector of texts, e.g. t1$full_text
  t1t <- tokens(x, what = 'word', remove_numbers = TRUE,
                remove_punct = TRUE,
                remove_symbols = TRUE,
                remove_separators = TRUE,
                remove_twitter = TRUE,
                remove_hyphens = TRUE,
                remove_url = TRUE)
  t1t <- tokens_tolower(t1t)
  # drop stopwords
  t1t <- tokens_select(t1t, stopwords(), selection = "remove")
  # flatten to a single vector of distinct tokens
  t1t <- unlist(t1t)
  t1t <- unique(t1t)
  t1t <- as.data.frame(t1t)
  t1t <- as.data.frame.matrix(t1t)
  t1t  # return the result visibly
}

Then use mget to fetch the data frames t1, t2, t3, etc. by name, and apply the function to the full_text column of each. The trailing $ in the pattern keeps other objects whose names merely start with t and a digit, such as t1t, out of the match.

lapply(mget(ls(pattern = "^t\\d+$")), function(x) complete_function(x$full_text))

#$t1
#           t1t
#1 textanalysis
#2           4u
#3     myhandle
#4          gr8
#5        stuff

#$t2
#        t1t
#1 different
#2      ways
#3      left
#4     right
#5 kenbenoit
#6   working
#7  quanteda
#8      2day
#9     4ever
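To see what the ls() and mget() step collects on its own, here is a small base-R sketch. It uses a separate environment (env, a name chosen for this illustration) with toy data frames, so it does not depend on quanteda or on what happens to be in the workspace.

```r
# Toy environment standing in for the workspace.
env <- new.env()
assign("t1", data.frame(full_text = "a b", stringsAsFactors = FALSE), envir = env)
assign("t2", data.frame(full_text = c("c d", "e"), stringsAsFactors = FALSE), envir = env)
# A result object whose name also starts with "t" followed by a digit.
assign("t1t", data.frame(t1t = "a", stringsAsFactors = FALSE), envir = env)

# Without the trailing $, the pattern also picks up t1t.
loose  <- ls(envir = env, pattern = "^t\\d+")
# Anchoring the end of the name restricts the match to t1, t2, ...
strict <- ls(envir = env, pattern = "^t\\d+$")

# mget() returns a named list of the objects, ready for lapply().
frames <- mget(strict, envir = env)
n_rows <- vapply(frames, nrow, integer(1))
```

Because mget() returns a named list, the names t1, t2, ... carry through to the result of lapply(), which is why the output above is labelled $t1 and $t2.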
