(I'm not sure whether this is an R or a shell issue, so forgive me for adding both tags. If you think I should remove one, please comment and I'll do so.)
I have an Amazon-hosted version of R at rstudio.example.com. I have written two scripts, and they both run fine when I source them from within the RStudio interface.
When I ssh in to my scripts directory and run them from there, the scripts generate errors.
The purpose of the first script is to qdap::check_spelling of a column of text in a data frame, then get the frequency of that spelling error along with an example of the misspelt word:
library(tidyverse)
library(qdap)
# example data
exampledata <- data.frame(
  id = 1:5,
  text = c("cats dogs dgs cts oranges",
           "orngs orngs cats dgs",
           "bannanas, dogs",
           "cats cts dgs bnnanas",
           "ornges fruit")
)
# check for unique misspelt words using qdap
all.misspelts <- check_spelling(exampledata$text) %>% data.frame %>% select(row:not.found)
unique.misspelts <- unique(all.misspelts$not.found)
# for each misspelt word, get the first instance of it appearing for context/example of word in a sentence
contexts.misspellts.index <- lapply(unique.misspelts, function(x) {
  filter(all.misspelts, grepl(paste0("\\b", x, "\\b"), not.found))[1, "row"]
}) %>% unlist
# join it all together in a data frame to write to a csv
contexts.misspelts.vector <- exampledata[contexts.misspellts.index, "text"]
freq.misspelts <- table(all.misspelts$not.found) %>% data.frame() %>% mutate(Var1 = as.character(Var1))
misspelts.done <- data.frame(unique.misspelts, contexts.misspelts.vector, stringsAsFactors = FALSE) %>%
  left_join(freq.misspelts, by = c("unique.misspelts" = "Var1")) %>% arrange(desc(Freq))
write.csv(x = misspelts.done, file = "~/csvs/misspelts.example_data_done.csv", row.names = FALSE, quote = FALSE)
The final data frame looks like:
> print(misspelts.done)
  unique.misspelts contexts.misspelts.vector Freq
1              dgs cats dogs dgs cts oranges    3
2              cts cats dogs dgs cts oranges    2
3            orngs      orngs orngs cats dgs    2
4         bannanas            bannanas, dogs    1
5          bnnanas      cats cts dgs bnnanas    1
6           ornges              ornges fruit    1
When I run this on my cloud instance of RStudio, it runs with no issues and a CSV file is generated in the directory specified on the last line of code.
When I run it from the Linux shell, I get:
myname@ip-10-0-0-38:~$ r myscript.R
ident, sql
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
During startup - Warning message:
Setting LC_CTYPE failed, using "C"
Error in grepl(paste0("\\b", x, "\\b"), not.found) :
object 'not.found' not found
In addition: Warning message:
In data.matrix(data) : NAs introduced by coercion
myname@ip-11-0-0-28:~/rscripts$
Looks like a problem with my grepl() function. But it works fine when run within RStudio, just not when calling the script from the shell.
I'm also getting other errors in a separate script, based on a dplyr verb (filter).
If anyone recognizes this issue please help! If any more information is required please let me know and I'll add.
PS I tried running the script in my shell locally and it worked. Could this be an issue with my Amazon server?
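One way to narrow this down is to print, in each environment, what a bare call to filter would actually reach. The following is only a rough sketch using base R; `provider` is a hypothetical helper name, not something from the scripts above. Putting these lines right before the failing call and comparing the output from RStudio with the output from the ssh session should show whether the two sessions resolve `filter` to different packages, and what locale each one ends up with.

```r
# Hypothetical helper: name of the package whose function a bare call
# to `fn_name` would reach from the current search path.
provider <- function(fn_name) {
  environmentName(environment(get(fn_name, mode = "function")))
}

# Packages attached in this session, in lookup order.
cat("search path:", paste(search(), collapse = " "), "\n")

# Which package supplies filter right now (stats or dplyr, typically).
cat("filter comes from:", provider("filter"), "\n")

# Locale in effect; relevant to the LC_CTYPE startup warnings.
cat("LC_CTYPE:", Sys.getlocale("LC_CTYPE"), "\n")
```

If the shell run reports stats where RStudio reports dplyr, the error is a masking problem rather than anything specific to grepl().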
In the shell, you could try redirecting the script into R:
shell$ r < input.R > output.CSV
I am not sure whether this works with R. You can try!
Through trial and error I found that prefixing each function with its package name, e.g. dplyr::select(), solved this problem. I don't know why, but I wish I understood. This only had to be done when calling the script over ssh with r myscript.R.
In every other environment I tested this was not necessary, including a local terminal, a local RStudio instance and the hosted RStudio instance. None of those three needed the package prefix; only the call via ssh did.
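For reference, a minimal sketch of that workaround applied to the lookup step from the question, assuming dplyr is installed. Here `all.misspelts` is a hand-made stand-in for the check_spelling() output, and `first.row.of` is a hypothetical helper name. The error in the question ("object 'not.found' not found", plus the data.matrix warning) is consistent with the bare `filter` resolving to stats::filter() instead of dplyr's verb, so qualifying the call should make the result independent of the order in which packages are attached.

```r
# Stand-in for the data frame produced by check_spelling() in the question.
all.misspelts <- data.frame(
  row = c(1, 1, 2, 2, 3),
  not.found = c("dgs", "cts", "orngs", "dgs", "bannanas"),
  stringsAsFactors = FALSE
)

# Row number of the first text in which `word` was flagged.
# dplyr::filter is named explicitly so stats::filter can never be picked up.
first.row.of <- function(word) {
  hits <- dplyr::filter(all.misspelts,
                        grepl(paste0("\\b", word, "\\b"), not.found))
  hits[1, "row"]
}

contexts.index <- unlist(lapply(unique(all.misspelts$not.found), first.row.of))
print(contexts.index)
```

The same prefix applies to the other verbs in the script (dplyr::select, dplyr::mutate, dplyr::left_join, dplyr::arrange).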