
read.table from write.table in R

I'm trying to use qdap::multigsub to fix typos, misspelled names, variant expressions and other "aberrations" in a list of climatic event types. (Yes, it's NOAA's storm data set, from an assignment in a Coursera class on reproducible research. This cleanup is neither required nor expected in the assignment; it's just me trying my best!)

So I have events named "flash flood", "flash flooding", "flash floods" and the like, and I'd like to group them all in a level called "flash flood". So what I did first was:

expr <- c("^flash.*floo.*","thun.*")
repl <- c("flash flood","thunderstorm")

Each vector has 51 elements, and this is a knitr assignment, so to keep the code readable (margin column = 80) I had to build the vectors up with something like

expr <- c(expr,"new_expr_1","new_expr_2")
repl <- c(repl,"new_repl_1","new_repl_2") # repeated many, many times

This makes the code kind of messy. I already have the complete expr and repl vectors, so I would like to show each pair of corresponding values (expr and repl) on its own row, so that anyone reading the code has an easy time. That's why dput won't work here: it doesn't align each pair of values.

I tried this:

a <- data.frame(expr=expr,repl=repl)
print(a,row.names=FALSE)
  # copying the output, and then
b <- read.table(header=TRUE,text="paste_text_here")

but it failed (I think because print writes the output without quotation marks, and some expr or repl values contain two words). I also tried

write.table(a,row.names=FALSE)
  # copying the output, and then
b <- read.table(header=TRUE,text="paste_text_here")

but that doesn't work either (I think because write.table wraps each item in double quotes, and these clash with the double quotes around the text pasted into read.table).

I'd like to have in my Rmarkdown file something like this:

exprRepl <- read.table(header=TRUE,text="expr repl
                                         expr_1 repl_1
                                         expr_2 repl_2")

How can I achieve this from the data I have now?

The dput output for the first 5 rows of the data frame follows:

> dput(a[1:5,])
structure(list(expr = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("^BLIZZARD.*", 
"^FLASH.*FLOOD.*", "^HAIL.*", "^HEAVY.*RAIN.*", "^HURRICANE.*"
), class = "factor"), repl = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("BLIZZARD", 
"FLASH FLOOD", "HAIL", "HEAVY RAIN", "HURRICANE"), class = "factor")), .Names = c("expr", 
"repl"), row.names = c(NA, 5L), class = "data.frame")

If there's any other approach to replacing the wrong/variant names, I'd be very happy to hear about it and give it a try!

One solution is to use single quotes ' around the pasted text (this works as long as there are no ' characters in your data):

d <- structure(list(expr = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("^BLIZZARD.*", 
"^FLASH.*FLOOD.*", "^HAIL.*", "^HEAVY.*RAIN.*", "^HURRICANE.*"
), class = "factor"), repl = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("BLIZZARD", 
"FLASH FLOOD", "HAIL", "HEAVY RAIN", "HURRICANE"), class = "factor")), .Names = c("expr", 
"repl"), row.names = c(NA, 5L), class = "data.frame")

write.table(d, row.names=FALSE)

# copy and paste the output of write.table into the text argument below:
read.table(header = TRUE, text='"expr" "repl"
"^HURRICANE.*" "HURRICANE"
"^BLIZZARD.*" "BLIZZARD"
"^FLASH.*FLOOD.*" "FLASH FLOOD"
"^HAIL.*" "HAIL"
"^HEAVY.*RAIN.*" "HEAVY RAIN"')
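
To check the round trip without copying and pasting by hand, the same idea can be sketched with capture.output. This is only an illustration under a few assumptions: the events vector is made up for the example, stringsAsFactors = FALSE is added so the columns come back as character rather than factor, and a plain base R loop over sub stands in for qdap::multigsub.

```r
# Five-row sample from the question, stored as character columns.
d <- data.frame(
  expr = c("^HURRICANE.*", "^BLIZZARD.*", "^FLASH.*FLOOD.*",
           "^HAIL.*", "^HEAVY.*RAIN.*"),
  repl = c("HURRICANE", "BLIZZARD", "FLASH FLOOD", "HAIL", "HEAVY RAIN"),
  stringsAsFactors = FALSE
)

# Capture the lines that write.table() would print to the console.
txt <- capture.output(write.table(d, row.names = FALSE))

# Read them back. read.table() treats double quotes as field delimiters by
# default, so "FLASH FLOOD" stays a single value despite the space.
exprRepl <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# The pairs survive the round trip unchanged.
stopifnot(identical(exprRepl$expr, d$expr),
          identical(exprRepl$repl, d$repl))

# Hypothetical event names, used only to show the replacement step.
events <- c("FLASH FLOODING", "HAIL 75", "HEAVY RAINS", "TORNADO")

# Apply each pattern/replacement pair in turn (base R stand-in for multigsub).
for (i in seq_len(nrow(exprRepl))) {
  events <- sub(exprRepl$expr[i], exprRepl$repl[i], events)
}
print(events)
```

In the Rmarkdown file itself, the single-quoted text block shown above would replace the capture.output step, which keeps each expr/repl pair on one row for the reader.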
