I have a long character string that looks like the one below. Wherever I've shown a double backslash there is, in reality, only one backslash.
char.string <- "BAT\\tUSA\\t\\tmedium\\t0.8872\\t9\\tOff production\\tCal1|Cal2\\r\\nGNAT\\tCAN\\t\\small\\t0.3824\\t11\\tOff production\\tCal3|Cal8|Cal9\\r\\n"
I tried the following.
df <- data.frame(do.call(rbind, strsplit(char.string, "\t", fixed=TRUE)))
df <- ldply (df, data.frame)
The first returns a vector. The second returns thousands of rows and two columns, one consisting of sequential numbers and the second consisting of all the data.
I'm trying to achieve this:
item = c("BAT", "GNAT")
origin = c("USA", "CAN")
size = c("medium", "small")
lot = c("0.8872", "0.3824")
mfgr = c("9", "11")
stat = c("Off production", "Off production")
line = c("Cal1|Cal2", "Cal3|Cal8|Cal9")
df = data.frame(item, origin, size, lot, mfgr, stat, line)
df
  item origin   size    lot mfgr           stat           line
1  BAT    USA medium 0.8872    9 Off production      Cal1|Cal2
2 GNAT    CAN  small 0.3824   11 Off production Cal3|Cal8|Cal9
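read.table() should actually be just fine here, but you have two basic problems:
1. Your char.string seems to contain typos: I don't think you want \\small, but rather small, and you have \\t\\tmedium where I think you want just \\tmedium.
2. "\\t" (a backslash followed by the letter t) is not the same as "\t" (a tab character), so the escaped sequences have to be converted to real tabs and newlines before read.table() can split on them.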
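The difference between "\\t" and "\t" can be checked directly in the console. The following is a minimal sketch; the variable names literal, tab, and x are made up for illustration and are not part of the original question.

```r
# In R source, "\\t" is two characters (a backslash and the letter t),
# while "\t" is a single tab character.
literal <- "\\t"
tab     <- "\t"

nchar(literal)            # 2
nchar(tab)                # 1
identical(literal, tab)   # FALSE

# A string that contains the two-character sequence, not a real tab
x <- "BAT\\tUSA"

# Splitting on a real tab finds nothing to split on ...
no_split <- strsplit(x, "\t", fixed = TRUE)[[1]]

# ... whereas splitting on the literal backslash + t works.
split_ok <- strsplit(x, "\\t", fixed = TRUE)[[1]]
```

Here no_split still holds the whole string as a single element, which is why splitting the original char.string on "\t" does not separate the fields.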
Try this
# Start with your original input
char.string <- "BAT\\tUSA\\t\\tmedium\\t0.8872\\t9\\tOff production\\tCal1|Cal2\\r\\nGNAT\\tCAN\\t\\small\\t0.3824\\t11\\tOff production\\tCal3|Cal8|Cal9\\r\\n"
# Eliminate the typos
char.string <- sub("\\\\s", "s", char.string)
char.string <- sub("\\\\t\\\\t", "\\\\t", char.string)
# Convert \\t, etc. to actual tabs and newlines
char.string <- gsub("\\\\t", "\t", char.string)
char.string <- gsub("\\\\r", "\r", char.string)
char.string <- gsub("\\\\n", "\n", char.string)
# Read the data into a dataframe
df <- read.table(text = char.string, sep = "\t")
# Add the colnames
colnames(df) <- c("item", "origin", "size", "lot", "mfgr", "stat", "line")
# And take a look at the result
df
  item origin   size    lot mfgr           stat           line
1  BAT    USA medium 0.8872    9 Off production      Cal1|Cal2
2 GNAT    CAN  small 0.3824   11 Off production Cal3|Cal8|Cal9
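For comparison, the do.call(rbind, strsplit(...)) idea from the question can probably be made to work as well, by splitting on the literal backslash sequences with fixed = TRUE instead of converting them first. This is only a sketch: it assumes the typos have already been removed and that every record ends in a literal \r\n, and the intermediate names records and fields are hypothetical.

```r
# Assumed input: the question's string with the suspected typos removed
char.string <- "BAT\\tUSA\\tmedium\\t0.8872\\t9\\tOff production\\tCal1|Cal2\\r\\nGNAT\\tCAN\\tsmall\\t0.3824\\t11\\tOff production\\tCal3|Cal8|Cal9\\r\\n"

# Split into records on the literal sequence backslash-r backslash-n
records <- strsplit(char.string, "\\r\\n", fixed = TRUE)[[1]]

# Split each record into fields on the literal sequence backslash-t
fields <- strsplit(records, "\\t", fixed = TRUE)

# Stack the records into a character matrix, then convert to a data frame
df <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
colnames(df) <- c("item", "origin", "size", "lot", "mfgr", "stat", "line")

# Every column is character at this point; convert the numeric ones
df$lot  <- as.numeric(df$lot)
df$mfgr <- as.integer(df$mfgr)
```

Unlike read.table(), this route does no type guessing, so the numeric columns have to be converted by hand as in the last two lines.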
I took some liberties with what I think are typos in your char.string.
library(tidyverse)
char.string <- "BAT\\tUSA\\tmedium\\t0.8872\\t9\\tOff production\\tCal1|Cal2\\r\\nGNAT\\tCAN\\tsmall\\t0.3824\\t11\\tOff production\\tCal3|Cal8|Cal9\\n"
# Drop the literal \n sequences, then split into records on literal \r
lapply(
  str_split(gsub("\\\\n", "", char.string), "\\\\r")[[1]]
  , function(x) {
    # Split one record into its fields on literal \t
    y <- str_split(x, "\\\\t")[[1]]
    data.frame(
      item = y[1]
      , origin = y[2]
      , size = y[3]
      , lot = y[4]
      , mfgr = y[5]
      , stat = y[6]
      , line = y[7]
      , stringsAsFactors = FALSE
    )
  }) %>%
  bind_rows()
  item origin   size    lot mfgr           stat           line
1  BAT    USA medium 0.8872    9 Off production      Cal1|Cal2
2 GNAT    CAN  small 0.3824   11 Off production Cal3|Cal8|Cal9