
How can I read a double-semicolon-separated .txt file in R?

I have the same problem as in the following question, but in R:

How can I read a double-semicolon-separated .csv with quoted values using pandas?

The solution there is to drop the additional columns generated. I'd like to know if there's a way to read the file separated by ;; without generating those additional columns.

Thanks!

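Read it in normally using read.csv2 (or whichever variant you prefer, including read.table, read.delim, readr::read_csv2, data.table::fread, etc.), and then remove the even-numbered columns. Each ;; is parsed as two separators with an empty field between them, so every second column is an empty filler.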

dat <- read.csv2(text = "a;;b;;c;;d\n1;;2;;3;;4")
dat
#   a  X b X.1 c X.2 d
# 1 1 NA 2  NA 3  NA 4

dat[,-seq(2, ncol(dat), by = 2)]
#   a b c d
# 1 1 2 3 4
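If you need this for several files, the same idea can be wrapped in a small helper. This is only a sketch: read_double_semicolon and its reader argument are names made up here, not part of any package. It drops the filler columns by position with a logical index, which also works when the file has a single column, and keeps the result a data frame via drop = FALSE.

```r
# Hypothetical helper (not from any package): read a ;;-separated file
# with an ordinary ;-based reader, then drop the empty filler columns.
read_double_semicolon <- function(..., reader = utils::read.csv2) {
  dat <- reader(...)
  # Every even-numbered column is the empty field between the two semicolons
  filler <- seq_len(ncol(dat)) %% 2 == 0
  dat[, !filler, drop = FALSE]
}

dat <- read_double_semicolon(text = "a;;b;;c;;d\n1;;2;;3;;4")
dat
```

Arguments are passed straight through to the reader, so a file path or extra options such as header = FALSE can be given as usual.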

It is usually recommended to clean your data before parsing it, rather than while parsing it or, worse, afterwards. Replace every occurrence of ;; with ; either in an editor such as Notepad++ (Replace All) or in R itself, and write the result to a new file instead of overwriting the original (as a rule of thumb, never delete your source data).

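# Read the raw lines, collapse each ;; into a single ;, and save to a NEW file
my.text <- readLines('d:/tmp/readdelim-r.csv')
cleaned <- gsub(';;', ';', my.text, fixed = TRUE)
writeLines(cleaned, 'd:/tmp/cleaned.csv')
# Use header=TRUE instead if the first line holds column names
my.cleaned <- read.delim('d:/tmp/cleaned.csv', header=FALSE, sep=';')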
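If you would rather not write an intermediate file, the cleaned lines can probably be parsed directly, since read.csv2 accepts a character vector through its text argument. A minimal sketch follows; raw_lines is a stand-in for the result of readLines() on your own file. Note that a plain replacement also rewrites any ;; that appears inside a quoted value, so this assumes your fields never contain that sequence.

```r
# Stand-in for: raw_lines <- readLines("path/to/your/file.txt")
raw_lines <- c("a;;b;;c;;d", "1;;2;;3;;4")

# Collapse each literal ;; into a single ; (fixed = TRUE: no regex involved)
cleaned <- gsub(";;", ";", raw_lines, fixed = TRUE)

# Parse the cleaned lines in memory; no filler columns are created
dat <- read.csv2(text = cleaned)
dat
```

The original file is never modified this way, and no extra columns are generated in the first place.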

