
Read files that are pipe AND comma delimited: |column1|,|column2|

I have files that look like this:

|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|
|2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|

Basically, the columns are separated by commas and wrapped by pipes.

How do I read something like this using R?

I have tried

read.table("file.txt", sep="|")

but this doesn't work well, since every other column just contains a comma. I have tried using "|,|" as the separator, but apparently this is not allowed. Using "," doesn't work at all since the names then get split up.

Any easy way to do this?

You can replace the "|,|" delimiter with a single-character separator and then strip the remaining pipes:

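# read the whole file into a single string
plouf <- readChar("file.txt", file.info("file.txt")$size)
plouf <- gsub("\\|,\\|", ";", plouf) # replace the separator (assumes the data contains no ";")
plouf <- gsub("\\|", "", plouf) # remove the pipes at the start and end of each line
read.table(text = plouf, sep = ";") # parse the string itself (text =), not a file name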

A test:

plouf <- "|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|
          |2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|"

plouf <- gsub("\\|,\\|",";",plouf)
plouf <- gsub("\\|","",plouf)
read.table(text = plouf,sep=";")

    V1       V2             V3                  V4
1 2000 23456745 23567897tyhgy6 SHARP, RODNEY H III
2 2000 12345678 34567tgh788877 WOOLARD, EDGAR S JR

read.table("./temp.csv", sep=",", quote = "|") solves the problem: the pipes are treated as quote characters, so the commas inside a field are kept.
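
To check the quote = "|" approach without a file on disk, here is a minimal sketch that feeds the two sample lines from the question through the text argument. The variable names (sample_lines, df) are only illustrative, and stringsAsFactors = FALSE is added so the name column comes back as plain character strings.

```r
# The two sample rows from the question, one string per line
sample_lines <- c(
  "|2000|,|23456745|,|23567897tyhgy6|,|SHARP, RODNEY H III|",
  "|2000|,|12345678|,|34567tgh788877|,|WOOLARD, EDGAR S JR|"
)

# Comma is the separator; pipe is the quote character,
# so a comma inside |...| does not split the field
df <- read.table(text = sample_lines, sep = ",", quote = "|",
                 stringsAsFactors = FALSE)

print(dim(df))  # number of rows and columns
print(df$V4)    # the name column, commas intact
```

If the ID columns should stay as text (for example to keep leading zeros), passing colClasses = "character" to the same call should prevent them from being converted to numbers.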
