How to read a CSV but split only at the first two commas?

Question

I have a CSV file. I want to read it into R but split each line only at the first two commas, i.e. if the file contains a line like this:

1,1000,I, am done, with you

In R, I want this to become a row of a data frame with three columns, like this:

> df <- data.frame("Id"="1","Count" ="1000", "Comment" = "I, am done, with you")
> df
  Id Count              Comment
1  1  1000 I, am done, with you

Answer 1

A regular expression will work.

For example, suppose str holds the rows you want to parse, and your CSV file looks like this:

1,1000,I, am done, with you
2,500, i don't know

If you want to read from a file, call readLines() to read all of its lines into a character vector in R, just like str.
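As a minimal sketch of that step: the file path below is hypothetical, and a temporary file stands in for the real CSV so the snippet runs on its own.

```r
# Hypothetical path; a temporary file stands in for the real CSV file.
path <- tempfile(fileext = ".csv")
writeLines(c("1,1000,I, am done, with you", "2,500, i don't know"), path)

# readLines() returns one string per line and does not split on commas.
str <- readLines(path)
```

Each element of str is then one raw line of the file, ready to be matched against a pattern.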

The technique is very simple. Here I use the {stringr} package to match the text and extract the information I need.

str <- c("1,1000,I, am done, with you", "2,500, i don't know")

library(stringr)

# match the strings by pattern integer,integer,anything
matches <- str_match(str,pattern="(\\d+),(\\d+),\\s*(.+)")

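Here is a brief explanation of the pattern (\\d+),(\\d+),\\s*(.+). \\d matches a digit, \\s matches a whitespace character, and . matches any character. + means one or more, * means zero or more. Parentheses () group parts of the pattern so that the function returns each group as a separate piece of information. Because the first two groups accept only digits, only the first two commas act as separators; the final (.+) captures the rest of the line, commas included.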

If you print matches, it looks like this:

     [,1]                          [,2] [,3]   [,4]                  
[1,] "1,1000,I, am done, with you" "1"  "1000" "I, am done, with you"
[2,] "2,500, i don't know"         "2"  "500"  "i don't know"

As you can see, str_match() splits each string according to the pattern and returns the pieces as a matrix. All that remains is to convert the matrix to a data frame with the correct data types.

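# drop the first column (the full match) and keep the three captured groups
df <- data.frame(matches[,-1],stringsAsFactors=FALSE)
colnames(df) <- c("Id","Count","Comment")
# convert Id and Count from character to integer
df <- transform(df,Id=as.integer(Id),Count=as.integer(Count))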

df is our target:

  Id Count              Comment
1  1  1000 I, am done, with you
2  2   500         i don't know
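For comparison, the same idea can be sketched in base R without {stringr}, using regexec() and regmatches(). The helper name read_first_two_commas and the temporary file are assumptions made for illustration, and the sketch assumes at least one line of the file matches the pattern.

```r
# Hypothetical helper, not part of the answer above: base R only.
read_first_two_commas <- function(path) {
  lines <- readLines(path)
  # Same pattern as before: integer, integer, then the rest of the line.
  parts <- regmatches(lines, regexec("^(\\d+),(\\d+),\\s*(.+)$", lines))
  # A matching line yields c(full match, Id, Count, Comment);
  # a non-matching line yields character(0) and is dropped here.
  ok <- lengths(parts) == 4
  m <- do.call(rbind, parts[ok])
  data.frame(
    Id = as.integer(m[, 2]),
    Count = as.integer(m[, 3]),
    Comment = m[, 4],
    stringsAsFactors = FALSE
  )
}

# Usage with a temporary file standing in for the real CSV.
path <- tempfile(fileext = ".csv")
writeLines(c("1,1000,I, am done, with you", "2,500, i don't know"), path)
df2 <- read_first_two_commas(path)
```

Lines that do not start with two integer fields are silently skipped in this sketch; whether that is the right behaviour depends on the data.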

1 answers

solution1
4 2014-02-17 06:14:06
