How to avoid quotes when importing a CSV with R

Question

I am having problems reading the CSV file below (extract) using R:

id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"

df <- read_csv("don.csv", quote = "")

gives me quotes in the cells, which I can remove afterwards, but can't this be done more smoothly during import?

Answer 1

1) If the input contains no quotes other than the ones we don't want, then this works. If the input is coming from a file, replace textConnection(Lines) with "don.csv".

L <- readLines(textConnection(Lines))
read.csv(text = gsub('"', '', L))

giving:

  id        created_date stars charity_id user_id is_anonymous user_country_id
1  1 2016-08-10 12:50:30   100      65536   32772         NULL             110
2 65 2016-11-09 07:57:32    50     425986 2686978            1             110
3 66 2016-11-09 08:07:51    50     393217  753673            0             110
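
In the output above, is_anonymous still contains the literal text NULL, so that column is read as character. As a minimal sketch of the same quote-stripping idea, the na.strings argument of read.csv (also used in Answer 2) can turn NULL into NA during the import. The object name don is only illustrative.

```r
# Sample input copied from the question.
Lines <- 'id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"'

con <- textConnection(Lines)   # for a file, use readLines("don.csv") instead
L <- readLines(con)
close(con)

# Remove every double quote, then parse; the text NULL becomes NA.
# This assumes, as in 1), that no double quote in the input is wanted.
don <- read.csv(text = gsub('"', '', L, fixed = TRUE), na.strings = "NULL")
str(don)
```

With NULL mapped to NA, is_anonymous can be read as an integer column rather than as text.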

2) Also assuming that all double quotes are unwanted, another possibility is:

read.csv(pipe("sed 's/\042//g' don.csv"))

On Windows you will need to have Rtools installed and on your path for this to work or, if it is not on your path, give the full path to sed, e.g. "C:\\Rtools\\bin\\sed".
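
Both options above assume that every double quote is unwanted. If that assumption might not hold, a possible alternative (not part of the original answer) is to parse the data twice in plain R: each data row is a single quoted CSV field, so a first pass unwraps it and a second pass reads the real columns. This is only a sketch under that assumption about the file layout; the names raw, inner and don are illustrative.

```r
# Sample input copied from the question; each data row is one quoted field.
Lines <- 'id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"'

con <- textConnection(Lines)   # for a file, use readLines("don.csv") instead
raw <- readLines(con)
close(con)

# Pass 1: read the data rows as a one-column CSV. This strips the outer
# quotes and turns each doubled quote ("") back into a single quote.
inner <- read.csv(text = raw[-1], header = FALSE, stringsAsFactors = FALSE)$V1

# Pass 2: read the header line plus the unwrapped rows as an ordinary CSV.
don <- read.csv(text = c(raw[1], inner), na.strings = "NULL",
                stringsAsFactors = FALSE)
don
```

Because the quotes around created_date are handled by the CSV parser instead of being deleted, a comma inside such a quoted value would not split the field.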

Note

The input, Lines, is:

Lines <-
'id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"'

Answer 2

You can use:

d <- read.table(sep='"', skip=1, text=
'id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"'
)
d2 <- read.table(text=paste0(d$V2, d$V6), sep=",")
# or d2 <- read.table(text=paste0(d$V2, d$V6), sep=",", na.strings = "NULL")

(For your file, use file="don.csv" instead of my text=... .)
The result is:

# d
#   V1  V2 V3                  V4 V5                        V6 V7
# 1 NA  1, NA 2016-08-10 12:50:30 NA ,100,65536,32772,NULL,110 NA
# 2 NA 65, NA 2016-11-09 07:57:32 NA  ,50,425986,2686978,1,110 NA
# 3 NA 66, NA 2016-11-09 08:07:51 NA   ,50,393217,753673,0,110 NA
# d2
#   V1 V2  V3     V4      V5   V6  V7
# 1  1 NA 100  65536   32772 NULL 110
# 2 65 NA  50 425986 2686978    1 110
# 3 66 NA  50 393217  753673    0 110

Finally, you will want to rename the columns and bind them together with cbind().
You can get the column names with:

cnames <- read.table(sep=',', nrows=1, text=
'id,created_date,stars,charity_id,user_id,is_anonymous,user_country_id
"1,""2016-08-10 12:50:30"",100,65536,32772,NULL,110"
"65,""2016-11-09 07:57:32"",50,425986,2686978,1,110"
"66,""2016-11-09 08:07:51"",50,393217,753673,0,110"'
)
as.character(unlist(cnames[1,]))

(For your file, use file="don.csv" instead of my text=... .)

The complete code for your file:

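# Read the header line to recover the column names
cnames <- read.table(sep=',', nrows=1, file="don.csv")
H <- as.character(unlist(cnames[1,]))

# Split each data row on the double quotes: V2 holds the id, V4 the date, V6 the remaining fields
d <- read.table(sep='"', skip=1, file="don.csv")
# Re-parse the id and the remaining fields as comma-separated values, reading NULL as NA
d2 <- read.table(text=paste0(d$V2, d$V6), sep=",", na.strings = "NULL")
# Put the date first, drop the empty column left where the date was, and restore the names
d.d2 <- cbind(d[, 4], d2[, -2])
names(d.d2) <- H[c(2, 1, 3:7)]
d.d2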
