
Error in reading a CSV file with read.table()

I am having trouble loading a CSV dataset in R. The dataset can be downloaded from:

https://data.baltimorecity.gov/City-Government/Baltimore-City-Employee-Salaries-FY2015/nsfe-bg53

I imported the data with read.csv as follows, and the dataset was imported correctly.

EmpSal <- read.csv('E:/Data/EmpSalaries.csv')

I then tried to read the same file with read.table, and the resulting dataset showed many anomalies.

EmpSal1 <- read.table('E:/Data/EmpSalaries.csv',sep=',',header = T,fill = T)

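The code above starts reading the data from line 7. The dataset actually contains about 14K rows, but only about 5K were imported. In a few places, 15-20 rows were merged into a single row, with the data of all those rows appearing in a single column.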

I can work with the dataset using read.csv, but I am curious to know why it does not work with read.table.

read.csv is defined as:

function (file, header = TRUE, sep = ",", quote = "\"", dec = ".", 
    fill = TRUE, comment.char = "", ...) 
read.table(file = file, header = header, sep = sep, quote = quote, 
    dec = dec, fill = fill, comment.char = comment.char, ...)

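You need to add quote="\"". By default, read.table treats both double and single quotes as quoting characters (quote = "\"'"), whereas read.csv treats only double quotes as quoting characters.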

EmpSal <- read.csv('Baltimore_City_Employee_Salaries_FY2015.csv')
EmpSal1 <- read.table('Baltimore_City_Employee_Salaries_FY2015.csv', sep=',', header = TRUE, fill = TRUE, quote="\"")
identical(EmpSal, EmpSal1)
# TRUE

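As mentioned in the question, the data was imported successfully with read.csv() without specifying the quote argument. The default value of quote is "\"" for read.csv and "\"'" for read.table. Compare the two usages: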

read.table(file, header = FALSE, sep = "", quote = "\"'",
           dec = ".", numerals = c("allow.loss", "warn.loss", "no.loss"),
           row.names, col.names, as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#",
           allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown", text, skipNul = FALSE)

read.csv(file, header = TRUE, sep = ",", quote = "\"",
         dec = ".", fill = TRUE, comment.char = "", ...)
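
The same defaults can be checked in a running session instead of on the help page. This is a small sketch using base R's formals(); the helper name default_of is only illustrative and is not part of R.

```r
# Hypothetical helper: return the default value of one argument of a function.
default_of <- function(fun, arg) formals(fun)[[arg]]

# Arguments whose defaults differ between the two readers.
args_to_compare <- c("header", "sep", "quote", "fill", "comment.char")

# deparse() turns each default (a value or an unevaluated expression)
# into the text one would type in code.
defaults <- data.frame(
  argument   = args_to_compare,
  read.table = sapply(args_to_compare,
                      function(a) deparse(default_of(read.table, a))),
  read.csv   = sapply(args_to_compare,
                      function(a) deparse(default_of(read.csv, a))),
  row.names  = NULL
)
print(defaults)
```

The quote row is the one that matters for this question; header, sep and fill were already set explicitly in the read.table call.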

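The data you linked to contains many single quotes (apostrophes). With the read.table defaults, each apostrophe opens a quoted string that runs until the next apostrophe, so the separators and line breaks in between are swallowed into one field. That is why read.table did not work for you, and why rows appear to be merged or missing.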
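
The effect can be seen on a few lines without downloading the file. The sketch below uses made-up sample rows (they are not taken from the Baltimore file) and a hypothetical helper, fields_per_line, which wraps base R's count.fields() to report how many fields each line has under a given set of quote characters.

```r
# Made-up sample rows (NOT from the Baltimore file); two unquoted fields
# contain an apostrophe.
sample_lines <- c(
  "Name,Agency,Salary",
  "A,Sheriff's Office,100",
  "B,Police,200",
  "C,Mayor's Office,300",
  "D,Fire,400"
)

# Hypothetical helper: fields per line for a given set of quote characters.
fields_per_line <- function(lines, quote) {
  con <- textConnection(lines)
  on.exit(close(con))
  count.fields(con, sep = ",", quote = quote)
}

default_counts <- fields_per_line(sample_lines, quote = "\"'")  # read.table default
csv_counts     <- fields_per_line(sample_lines, quote = "\"")   # read.csv default

# NA marks a line that ends while a quoted string is still open.
print(default_counts)
# With only the double quote as a quoting character, every line has 3 fields.
print(csv_counts)

# Reading with quote = "\"" gives the same result as read.csv.
fixed   <- read.table(text = sample_lines, sep = ",", header = TRUE,
                      fill = TRUE, quote = "\"")
via_csv <- read.csv(text = sample_lines)
print(identical(fixed, via_csv))
```

Running count.fields() on the real file with both quote settings is a quick way to find the lines where an apostrophe starts swallowing data.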

Try the following code; it should work for you.

r <- read.table('/home/workspace/Downloads/Baltimore_City_Employee_Salaries_FY2015.csv', sep = ",", quote = "\"", header = TRUE, fill = TRUE)

