Error Reading a CSV File in R

Question

I am trying to read a bunch of files from http://www.ercot.com/gridinfo/load/load_hist . All the files are read properly with read.csv except for the last one, the file for 2017. When I attempt to read that file with read.csv, I get the following error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : scan() expected 'a real', got '"8'

However, I have checked with Excel and there is no "8 or 8 value in the file. The error message seems clear, but I can't find the "8 or 8, and I have the same issue even if I read 0 rows (with the nrows argument of the read.csv function).

 hold2  <- read.csv(paste(PATH, "\\CSV\\", "native_load_2017.csv", sep=""), header=TRUE, sep=",", dec = ".", colClasses=c("character",rep("double",9)))

hold2  <- read.csv(paste(PATH, "\\CSV\\", "native_load_2017.csv", sep=""), header=TRUE, sep=",", dec = ".", colClasses=c("character",rep("double",9)), nrows=0)

Also, the last row of the file contains values that do not follow the format used in the rest of the file. I would like to skip the last line, but there is no argument in the read.csv function to do this. Is there any workaround? I am thinking of using something like:

hold2  <- read.csv(paste(PATH, "\\CSV\\", "native_load_2017.csv", sep=""), header=TRUE, sep=",", dec = ".", colClasses=c("character",rep("double",9)), nrows=nrow(read.csv(paste(PATH, "\\CSV\\", "native_load_2017.csv", sep="")))-1)

Any thoughts on how best to do this? Thanks

Answer 1

Using the readr package

> df <- readr::read_csv("~/Desktop/native_load_2017.csv")
Parsed with column specification: 
cols(   
`Hour Ending` = col_character(),
 COAST = col_number(),
 EAST = col_number(),
 FWEST = col_number(),
 NORTH = col_number(),
 NCENT = col_number(),
 SOUTH = col_number(),
 SCENT = col_character(),
 WEST = col_number(),
 ERCOT = col_number()
)
>

You can see that the SCENT column is being parsed as character (due to the difference in format of the values in the last row that you noted). The call below specifies the first column as character and uses col_number() as the default for the rest, which reads the file. Note that col_number() handles the commas and decimal points present in the columns you had declared as double.

library(readr)  # provides read_csv(), cols(), col_character(), col_number()

options(digits = 7)
df <- read_csv("~/Desktop/native_load_2017.csv", col_types = cols(
  `Hour Ending` = col_character(),
  .default = col_number())
)
sapply(df, class)  # check the resulting column types
# df[complete.cases(df), ]  # to remove the last row if needed
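
If you would rather stay in base R, here is a minimal sketch of one way to skip the last line and still end up with numeric columns. It assumes the numbers are stored as quoted strings with thousands separators (for example "8,123.45"), which would be consistent with scan() complaining about '"8'. The sample rows and the csv_path variable are made up for illustration and are not taken from the ERCOT file.

```r
# Hypothetical sample mimicking the assumed layout: quoted numbers with
# thousands separators, plus a malformed trailing row.
sample_lines <- c(
  'Hour Ending,COAST,EAST',
  '01/01/2017 01:00,"8,123.45","1,000.10"',
  '01/01/2017 02:00,"7,950.00",990.25',
  'bad trailing row,,'
)
csv_path <- tempfile(fileext = ".csv")
writeLines(sample_lines, csv_path)

# Read the raw lines and drop the last one before parsing.
raw_lines <- readLines(csv_path)
raw_lines <- head(raw_lines, -1)

# Parse every column as character first, so scan() is never asked to
# turn a value such as "8,123.45" directly into a double.
df <- read.csv(text = raw_lines, header = TRUE,
               colClasses = "character", check.names = FALSE)

# Strip the thousands separators, then convert all columns except the first.
df[-1] <- lapply(df[-1], function(x) as.numeric(gsub(",", "", x, fixed = TRUE)))

sapply(df, class)
```

Reading everything as character and converting afterwards costs an extra pass over the data, but it avoids reading the file twice just to count its rows.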
