
Reading specific rows from a CSV file into R

I have a large CSV file that I need to read into R. However, I only need the observations with specific variable values (i.e. with certain dates). Is there a way to do that from the outset, without reading the entire file and then subsetting it?

Assuming the dates are in the first column of your data set (and you are on a Unix-like machine), you could do something like this:

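dates <- paste0(c("2015-06-01", "2015-06-16"), collapse = "|")
## "^" anchors the match to the start of each line, i.e. to the first column
expr <- paste0("grep -E '^(", dates, "),.+' tmpcsv.csv", collapse = "")
## recent versions of data.table take shell commands via the cmd argument
R> data.table::fread(cmd = expr)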
           V1         V2
1: 2015-06-16 -1.6866933
2: 2015-06-16  1.3686023
3: 2015-06-01 -0.2257710
4: 2015-06-16 -1.0185754
5: 2015-06-01  0.3035286
6: 2015-06-01  2.0500847
7: 2015-06-01 -0.4910312

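If the dates are in a different column, you will have to modify the regular expression accordingly. Note that grep also discards the header line, which is why the columns come back named V1 and V2.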
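As an illustration of adjusting the pattern, here is a minimal sketch for a hypothetical file whose dates sit in the second column rather than the first. The file, its column layout and the helper name `date_pattern` are assumptions made for this example, not part of the original answer. It still relies on a Unix-like `grep`, but uses base R's `pipe()` and `read.csv()` so that it runs without data.table.

```r
# Hypothetical example file: an id in column 1, the date in column 2.
tmp <- tempfile(fileext = ".csv")
df2 <- data.frame(
  Id    = 1:6,
  Date  = as.Date("2015-06-01") + c(0, 15, 3, 0, 15, 7),
  Value = c(0.5, -1.2, 0.3, 2.1, -0.4, 1.0)
)
write.csv(df2, file = tmp, row.names = FALSE)

# Build a pattern that skips (col - 1) comma-separated fields, then
# requires one of the dates (optionally quoted) followed by a comma.
date_pattern <- function(dates, col = 1) {
  skip <- strrep("[^,]*,", col - 1)
  paste0("^", skip, "\"?(", paste(dates, collapse = "|"), ")\"?,")
}

pat <- date_pattern(c("2015-06-01", "2015-06-16"), col = 2)
cmd <- paste("grep -E", shQuote(pat), shQuote(tmp))

# grep drops the header line, so the column names are supplied by hand.
res <- read.csv(pipe(cmd), header = FALSE,
                col.names = c("Id", "Date", "Value"))
res
```

The `[^,]*,` prefix assumes that the skipped fields contain no embedded commas; a file with quoted fields that include commas would need a more careful pattern or a real CSV parser.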


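Data (the dates are generated relative to Sys.Date(), so the dates searched for above must be adapted to the day you run this):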

set.seed(123)
##
df <- data.frame(
  Date = Sys.Date() + floor(50*round(runif(50, -1, 1), 1)),
  Value = rnorm(50)
)
write.csv(df, file = "tmpcsv.csv", row.names = FALSE)
##
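If `grep` is not available (for example on Windows), a similar effect can be had in base R by streaming the file in chunks and keeping only the matching lines. This is a sketch rather than part of the original answer: the function name `read_matching_rows` and the chunk size are assumptions, and it presumes that the dates are in the first column and that no field contains an embedded newline.

```r
# Read a CSV in chunks and keep only rows whose first field is in `dates`.
read_matching_rows <- function(path, dates, chunk_size = 10000L) {
  # Match one of the dates (optionally quoted) at the start of the line.
  pat <- paste0("^\"?(", paste(dates, collapse = "|"), ")\"?,")
  con <- file(path, open = "r")
  on.exit(close(con))

  header <- readLines(con, n = 1L)
  kept <- character()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    kept <- c(kept, lines[grepl(pat, lines)])
  }
  # Parse only the header plus the retained lines.
  read.csv(text = c(header, kept), stringsAsFactors = FALSE)
}

# Small self-contained demonstration with fixed dates.
tmp <- tempfile(fileext = ".csv")
demo <- data.frame(
  Date  = as.Date("2015-06-01") + c(0, 15, 3, 0, 15, 7),
  Value = c(0.5, -1.2, 0.3, 2.1, -0.4, 1.0)
)
write.csv(demo, file = tmp, row.names = FALSE)

out <- read_matching_rows(tmp, c("2015-06-01", "2015-06-16"), chunk_size = 2L)
out
```

Only one chunk of raw lines plus the matching rows is held in memory at a time, so the full file is never loaded, although every line is still scanned once.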
