
What is the best way to load a 20 GB csv file into R?

I have a 20 GB data set that I have to work with in R. I have read several articles about how to handle this, but I still have no idea what the best and most efficient way is to read 20 GB of data into R.

It is important to mention that I do not need all the data, so I have to filter/clean the data before I proceed with building my model.

Would it make sense to read the data set into R in chunks? And what is the best way to read data into R in chunks?

I hope that someone can help me out.

Kind regards,

Matthijs

You could load the data in parts. As you suggest in your comment, you could read 10 000 rows, then the next 10 000, and so on.

Since you are working with .csv files, I suggest you use the read.csv() function.

Example:

data <- read.csv(file = "C:\\Path\\To\\YourFile.csv", nrows = 10000, skip = 10000)

nrows = the number of rows you want R to read.

skip = the number of rows you want R to skip.
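For completeness, here is a minimal sketch of how such a chunked read could look in a loop. The file path, the chunk size and the filter condition (some_column > 0) are placeholders, not part of the original answer; adjust them to your own data.

path       <- "C:\\Path\\To\\YourFile.csv"   # placeholder path
chunk_size <- 10000

# Read the header once so every chunk gets the same column names.
header <- names(read.csv(path, nrows = 1))

chunks <- list()
skip   <- 1   # skip the header line on the first chunk

repeat {
  chunk <- tryCatch(
    read.csv(path, header = FALSE, col.names = header,
             nrows = chunk_size, skip = skip),
    error = function(e) NULL)          # no lines left to read
  if (is.null(chunk) || nrow(chunk) == 0) break

  # Filter/clean each chunk before keeping it (hypothetical condition):
  chunks[[length(chunks) + 1]] <- chunk[chunk$some_column > 0, ]

  if (nrow(chunk) < chunk_size) break  # last (partial) chunk reached
  skip <- skip + chunk_size
}

data <- do.call(rbind, chunks)

Because only the filtered rows of each chunk are kept, the combined result can be much smaller than the original 20 GB file.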

The fread() function in the data.table package is probably your best bet for an off-the-shelf function in terms of speed and efficiency. As was mentioned previously, you can still use the nrows and skip arguments to read the data in pieces.
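As a short sketch (the file path and column names below are placeholders), fread() accepts nrows and skip just like read.csv(), and its select argument lets you load only the columns you actually need, which helps a lot with a 20 GB file:

library(data.table)

# Load only the columns you need (column names here are hypothetical):
dt <- fread("C:\\Path\\To\\YourFile.csv", select = c("col_a", "col_b"))

# Or read the file in pieces, just like with read.csv().
first_chunk <- fread("C:\\Path\\To\\YourFile.csv", nrows = 10000)

# skip = 10001 skips the header line plus the first 10 000 data rows;
# header = FALSE keeps fread from treating the first row of the chunk
# as column names (they can be restored afterwards with setnames()).
next_chunk <- fread("C:\\Path\\To\\YourFile.csv", nrows = 10000,
                    skip = 10001, header = FALSE)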
