
Convert SAS file (sas7bdat) to a flat file using R/Python without memory constraints

I need to convert a SAS file into a flat file. These files can be pretty big, up to 60 GB in size. I wrote a script in R (below), but it reads the entire dataset into memory before exporting it to CSV. Is there a way to convert such big files without running into memory constraints? I am open to using either R or Python. I am working on a machine with 16 GB of RAM.

args = commandArgs(trailingOnly=TRUE)

library(sas7bdat)

MyData <-  read.sas7bdat(file = args[1])
write.csv(MyData, file = args[2], row.names = FALSE)

You can solve this with pandas.read_sas and its chunksize argument:

Pandas read sas docs

For example, to iterate in chunks of 10,000 observations:

import pandas as pd

chunk_size = 10 ** 4
for chunk in pd.read_sas(filename, chunksize=chunk_size):
    process(chunk)

where process() holds whatever logic you want to apply to each chunk (for example, appending it to the output file).
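A minimal sketch of that idea, converting a SAS file to CSV chunk by chunk so only one chunk is in memory at a time. The helper name chunks_to_csv and the file paths in the usage comment are my own; substitute your actual paths:

```python
import pandas as pd

def chunks_to_csv(chunks, csv_path):
    """Stream an iterable of DataFrame chunks into a single CSV file.

    The first chunk overwrites the file and writes the header;
    every later chunk is appended without a header.
    """
    for i, chunk in enumerate(chunks):
        chunk.to_csv(
            csv_path,
            mode="w" if i == 0 else "a",
            header=(i == 0),
            index=False,
        )

# Usage sketch (assumed paths):
# reader = pd.read_sas("big_file.sas7bdat", chunksize=10 ** 4)
# chunks_to_csv(reader, "big_file.csv")
```

Because each chunk is written out and discarded before the next one is read, peak memory use is roughly one chunk rather than the whole 60 GB file, which should fit comfortably in 16 GB of RAM.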

