F# read zipped csv file

Is it possible to use F# Deedle to read a zipped CSV file directly, like the read_csv function in pandas? If not, is it possible to do this with the CSV type provider?

If you use the ICSharpCode.SharpZipLib NuGet package, you can read the CSV from the zip with Deedle like this:

open ICSharpCode.SharpZipLib.Zip
open System.IO
open Deedle

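[<EntryPoint>]
let main argv =
    use fs = new FileStream(@"mycsv.zip", FileMode.Open, FileAccess.Read)
    use zip = new ZipFile(fs)
    // Entry 0 is the first file in the archive; its stream goes straight to Deedle.
    use csv = zip.GetInputStream(0L)
    let frame = Frame.ReadCsv(csv)
    frame.Print()
    0 // exit code required by an EntryPoint function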

Why do you need to read the CSV directly from the zip file? You can always open the archive with System.IO.Compression and then feed the entry to Deedle, the CsvProvider, or even FileHelpers:

open System.IO.Compression
open System.IO

let zipfile = @"C:\tmp\zipFile1.zip"

// Open the zip file as a read-only ZipArchive.
let unzip (zipfile: string) =
    let zipf = new FileStream(zipfile, FileMode.Open, FileAccess.Read)
    new ZipArchive(zipf)

let unzipFile = unzip zipfile
// Open the CSV entry by name and read its whole content as text.
let reader = new StreamReader(unzipFile.GetEntry("zipFile1.csv").Open())
let txt = reader.ReadToEnd()
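
When the entry name is not known in advance, the archive's entries can be scanned for the first .csv file instead. This is a minimal sketch using only System.IO.Compression. The names buildDemoZip and readFirstCsv are hypothetical, and the in-memory archive exists only to make the example self-contained:

```fsharp
open System
open System.IO
open System.IO.Compression

// Demo data only: build a small zip in memory containing one CSV entry.
let buildDemoZip () : byte[] =
    use buffer = new MemoryStream()
    let writeArchive () =
        // leaveOpen = true so the buffer survives disposal of the archive.
        use archive = new ZipArchive(buffer, ZipArchiveMode.Create, true)
        let entry = archive.CreateEntry("data/prices.csv")
        use writer = new StreamWriter(entry.Open())
        writer.Write("date,price\n2020-01-01,10.5\n")
    writeArchive ()   // the archive is finalized when it is disposed here
    buffer.ToArray()

// Hypothetical helper: text of the first entry whose name ends in ".csv", if any.
let readFirstCsv (zipBytes: byte[]) : string option =
    use stream = new MemoryStream(zipBytes)
    use archive = new ZipArchive(stream, ZipArchiveMode.Read)
    archive.Entries
    |> Seq.tryFind (fun e -> e.FullName.EndsWith(".csv", StringComparison.OrdinalIgnoreCase))
    |> Option.map (fun e ->
        use reader = new StreamReader(e.Open())
        reader.ReadToEnd())

let csvText = readFirstCsv (buildDemoZip ())
```

Returning an option keeps the "no CSV in this archive" case explicit rather than failing on a null entry.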

If your consumer can take a stream or reader (as the libraries above can), then this utility function will do it, calling ZipFile.OpenRead directly on the zip file:

//string * string -> StreamReader
let getFromZip(entry,zip) =
    ZipFile.OpenRead(zip)
        |> (fun x -> x.GetEntry(entry))
        |> (fun x -> new StreamReader(x.Open()))

On .NET Framework you might also need to reference the System.IO.Compression.FileSystem assembly, which is where ZipFile lives, but there is no need to open it.
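
One caveat with returning a bare StreamReader is that the underlying ZipArchive is never disposed, and GetEntry returns null when the entry is missing. A possible variation, sketched here under the assumption that the consumer can be passed in as a function over a TextReader, is to scope the reader to a callback. The name withZipEntry is hypothetical, and the temporary zip is demo data only:

```fsharp
open System.IO
open System.IO.Compression

// Hypothetical helper: opens `entryName` inside the zip at `zipPath`,
// hands a TextReader to `f`, then disposes the reader and the archive.
let withZipEntry (zipPath: string) (entryName: string) (f: TextReader -> 'a) : 'a =
    use archive = ZipFile.OpenRead(zipPath)
    match archive.GetEntry(entryName) with
    | null -> failwithf "Entry '%s' not found in '%s'" entryName zipPath
    | entry ->
        use reader = new StreamReader(entry.Open())
        f (reader :> TextReader)

// Demo: write a throwaway zip to a temp path, then read one entry back.
let demoZipPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".zip")

let createDemoZip () =
    use archive = ZipFile.Open(demoZipPath, ZipArchiveMode.Create)
    let entry = archive.CreateEntry("zipFile1.csv")
    use writer = new StreamWriter(entry.Open())
    writer.Write("a,b\n1,2\n")

createDemoZip ()
let firstLine = withZipEntry demoZipPath "zipFile1.csv" (fun r -> r.ReadLine())
File.Delete(demoZipPath)
// firstLine is "a,b"
```

Because the callback must finish before the archive is closed, the result of f should be fully materialized (a string, a loaded frame) rather than something that reads lazily from the reader later.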
