
Read a CSV line by line and unmarshal it to a struct

I have an 8 GB CSV file that I need to unmarshal into a slice of structs.

package main

import (
    "encoding/csv"
    "fmt"
    "io"
    "os"

    gocsv "github.com/gocarina/gocsv"
    dto "github.com/toto/GeoTransport/import/dto"
)

// TODO: take the CSV file names from a JSON parameter.
func importAdresse() {
    var adressesDB []dto.GeoAdresse
    // The file is only read, so open it read-only instead of O_RDWR|O_CREATE.
    clientsFile, err := os.Open("../../../data/geo/public.geo_adresse.csv")
    if err != nil {
        panic(err)
    }
    defer clientsFile.Close()
    gocsv.SetCSVReader(func(in io.Reader) gocsv.CSVReader {
        r := csv.NewReader(in)
        r.Comma = ';'
        return r // Use a semicolon as the field delimiter
    })
    if err = gocsv.UnmarshalFile(clientsFile, &adressesDB); err != nil { // Loads every address into memory at once
        panic(err)
    }
    i := 0
    for _, adresse := range adressesDB {
        fmt.Println("adresse.Numero")
        fmt.Printf("%+v\n", adresse)
        fmt.Println(adresse.Numero)
        i++
        if i == 3 {
            break
        }
    }
}

func init() {
}

func main() {
    importAdresse()
}

Currently I am using gocsv to unmarshal it, but I get an out-of-memory error.

The program quits because it does not have enough RAM.

I would like to know how to read the CSV line by line and unmarshal each line into a struct.

One solution would be to split the CSV file with a Unix command.

But I would like to know how to do it using only Go.

It looks like the parsing method you're using attempts to read the entire CSV file into memory. You might try using the standard CSV reader package (encoding/csv) directly, or using another CSV-to-struct library that allows for line-by-line decoding like this one. Does the example code on those pages show what you're looking for?

Another thing to try would be running wc -l ../../../data/geo/public.geo_adresse.csv to get the number of lines in your CSV file, then writing this:

var adressesDB [<number of lines in your CSV>]dto.GeoAdresse

If the runtime reports an out-of-memory error on that line, the unmarshalled CSV data exceeds your RAM capacity and you'll have to read it in chunks.
