CSV file is altered after compression as byte array

I receive a CSV file from the front end, and I need to compress it and store it in a MySQL database. The problem is that after I decompress the file, it is altered and no longer has a CSV structure (I compress using ZIP, but I have also tried gzip). For example, this is the file before compression:

header1,header2,header3,header4,header5
val1,val2,val3,val4,val5
val6,val7,val8,val9,val10

And this is the file after decompression:

header1,header2,header3,header4,header5val1,val2,val3,val4,val5val6,val7,val8,val9,val10

I need to send the decompressed file to a Python time series analysis service, and it cannot parse it properly.

I compress and decompress the file directly as a ByteArray, and I am sure the compression step is the problem, because storing and fetching an uncompressed CSV works fine. Thanks in advance!

Here is the code used for compression:

@Throws(Exception::class)
fun compressFile(file : ByteArray) : ByteArray {
   val baos = ByteArrayOutputStream()
   val zos = ZipOutputStream(baos)
   val entry = ZipEntry("data.csv")
   entry.size = file.size.toLong()
   zos.putNextEntry(entry)
   zos.write(file)
   zos.closeEntry()
   zos.close()
   return baos.toByteArray()
}

And here is the code used for decompression:

@Throws(Exception::class)
fun decompressFile(file : ByteArray): ByteArray {
   if (file.isEmpty()) return file
   val gis = ZipInputStream(ByteArrayInputStream(file))
   gis.nextEntry
   val bf = BufferedReader(InputStreamReader(gis, "UTF-8"))
   var outStr = ""
   var line: String
   while (bf.readLine().also { line = it ?: "" } != null) {
       outStr += line
   }
   gis.close()
   bf.close()
   return outStr.toByteArray()
}

I think you are losing the newline characters, because BufferedReader.readLine() returns each line without its line terminator. As a result, the loop concatenates Line1Line2 and drops the newline that was between them.
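To make the effect concrete, here is a minimal sketch that reproduces the same loop pattern on an in-memory string instead of a ZIP stream. The helper name `joinWithReadLine` is made up for this illustration and is not part of the original code.

```kotlin
import java.io.BufferedReader
import java.io.StringReader

// Hypothetical helper: concatenates whatever readLine() returns,
// the same way the decompression loop in the question does.
fun joinWithReadLine(text: String): String {
    val reader = BufferedReader(StringReader(text))
    val joined = StringBuilder()
    var line = reader.readLine()
    while (line != null) {
        joined.append(line) // the line terminator has already been stripped here
        line = reader.readLine()
    }
    return joined.toString()
}

fun main() {
    println(joinWithReadLine("a,b\n1,2\n")) // prints a,b1,2
}
```

The rows run together exactly as in the decompressed file shown in the question, so the ZIP step itself is probably not at fault.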

Do not read the stream line by line through a BufferedReader. Read the entire content of the stream, including the newline characters; see https://www.baeldung.com/convert-input-stream-to-string
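Since the function returns a ByteArray anyway, one possible fix is to skip the Reader and copy the raw bytes of the ZIP entry. The following is only a sketch under that assumption: it keeps the function names from the question, uses the Kotlin standard library extension `InputStream.readBytes()`, and omits the optional `entry.size` assignment from the original compression code.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.zip.ZipEntry
import java.util.zip.ZipInputStream
import java.util.zip.ZipOutputStream

// Compress the raw bytes into a ZIP archive with a single entry.
fun compressFile(file: ByteArray): ByteArray {
    val baos = ByteArrayOutputStream()
    ZipOutputStream(baos).use { zos ->
        zos.putNextEntry(ZipEntry("data.csv"))
        zos.write(file)
        zos.closeEntry()
    }
    return baos.toByteArray()
}

// Decompress by copying bytes: no Reader, no charset decoding, no line splitting.
fun decompressFile(file: ByteArray): ByteArray {
    if (file.isEmpty()) return file
    return ZipInputStream(ByteArrayInputStream(file)).use { zis ->
        // nextEntry is null when the archive contains no entries.
        if (zis.nextEntry == null) ByteArray(0) else zis.readBytes()
    }
}

fun main() {
    val csv = "header1,header2\nval1,val2\nval3,val4\n".toByteArray(Charsets.UTF_8)
    val roundTripped = decompressFile(compressFile(csv))
    println(csv.contentEquals(roundTripped)) // prints true
}
```

Working on bytes also avoids depending on the platform default charset, which the original `outStr.toByteArray()` call relies on implicitly.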
