
Best Way to Save Sensor Data, Split Every x Megabytes in Python

I'm saving sensor data at 64 samples per second into a CSV file. The file is about 150 MB at the end of 24 hours. Processing it takes a bit longer than I'd like, and I need to do some of the processing in real time.

# datet is the sample timestamp, milivolts the sensor reading
value = str(milivolts)
logFile.write(str(datet) + ',' + value + "\n")

So I end up with a file of single lines, each holding a date and a millivolt value, growing to about 150 MB. At the end of 24 hours it creates a new file and starts saving to that.

  1. I'd like to know if there is a better way to do this. I have searched but can't find any good information on a compression format to use while saving sensor data. Is there a way to compress while streaming/saving? What format is best for this?

  2. While saving the sensor data, is there an easy way to split it into x-megabyte files without data gaps?

Thanks for any input.

  1. I'd like to know if there is a better way to do this.

One of the simplest ways is to use a logging framework: it will allow you to configure which compressor to use (if any), the approximate size of each file, and when to rotate logs. You could start with this question. Try experimenting with several different compressors to see whether the speed/size trade-off is acceptable for your app.
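
For example, staying in Python, the standard logging module can already cover both requirements. Below is a minimal sketch of the rotator/namer pattern from the Python logging cookbook, which gzips each file as it is rotated out; the file name, size limit, backup count, and the reuse of your datet/milivolts variables are assumptions, not requirements:

import gzip
import logging
import os
import shutil
from logging.handlers import RotatingFileHandler

def gzip_rotator(source, dest):
    # Called by the handler on rollover: compress the closed file, remove the original.
    with open(source, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    os.remove(source)

handler = RotatingFileHandler("raw_data.csv", maxBytes=10 * 1024 * 1024, backupCount=100)
handler.rotator = gzip_rotator
handler.namer = lambda name: name + ".gz"               # raw_data.csv.1 -> raw_data.csv.1.gz
handler.setFormatter(logging.Formatter("%(message)s"))  # bare CSV lines, no log decoration

logger = logging.getLogger("sensor")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# In place of logFile.write(...) from the question:
logger.info("%s,%s", datet, milivolts)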

  2. While saving the sensor data, is there an easy way to split it into x-megabyte files without data gaps?

A logging framework will do this for you based on its configuration. You can also combine several different criteria: for example, fixed-size logs that additionally rotate at least once a day (see the sketch below).
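
The standard library gives you each criterion separately (RotatingFileHandler for size, TimedRotatingFileHandler for time), so combining them takes a small subclass. A minimal sketch, under the hypothetical name SizeAndTimeRotatingHandler, assuming the numeric .1/.2 backup naming is acceptable for time-based rollovers as well:

import time
from logging.handlers import RotatingFileHandler

class SizeAndTimeRotatingHandler(RotatingFileHandler):
    """Roll over when the file exceeds maxBytes OR after `interval` seconds."""

    def __init__(self, filename, maxBytes, backupCount, interval=24 * 60 * 60):
        super().__init__(filename, maxBytes=maxBytes, backupCount=backupCount)
        self.interval = interval
        self.rollover_at = time.time() + interval

    def shouldRollover(self, record):
        # Time criterion: rotate at least once per interval.
        if time.time() >= self.rollover_at:
            self.rollover_at = time.time() + self.interval
            return 1
        # Size criterion: defer to RotatingFileHandler's maxBytes check.
        return super().shouldRollover(record)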

Generally, rotation is accurate up to the size of a single logged line, so if the data is split into lines of reasonable size there are no gaps: one line ends one file, and the next is written into the new file.

Rotated files are kept in sequence, so the ordering of the data is encoded in the file names:

raw_data_<date>.gz
raw_data_<date>.gz.1
raw_data_<date>.gz.2

In pseudocode, it looks like this:

# Parse where to save the data, whether to compress it,
# what the log pattern is, how to rotate logs, etc.
loadLogConfig(...)

# Any compression, rotation, flushing, etc. happens here,
# but we don't care and just write to the file.
logger.trace(data)

# On shutdown, flush any buffered data to the files.
logger.flush()
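
On the processing side, the rotated gzip files can be read back directly, with no need to decompress to disk first. A sketch assuming pandas is available (the column names are illustrative):

import pandas as pd

# Read one rotated, gzipped chunk straight into a DataFrame.
# compression="gzip" is passed explicitly rather than relying on
# extension inference for names like raw_data.csv.2.gz.
df = pd.read_csv(
    "raw_data.csv.1.gz",                # produced by the rotation sketch above
    names=["timestamp", "millivolts"],  # illustrative column names
    compression="gzip",
)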
