How to read multiple data sets from one .csv file in PowerShell

I have a temperature recorder that (daily) reads multiple sensors and saves the data to a single .csv, with a whole bunch of header information before each set of date/time and temperature readings. The file looks something like this:

"readerinfo","onlylistedonce"
"downloadinfo",YYYY/MM/DD 00:00:00
"timezone",-8
"headerstuff","headersuff"

"sensor1","sensorstuff"
"serial#","0000001"
"about15lines","ofthisstuff"
"header1","header2"
datetime,temp
datetime,temp
datetime,temp

"sensor2","sensorstuff"
"serial#","0000002"
"about15lines","ofthisstuff"
"header1","header2"
datetime,temp
datetime,temp
datetime,temp
"downloadcomplete"

My aim is to pull out the date/time and temp data for each sensor and save it as a new file so that I can run some basic stats (hi/lo/avg temp) on it. (It would be beautiful if I could somehow identify which sensor the data came from based on the serial number listed in the header info, but that's less important than separating the data into sets.) The lengths of the date/time lists change from sensor to sensor based on how long they've been recording, and the number of sensors also changes daily. Even if I could just split the sensor data, header info and all, into as many files as there are sensors, that would be a good start.

This isn't exactly a CSV file in the traditional sense. I imagine you already know this, given your description of the file contents.

If the lines with datetime,temp truly do not have any double quotes in them, per your example data, then the following script should work. The script is self-contained, since it declares the example data in-line.

IMPORTANT: You will need to modify the line containing the declaration of the $SensorList variable. You will have to populate this variable with the names of the sensors, or you can parameterize the script to accept an array of sensor names.

UPDATE: I changed the script to be parameterized.

Results

The results of the script are as follows:

  1. sensor1.csv (with corresponding data)
  2. sensor2.csv (with corresponding data)
  3. Some green text is written to the PowerShell host, indicating which sensor is currently detected

Script

The contents of the script should appear as follows. Save the script file to a folder, such as c:\test\test.ps1, and then execute it.

# Declare text as a PowerShell here-string
$Text = @"
"readerinfo","onlylistedonce"
"downloadinfo",YYYY/MM/DD 00:00:00
"timezone",-8
"headerstuff","headersuff"

"sensor1","sensorstuff"
"serial#","0000001"
"about15lines","ofthisstuff"
"header1","header2"
datetime,tempfromsensor1
datetime,tempfromsensor1
datetime,tempfromsensor1

"sensor2","sensorstuff"
"serial#","0000002"
"about15lines","ofthisstuff"
"header1","header2"
datetime,tempfromsensor2
datetime,tempfromsensor2
datetime,tempfromsensor2
"downloadcomplete"
"@ -split '\r?\n';

# Declare the list of sensor names
$SensorList = @('sensor1', 'sensor2');
$CurrentSensor = $null;

# WARNING: Clean up all CSV files in the same directory as the script
Remove-Item -Path $PSScriptRoot\*.csv;

# Iterate over each line in the text file
foreach ($Line in $Text) {
    #region Line matches double quote
    if ($Line -match '"') {
        # Parse the property/value pairs (where double quotes are present)
        if ($Line -match '"(.*?)",("(?<value>.*)"|(?<value>.*))') {
            $Entry = [PSCustomObject]@{
                Property = $matches[1];
                Value = $matches['value'];
            };
            if ($matches[1] -in $SensorList) {
                $CurrentSensor = $matches[1];
                Write-Host -ForegroundColor Green -Object ('Current sensor is: {0}' -f $CurrentSensor);
            }
        }        
    }
    #endregion Line matches double quote
    #region Line does not match double quote
    else {
        # Parse the datetime/temp pairs (skipped until a sensor has been detected)
        if ($CurrentSensor -and $Line -match '(.*?),(.*)') {
            $Entry = [PSCustomObject]@{
                DateTime = $matches[1];
                Temp = $matches[2];
            };
            # Write the sensor's datetime/temp to its file
            Add-Content -Path ('{0}\{1}.csv' -f $PSScriptRoot, $CurrentSensor) -Value $Line;
        }
    }
    #endregion Line does not match double quote
}
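
The script above still hard-codes $SensorList, although the UPDATE note mentions a parameterized version. As a rough sketch only, the same logic could be wrapped in a function that takes the lines and the sensor names as parameters. The function name Split-SensorData and its parameter names are hypothetical and not part of the original script. It returns a hashtable instead of writing files, so it can be tried without touching disk:

```powershell
function Split-SensorData {
    # Hypothetical name and parameters, for illustration only
    param(
        [string[]] $Line,
        [string[]] $SensorList = @('sensor1', 'sensor2')
    )
    $Result = @{}
    $CurrentSensor = $null
    foreach ($Item in $Line) {
        if ($Item -match '^"(.*?)"') {
            # Quoted lines are header lines; switch sensors when the first field is a known name
            if ($matches[1] -in $SensorList) {
                $CurrentSensor = $matches[1]
                $Result[$CurrentSensor] = @()
            }
        }
        elseif ($CurrentSensor -and $Item -match '.+,.+') {
            # Unquoted datetime,temp rows belong to the most recently seen sensor
            $Result[$CurrentSensor] += $Item
        }
    }
    $Result
}

# Usage with a shortened, made-up sample in the same shape as the file
$Sample = '"sensor1","sensorstuff"', 'datetime,temp1', '', '"sensor2","sensorstuff"', 'datetime,temp2', '"downloadcomplete"'
$BySensor = Split-SensorData -Line $Sample -SensorList 'sensor1', 'sensor2'
$BySensor['sensor1']
```

Each key of the returned hashtable could then be passed to Set-Content to write one file per sensor.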

Using the data sample you provided, the script at the end of this answer would produce the following output:

C:\sensoroutput_20140204.csv

sensor1,datetime,temp
sensor1,datetime,temp
sensor1,datetime,temp
sensor2,datetime,temp
sensor2,datetime,temp
sensor2,datetime,temp
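
Once the rows are in the sensor,datetime,temp shape shown above, the hi/lo/avg stats mentioned in the question could be computed along these lines. This is a minimal sketch: the timestamps and temperature values are made up, the column names Sensor, DateTime and Temp are assumptions, and the rows are supplied in-line rather than read from the output file.

```powershell
# Made-up rows in the sensor,datetime,temp shape; real data would come from the output file
$rows = @(
    'sensor1,2014/02/04 00:00:00,20.5'
    'sensor1,2014/02/04 01:00:00,22.5'
    'sensor2,2014/02/04 00:00:00,18.0'
    'sensor2,2014/02/04 01:00:00,19.0'
) | ConvertFrom-Csv -Header Sensor, DateTime, Temp

$stats = $rows | Group-Object -Property Sensor | ForEach-Object {
    # Cast to [double] because ConvertFrom-Csv returns every field as a string
    $m = $_.Group | ForEach-Object { [double]$_.Temp } |
        Measure-Object -Minimum -Maximum -Average
    [PSCustomObject]@{
        Sensor = $_.Name
        Low    = $m.Minimum
        High   = $m.Maximum
        Avg    = $m.Average
    }
}
$stats | Format-Table -AutoSize
```

For a real file, Import-Csv with the same -Header argument could replace the in-line array.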

I believe this is what you are looking for. The key assumption here concerns the newline characters. The get-content line reads the data and breaks it into "sets" by using two consecutive newline characters as the delimiter to split on. I chose to use the environment's (Windows) newline sequence. Your source file may use different newline characters. You can use Notepad++ to see which characters they are, e.g. \r\n, \n, etc.

$newline = [Environment]::NewLine
$srcfile = "C:\sensordata.log"
$dstpath = 'C:\sensoroutput_{0}.csv' -f (get-date -f 'yyyyMMdd')

# Reads the file in chunks using -delimiter:
# each chunk ends where two consecutive new lines occur
$datasets = get-content $srcfile -delimiter ($newline * 2)

foreach ($ds in $datasets) {
  $lines = ($ds -split $newline)                   # Split dataset into lines
  $setname = $lines[0] -replace '\"(\w+).*', '$1'  # Get the set or sensor name
  $lines | % {
    if ($_ -and $_ -notmatch '"') {                # No empty lines and no lines with quotes
      $data = ($setname, ',', $_ -join '')         # Concats set name, datetime, and temp
      Out-File -filepath $dstpath -inputObject $data -encoding 'ascii' -append
    }
  }
}
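
If the line endings of the source file are not known in advance, the splitting step above could be made tolerant of both \r\n and \n by using a regular expression instead of a fixed delimiter. The following is a sketch under that assumption. The sample text is hypothetical and declared in-line; for a real file, the whole content could be read as one string with Get-Content -Raw (PowerShell 3.0 or later).

```powershell
# Hypothetical sample; the two sets are separated by one blank line
$raw = @(
    '"sensor1","sensorstuff"'
    'datetime,temp1'
    ''
    '"sensor2","sensorstuff"'
    'datetime,temp2'
) -join "`r`n"

# Split on two or more consecutive line breaks, accepting \r\n or \n
$datasets = $raw -split '(?:\r?\n){2,}'

$out = foreach ($ds in $datasets) {
    $lines   = $ds -split '\r?\n'                     # Split dataset into lines
    $setname = $lines[0] -replace '^"(\w+).*', '$1'   # Get the set or sensor name
    # Keep only non-empty lines without quotes and prefix them with the set name
    $lines | Where-Object { $_ -and $_ -notmatch '"' } | ForEach-Object { "$setname,$_" }
}
$out
```

The non-capturing group (?:...) matters here, because -split would otherwise include the captured line breaks in its output.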
