Converting log data into a CSV file with a desired format


I have log data like this:

Name:Mark
City:London
Country:UK

Name:Ben
City:Paris
Country:France

Name:Tom
City:Athens
Country:Greece

And I need to produce CSV output in this format:

Name   City      Country
Mark   London    UK
Ben    Paris     France
Tom    Athens    Greece

The batch file I created for this is a simple one that converts to CSV. It is as follows:

@echo off

cd /d %~dp0
set infilenm=abc.log
set outfilenm=abc.csv
set beforestr=
set afterstr=, 

type nul >%outfilenm%

setlocal enabledelayedexpansion

for /f "tokens=1,2,3 delims=" %%A in (%infilenm%) do (   
    set line=%%A      
    echo !line:%beforestr%=%afterstr%!>>%outfilenm%
)
endlocal

exit /b

As I am very new to batch scripting, can anybody help me out with this?

The logic of your script is wrong: for /F reads one line after another, so you have to collect the data of three lines before writing one output line.

Here is an example of how to accomplish your task, not using for /F but input redirection (<) and set /P to read the log file:

@echo off
setlocal EnableDelayedExpansion
for /F %%C in ('^< "abc.log" find /C /V ""') do set /A "COUNT=(%%C+1)/2"
set "FIRST=#"
< "abc.log" > "abc.csv" (
    for /L %%I in (1,1,%COUNT%) do (
        set "LINE1=" & set /P LINE1=""
        if defined LINE1 (
            set "LINE2=" & set /P LINE2=""
            set "LINE3=" & set /P LINE3=""
            if defined FIRST (
                echo Name,City,Country
                set "FIRST="
            )
            echo(!LINE1:*:=!,!LINE2:*:=!,!LINE3:*:=!
        )
    )
)
endlocal

This relies on the shown format of your log file, so it does not verify the strings to the left of the colons.
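To make the "collect three lines, then write one row" idea easier to follow, here is a minimal sketch of the same logic in Python rather than batch. It is not part of the answer above; the function name rows_from_fixed_blocks and the inline sample text are made up for illustration. Like the batch script, it trusts the line order and never checks the text to the left of the colon.

```python
import csv
import io

def rows_from_fixed_blocks(text):
    """Yield one [name, city, country] row per group of three 'key:value' lines."""
    # Lines without a colon (such as the blank separators) are ignored, so only
    # the order of the remaining lines decides which column a value lands in.
    values = [line.split(":", 1)[1] for line in text.splitlines() if ":" in line]
    # Step through the values three at a time; an incomplete tail is dropped.
    for i in range(0, len(values) - 2, 3):
        yield values[i:i + 3]

# Hypothetical sample input in the format shown in the question.
log = "Name:Mark\nCity:London\nCountry:UK\n\nName:Ben\nCity:Paris\nCountry:France\n"

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["Name", "City", "Country"])   # header row, written once
writer.writerows(rows_from_fixed_blocks(log))
print(out.getvalue(), end="")
```

Because the values are assigned purely by position, a block with a missing line would shift every following value into the wrong column, which is the limitation noted above.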


Here is a more flexible approach based on the one above; it collects the field values by their names, which are held in a predefined, configurable list (constant _LIST). One or more empty lines complete a returned row. If a certain field name cannot be found in the currently processed block of the log file, its returned CSV field is empty. This is the code:

@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem // Define constants here:
set "_INPUT=abc.log"  & rem // (log file to process)
set "_OUTPUT=abc.csv" & rem // (CSV file to return)
set "_LIST=Name,City,Country" & rem /* (comma-separated list of field names, which must
                                rem     not contain any of the following characters:
                                rem     `:`, `,`, `*`, `?`, `<`, `>`, `!`, `"`, `=`) */
set "_SEPARATOR=,"    & rem /* (separator character to be used; the default is `,`;
                        rem     the following separator characters are forbidden:
                        rem     `!`, `^`, `&`, `(`, `)`, `<`, `>`, `|`) */
set "_QUOTED=#"       & rem // (if not empty, defines to quote the returned items)
set "_HEADER=#"       & rem // (if not empty, defines to write a header row)

set "_SEPARATOR=!_SEPARATOR!," & set "_SEPARATOR=!_SEPARATOR:~,1!"
if not defined _QUOTED (set "QUOTE=") else set "QUOTE="^" & rem/^"
for /F "delims==" %%D in ('2^> nul set $ARRAY[') do set "%%D="
for /F %%C in ('^< "%_INPUT%" find /C /V ""') do set /A "COUNT=%%C+1"
< "%_INPUT%" > "%_OUTPUT%" (
    set "FLAG=" & if defined _HEADER if defined _LIST (
        echo(%QUOTE%!_LIST:,=%QUOTE%%_SEPARATOR%%QUOTE%!%QUOTE%
    ) else echo(%QUOTE%%QUOTE%
    for /L %%I in (1,1,%COUNT%) do (
        set "LINE=" & set /P LINE=""
        if defined LINE (
            for /F "delims=: eol=:" %%J in ("!LINE!") do set "$ARRAY[%%J]=!LINE:*:=!"
            set "FLAG=#"
        ) else (
            if defined FLAG if defined _LIST (
                set "COLL=" & for %%J in ("!_LIST:,=","!") do (
                    set "COLL=!COLL!%_SEPARATOR%%QUOTE%!$ARRAY[%%~J]!%QUOTE%"
                    set "$ARRAY[%%~J]="
                )
                echo(!COLL:~1!
            ) else echo(%QUOTE%%QUOTE%
            set "FLAG="
        )
    )
)
endlocal
exit /B

This script collects the list items in a kind of array, $ARRAY[], whose indexes are the field names, that is, the strings to the left of the (first) colon of every line in a block of the log file, and whose element values are the strings to the right of the (first) colon. With respect to the first block of your example log data, it may look like this:

$ARRAY[Name]=Mark
$ARRAY[City]=London
$ARRAY[Country]=UK

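The name-keyed collection described above can be sketched with a dictionary. This is a Python illustration only, not a translation of the batch script; FIELDS stands in for _LIST, and collect_rows and sample are hypothetical names. Empty lines end a block, and a field that is missing from a block comes out as an empty string.

```python
FIELDS = ["Name", "City", "Country"]  # plays the role of _LIST (hypothetical name)

def collect_rows(lines, fields=FIELDS):
    """Return one list of values per block; fields missing from a block become ''."""
    rows, record = [], {}
    for line in list(lines) + [""]:          # the extra "" flushes the last block
        line = line.rstrip("\r\n")
        if line:
            key, _, value = line.partition(":")   # split at the first colon only
            record[key] = value                   # like $ARRAY[key]=value
        elif record:                              # one or more empty lines end a block
            rows.append([record.get(name, "") for name in fields])
            record = {}                           # start the next block empty
    return rows

# Hypothetical sample: two empty lines between blocks, and Ben has no City.
sample = ["Name:Mark", "City:London", "Country:UK", "", "", "Name:Ben", "Country:France"]
for row in collect_rows(sample):
    print(",".join('"%s"' % value for value in row))
```

Looking values up by name rather than by position is what lets the rows stay aligned when a block omits a field or lists its fields in a different order.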
@echo off
setlocal

set "output=abc.csv"
2> "%output%" echo.

set "line=Name,City,Country"
call :write

for /f "tokens=1,* delims=:" %%A in (abc.log) do call :append %%A %%B
exit /b

:append
setlocal
set  "key=%~1"
set  "value=%~2"
endlocal & (
    if /i "%key%" == "Name" set "line=%value%"
    if /i "%key%" == "City" set "line=%line%,%value%"
    if /i "%key%" == "Country" set "line=%line%,%value%"& call :write
)
exit /b

:write
setlocal
for /f "tokens=1-3 delims=," %%A in ("%line%") do (
    set "a=%%~A          "
    set "b=%%~B          "
    set "c=%%~C          "
)
>> "%output%" echo %a:~,10% %b:~,10% %c:~,10%
set "line="
exit /b

The header is written to the file first: it is assigned to the variable named line, and then the label :write is called to format it and write it to the CSV output file.

The for loop splits each line at : with tokens 1,* to get the first token before the : and the second token as the remainder after the :. It calls the label :append to concatenate the line based on the first token. If the token equals Country, then a call to the label :write formats the line and writes it to the CSV output file.
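For readers who find the call/label flow hard to trace, the same steps can be sketched in Python. This is an illustration under assumptions, not the batch code itself: format_row and convert are hypothetical names, and partition only roughly mirrors tokens=1,* delims=: for simple key:value lines.

```python
WIDTH = 10  # column width, matching the 10-character substrings used in :write

def format_row(cells, width=WIDTH):
    """Pad or truncate each cell to `width` characters and join with one space."""
    return " ".join(cell.ljust(width)[:width] for cell in cells)

def convert(lines):
    """Build the header plus one formatted row per Name/City/Country block."""
    out = [format_row(["Name", "City", "Country"])]
    current = []
    for line in lines:
        key, sep, value = line.partition(":")   # first token, then the remainder
        if not sep:
            continue                             # blank or colon-less line
        key = key.lower()                        # if /i compares case-insensitively
        if key == "name":
            current = [value]                    # a Name line starts a new row
        elif key in ("city", "country"):
            current.append(value)
        if key == "country":
            out.append(format_row(current))      # Country completes the row
    return out

for text in convert(["Name:Mark", "City:London", "Country:UK"]):
    print(text)
```

As in the batch version, a row is only emitted when a Country line arrives, so a block without one produces no output.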

Your question is unclear on several points, so we can only guess...

@echo off
setlocal EnableDelayedExpansion

rem Put here the width of the output columns
set "width=10"

set "spaces="
for /L %%i in (1,1,%width%) do set "spaces= !spaces!"
set "head=" & set "out=" & set "output="
for /F "tokens=1-3 delims=:" %%a in ('findstr /N "^" logData.txt') do (
   if "%%b" neq "" (
      if not defined output (
         set "col=%%b%spaces%"
         set "head=!head!!col:~0,%width%!"
         set "out=!out!^!%%b:~0,%width%^!"
      )
      set "%%b=%%c%spaces%"
   ) else (
      if not defined output (
         echo !head!
         set "output=!out!"
      )
      for /F %%o in ("!output!") do echo %%o
      for %%a in (!head!) do set "%%a=%spaces%"
   )
)

With this logData.txt:

Name:Mark
City:London
Country:UK

Name:Ben
Country:France

Name:Tom
City:Athens

Name:Antonio
City:Mexico
Country:Mexico

This is the output:

Name      City      Country
Mark      London    UK
Ben                 France
Tom       Athens
Antonio   Mexico    Mexico

This program requires that the first group of data include all the columns, and that the last group of data be followed by an empty line...

A PowerShell solution which doesn't care about the number of address properties.
The only constants it requires are an empty line separating the addresses and
a colon between property and value (property:value).

If needed, it could be invoked from a batch file (to be more on topic).

  • It uses regular expressions to split the input into sections (addresses) and
    each section into lines, and it splits each line into property and value.
  • It inserts the properties with their values into each new address.
  • The adjusting of missing properties in the resulting table is done automagically by PowerShell.
  • The result is displayed as a table with column widths autodetected by Format-Table.

## Q:\Test\2018\07\04\SO_51166380.ps1
$InputFile = '.\abc.log'
$OutputFile= '.\abc.csv'

$Sections = ((Get-Content $InputFile -Raw) -split "`r?`n *`r?`n" -ne '')

$Csv = ForEach($Section in $Sections){
    $Address = New-Object PSCustomObject
    ForEach($PropVal in ($Section -Split "`r?`n" -ne '')){
        $Prop,$Val = $PropVal.Split(':',2)
        Add-Member -InputObject $Address `
                   -NotePropertyName $Prop `
                   -NotePropertyValue $Val
    }
    $Address
}
$Csv | Format-Table -Auto
$Csv | Export-Csv $OutputFile -NoTypeInformation

Sample output with modified abc.log:

> type abc.log
Name:Mark
City:London
Country:UK
LastName:Anonymus

Name:Ben
Country:France

Name:Tom
City:Athens

Name:Antonio
City:Mexico
Country:Mexico

> .\SO_51166380.ps1

Name    City   Country LastName
----    ----   ------- --------
Mark    London UK      Anonymus
Ben            France
Tom     Athens
Antonio Mexico Mexico

> type .\abc.csv
"Name","City","Country","LastName"
"Mark","London","UK","Anonymus"
"Ben",,"France",
"Tom","Athens",,
"Antonio","Mexico","Mexico",

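The section and property splitting described in this answer can also be sketched in Python. This is an illustration only, not the PowerShell script: parse_sections, to_csv and the inline sample are hypothetical. One deliberate difference is that the sketch builds its column list from the union of all property names, in first-seen order, so it does not depend on any single section containing every property.

```python
import csv
import io
import re

def parse_sections(text):
    """Split text into blank-line-separated sections of 'property:value' lines."""
    records = []
    for section in re.split(r"\r?\n[ \t]*\r?\n", text.strip()):
        record = {}
        for line in section.splitlines():
            prop, sep, value = line.partition(":")   # split at the first colon
            if sep:
                record[prop] = value
        if record:
            records.append(record)
    return records

def to_csv(records):
    """Write records as CSV; columns are all property names in first-seen order."""
    columns = []
    for record in records:
        for prop in record:
            if prop not in columns:
                columns.append(prop)
    out = io.StringIO()
    # restval fills in the cell for any property a record does not have.
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

# Hypothetical sample: the second section lacks City and LastName.
log = "Name:Mark\nCity:London\nCountry:UK\nLastName:Anonymus\n\nName:Ben\nCountry:France\n"
print(to_csv(parse_sections(log)), end="")
```

Unlike the Export-Csv output shown above, csv.DictWriter only quotes values that need it by default; passing quoting=csv.QUOTE_ALL would quote every field.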