Find and replace text in an unstructured CSV

I have a log file that uses ¬ as its delimiter.

073957.744 : Send:[8=FIX.4.4¬9=724¬35=AE¬49=FAUAT¬56=CALUAT¬34=82¬55=0000 AA BBC¬48=0000 AA BBC¬22=100¬38=17000.000000¬9998=Equity¬9999=CFD¬]
080655.776 : Send:[8=FIX.4.4¬9=631¬35=AE¬49=FAUAT¬56=CALUAT¬34=136¬55=NOVN VX CFD¬48=NOVN VX CFD¬22=100¬38=7500.000000¬]
081249.475 : Send:[8=FIX.4.4¬9=620¬35=AE¬49=FAUAT¬56=CALUAT¬34=148¬55=NOK1V FH CFD¬48=NOK1V FH CFD¬22=100¬38=50000.000000¬9896=False¬9893=1¬]
081806.623 : Send:[8=FIX.4.4¬9=583¬35=AE¬49=FAUAT¬56=CALUAT¬34=159¬55=IX17186393-0¬48=IX17186393-0¬22=110¬38=10.000000¬60=20131216-08:09:02¬64=20131219¬552=1¬54=1¬]

I use the following code to convert the file to CSV and strip the first 6 columns:

@echo off

rem fetch only the required messages from log file
findstr /r /i Send:\[.*35=AE.* %cd%\FixProvider_MsgLog_20131216_1.log > %cd%\FilteredFIXMessages.log

rem ensure the older temp file is not present
if exist %cd%\FIXTemp1.tmp del %cd%\FIXTemp1.tmp

rem convert the FilteredFIXMessages.log into csv and store it in temp1 file and strip temp1 file for the first 6 columns as they are not required for data matching
setlocal enabledelayedexpansion
for /f "tokens=1-6* delims=¬" %%a in (%cd%\FilteredFIXMessages.log) do set data=%%g & echo !data:¬=,! >> %cd%\FIXTemp1.tmp

exit /b

This gives me the following CSV:

55=0000 AA BBC,48=0000 AA BBC,22=100,38=17000.000000,9998=Equity,9999=CFD,]  
55=NOVN VX CFD,48=NOVN VX CFD,22=100,38=7500.000000,]  
55=NOK1V FH CFD,48=NOK1V FH CFD,22=100,38=50000.000000,9896=False,9893=1,]  
55=IX17186393-0,48=IX17186393-0,22=110,38=10.000000,60=20131216-08:09:02,64=20131219,552=1,54=1,]  

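As you can see, this is not a structured CSV: there are no fixed columns, and the column order may vary. I want to strip:

  1. Column 55=*, or any other column I choose (the data may be of variable length, but the column tags are static, e.g. 55=)
  2. The last column, ] (an empty column)

I could easily strip these with VBS, but since I am using a batch script I would like to stay with it and not install any additional tools. Please help.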

Here is a hybrid batch/VBScript script that will do it.

::Find and Replace
::Matt Williamson 
::5/30/2013

@echo off
setlocal

call :FindReplace "55=" "" in.txt
call :FindReplace ",]" "" in.txt

exit /b 

:FindReplace <findstr> <replstr> <file>
set tmp="%temp%\tmp.txt"
If not exist %temp%\_.vbs call :MakeReplace
for /f "tokens=*" %%a in ('dir "%3" /s /b /a-d /on') do (
  for /f "usebackq" %%b in (`Findstr /mic:"%~1" "%%a"`) do (
    echo(&Echo Replacing "%~1" with "%~2" in file %%~nxa
    <%%a cscript //nologo %temp%\_.vbs "%~1" "%~2">%tmp%
    if exist %tmp% move /Y %tmp% "%%~dpnxa">nul
  )
)
del %temp%\_.vbs
exit /b

:MakeReplace
>%temp%\_.vbs echo with Wscript
>>%temp%\_.vbs echo set args=.arguments
>>%temp%\_.vbs echo .StdOut.Write _
>>%temp%\_.vbs echo Replace(.StdIn.ReadAll,args(0),args(1),1,-1,1)
>>%temp%\_.vbs echo end with

@ECHO OFF
SETLOCAL
:: Parenthesise a statement-group with redirector sends all echoed text to file
(
 REM This is simply using your regex to feed the lines to FOR
 REM Tokenised - first 5 tokens are skipped, #6 to %%a, remainder of line to %%b
 FOR /f "tokens=6* delims=¬" %%a IN ('findstr /r /i "Send:\[.*35=AE.*" q21191380.txt') DO (
  REM set LINE to token7+(with delimiters) and clear NEWLINE
  SET line=%%b
  SET "newline="
  CALL :process
 )
)>newfile.txt
TYPE newfile.txt

GOTO :EOF

:process
:: Grab the first token in LINE to %%s, part after delimiter to %%t
:: Then set FIELD to "line=nexttoken" and LINE to remaining text
FOR /f "tokens=1*delims=¬" %%s IN ('set line') DO SET "field=%%s"&SET "line=%%t"
:: Remove the leading "line=" from LINE (5 characters)
SET "field=%field:~5%"
:: Vanilla FOR for quoted strings (which bypasses the special status of "=")
:: Set a work variable=FIELD and set string=(element from list - quotes)
FOR %%e IN ("55=" "]") DO SET "work=%field%"&SET "string=%%~e"&CALL :elim
:: ELIM will either clear FIELD or leave it untouched - build & separate
IF DEFINED field SET "newline=%newline%,%field%"
:: If there's any more left in LINE, repeat the process until LINE is empty
IF DEFINED line GOTO process
:: NEWLINE will start with a comma, so ECHO it minus the first character
IF DEFINED newline ECHO %newline:~1%
GOTO :eof

:elim
:: Does the first character of WORK = first of STRING?
IF NOT "%string:~0,1%"=="%work:~0,1%" GOTO :EOF
:: Yes - lop off the first character of both
SET "string=%string:~1%"
SET "work=%work:~1%"
:: If both are still defined, repeat
IF DEFINED string IF DEFINED work GOTO elim
:: If there's anything left to match in STRING, WORK ran out first - no match, so leave FIELD untouched
IF DEFINED string GOTO :EOF
:: STRING has been completely matched, so clear FIELD to drop it from output
SET "field="
GOTO :eof

Now that was an interesting exercise!

I've changed the file names to suit my system, but otherwise it should work for you.
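For comparison, the same field-by-field idea can be applied to the intermediate CSV from the question instead of the raw log. What follows is only a minimal, untested sketch under several assumptions: the input is the CSV shown in the question, saved as FIXTemp1.tmp in the current directory; Stripped.csv is a hypothetical output name; the tag variables are illustrative; and the data contains no "!" characters, which delayed expansion would swallow.

```bat
@echo off
setlocal enabledelayedexpansion

rem Assumed names, not from the original posts:
rem   FIXTemp1.tmp = CSV produced by the question's script
rem   Stripped.csv = hypothetical output file
rem tag is the column marker to drop; taglen is its length in characters.
set "tag=55="
set "taglen=3"

(
  for /f "usebackq delims=" %%L in ("FIXTemp1.tmp") do (
    set "rest=%%L"
    set "out="
    call :fields
  )
) > "Stripped.csv"
exit /b

:fields
rem Peel one comma-separated field off REST per pass.
if not defined rest goto emit
set "field="
for /f "tokens=1* delims=," %%f in ("!rest!") do (
  set "field=%%f"
  set "rest=%%g"
)
rem Nothing but delimiters left: stop rather than loop forever.
if not defined field goto emit
rem Drop the unwanted tag and the trailing "]" pseudo-column.
if "!field:~0,%taglen%!"=="%tag%" goto fields
if "!field:~0,1!"=="]" goto fields
rem Keep every other field, comma-separated.
if defined out (set "out=!out!,!field!") else set "out=!field!"
goto fields

:emit
if defined out echo(!out!
exit /b
```

To drop a different column, change tag and taglen together (for example 9998= and 5). The prefix test compares only the leading characters of each field, so the variable-length value after the tag does not matter.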
