Replace non-embedded double quotes in a specific tag of an XML file using a batch script

Question

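I have a CSV file with the data below. I only want to replace the non-embedded double-quote character (") with a blank inside the Comment tag. This tag can occur multiple times in a single record/row. I don't want to affect other tags or characters. The file is about 30 MB.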

ABCD ,
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\"  ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>"

I don't know batch scripting. I tried the following, but it didn't work.

@echo off

  for /f "delims=, tokens=2" %%A in (
    'findstr /r "<Comment>.*</Comment>" "D:\data.csv"'
  ) do (
    set code=%%A
    set code=!code:"=!
    echo(!code!
)

Answer 1

This should work for you:

@echo off
setlocal EnableExtensions EnableDelayedExpansion 

>D:\data_new.csv (
  for /f "tokens=*" %%A in (D:\data.csv) do (
    set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:"=!"
    echo(!code!
  )
)  
rem remove the rem in next line to overwrite original file
rem copy /Y D:\data_new.csv D:\data.csv
exit/B

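Or replace the line inside the loop with

set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:\"=\!"

so that only backslash-escaped quotes (\") are touched and any other quote is left alone.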
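
For readers who want to check the rule outside cmd.exe, here is a minimal Python sketch of the same line-based logic. It is not part of the original answer: the function name strip_escaped_quotes and its replacement parameter are hypothetical. It assumes, as the batch version does, that each inner comment sits on a single line beginning with <Comment>. Like the second batch variant, it turns \" into \ by default.

```python
def strip_escaped_quotes(lines, replacement="\\"):
    """Rewrite \\" on lines that start with <Comment>; pass other lines through.

    Mirrors the batch check `!code:~0,9!` EQU "<Comment>" (case-insensitive,
    leading whitespace ignored). Pass replacement="" to drop the backslash too.
    """
    result = []
    for line in lines:
        if line.lstrip()[:9].lower() == "<comment>":
            line = line.replace('\\"', replacement)
        result.append(line)
    return result


# Three lines taken from the sample data in the question.
sample = [
    '"<?xml version=\\"1.0\\" encoding=\\"UTF-8\\" standalone=\\"yes\\"?>',
    "<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>",
    "<Comment>measurements: 34,28,37 height 5'4\\\". ABC</Comment>",
]

for line in strip_escaped_quotes(sample):
    print(line)
```

The XML declaration line keeps its escaped quotes because it does not start with <Comment>. That is the whole point of the prefix test.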

Answer 2

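findstr is the wrong tool for parsing XML or CSV.

You have two complex formats here, and realistically, if you want a solution that isn't brittle, you need a CSV parser for the CSV and an XML parser for the XML.

However, the fact that you are trying to remove escaped quotes inside comments suggests you are doing something else nasty downstream that breaks because of quote parsing. I would suggest first reviewing what you are actually doing here, because this may be an XY problem.

Failing that, I would probably do something like this: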

#!/usr/bin/env perl
use strict;
use warnings;

use Text::ParseWords;
use XML::Twig;
use Data::Dumper;

sub fix_comment {
   my ( $twig, $comment ) = @_;


   my $text = $comment->text;
   $text =~ s/\"//g;
   $comment->set_text($text);

}

#extract quoted-comma separate things.

foreach my $entry (
   quotewords(
      ",", 0,
      do { local $/; <DATA> }
   )
  )
{

   if ( $entry =~ m/^\s*<\?xml/ms ) {
      $entry =~ s/^\s+//ms;

      #eval so we can fail gracefully if this doesn't work.
      my $twig = XML::Twig->new(
         pretty_print  => 'indented',
         twig_handlers => { 'Comment/Comment' => \&fix_comment }
      );
      eval { $twig->parse($entry) };
      if ($@) { warn $@ }
      else {
         $entry = $twig->sprint;
      }
   }
   print $entry;
}


__DATA__
DATA , " test ", 
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\"  ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>",

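This is certainly not perfect, because I am not entirely sure that line feeds are captured correctly. Text::CSV might be a more suitable way to tackle that part. Hard to say.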
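
To illustrate the "CSV parser for the CSV, XML parser for the XML" idea with a real CSV reader, here is a hedged sketch using only Python's standard library. It is not the answerer's code. The names clean_csv, clean_xml_field and local_name are hypothetical, and it assumes the file uses backslash-escaped quotes inside quoted fields and that the XML is UTF-8. Note that re-serialising the XML drops the <?xml ...?> declaration, which may or may not be acceptable for your consumer.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return an element tag without its {namespace} prefix."""
    return tag.rsplit("}", 1)[-1]


def clean_xml_field(text):
    """Remove double quotes from the text of Comment elements nested in a Comment."""
    root = ET.fromstring(text.strip().encode("utf-8"))
    if root.tag.startswith("{"):
        # Serialise the document's namespace as the default one (global setting).
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    for parent in root.iter():
        if local_name(parent.tag) != "Comment":
            continue
        for child in parent:
            if local_name(child.tag) == "Comment" and child.text:
                child.text = child.text.replace('"', "")
    return ET.tostring(root, encoding="unicode")


def clean_csv(src_text):
    """Parse the CSV, fix every field that looks like XML, and write it back."""
    options = {"escapechar": "\\", "doublequote": False}
    reader = csv.reader(io.StringIO(src_text), **options)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n", **options)
    for row in reader:
        cleaned = []
        for field in row:
            if field.lstrip().startswith("<?xml"):
                try:
                    field = clean_xml_field(field)
                except ET.ParseError:
                    pass  # leave fields that do not parse untouched
            cleaned.append(field)
        writer.writerow(cleaned)
    return out.getvalue()


# A shortened version of the record from the question.
sample = (
    'ABCD ,\n'
    '"<?xml version=\\"1.0\\" encoding=\\"UTF-8\\" standalone=\\"yes\\"?>\n'
    '<customerDetailsExtension xmlns=\\"http://asdfg.net\\">\n'
    '<Comments>\n'
    '<Comment><Date>2001-12-04</Date>\n'
    '<Comment>measurements: 34,28,37 height 5\'4\\". ABC</Comment>\n'
    '</Comment>\n'
    '</Comments>\n'
    '</customerDetailsExtension>"\n'
)

print(clean_csv(sample))
```

Because the CSV reader handles the multi-line quoted field itself, the line-feed worry above goes away. For a real 30 MB file you would open it with newline="" and may need to raise csv.field_size_limit if a single field is very large.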

Answer 1
0 2016-12-13 12:23:17

Answer 2
0 2016-12-14 09:45:47
