
Perl script to search and replace multiple lines in multiple html files

I have many html files in a folder. I need to somehow remove a <div id="user-info" ...>...</div> from all of them. As far as I know I need to use a Perl script for that, but I don't know Perl well enough to write one. Could someone help me with it?

Here is what the "bad" code looks like:

<div id="user-info" class="logged-in">
    <a class="icon icon-key-delete" href="https://test.dev/login.php?0,logout=1">Log Out</a>
    <a class="icon icon-user-edit" href="https://test.dev/control.php">Control Center</a>


</div> <!-- end of div id=user-info -->

Thank you in advance!

Using XML::XSH2:

for { glob '*.html' } {
    open :F html (.) ;
    delete //div[@id="user-info" and @class="logged-in"] ;
    save :b ;
}

perl -0777 -i.withdiv -pe 's{<div[^>]+?id="user-info"[^>]*>.*?</div>}{}gsmi;' test.html

-0777 means split on nothing, so slurp in the whole file (instead of line by line, the default for -p).

-i.withdiv means alter files in place, leaving the original with the extension .withdiv (the default for -p is to just print).

-p means pass line by line (except we are slurping) to the passed code (see -e).

-e expects code to run.

See man perlrun or perldoc perlrun for more info.

Here's another solution, which will be slightly more familiar to people who know jQuery, as the syntax is similar. This uses Mojolicious' ojo module to load the HTML content into a Mojo::DOM object, transform it, and then print the transformed version:

perl -Mojo -MFile::Slurp -E 'for (@ARGV) { say x(scalar(read_file $_))->at("#user-info")->replace("")->root; }' test.html test2.html test*.html

To replace content directly:

perl -Mojo -MFile::Slurp -E 'for (@ARGV) { write_file( $_, x(scalar(read_file $_))->at("#user-info")->replace("")->root ); }' test.html

Note, this won't JUST remove the div; it will also re-write the content based on Mojolicious' Mojo::DOM module, so tag attributes may not be in the same order. Specifically, I saw <div id="user-info2" class="logged-in"> rewritten as <div class="logged-in" id="user-info2">.

Mojolicious requires at least perl 5.10, but beyond that it has no non-core requirements.
