Regex New Line No Match
I'm trying to screen scrape some HTML and am having trouble matching across a new line (in .NET).
This is the text:
<td class=abc><span class=label>XXX</span></td>
<td class=def><span class=field>YYY</span></td>
I'm trying to match YYY with this pattern:
<td class=abc><span class=label>XXX</span></td>\n<td class=def><span class=field>(.*)</span></td>
I have \n separating the lines, but it doesn't match. Any ideas?
[EDIT]
Added \r\n instead of just \n and it worked.
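You can supply the multi-line modifier m for your regex. In VB.NET this is passed as an option to the Regex constructor. Be aware that it only changes how ^ and $ behave (they match at each line break), so on its own it does not make \n match a Windows line ending. If your regex is written between forward-slash delimiters, as in Perl or JavaScript, you also need to escape all forward slashes using a backslash; .NET does not require this, but the escapes are harmless there: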
<td class=abc><span class=label>XXX<\/span><\/td>\n<td class=def><span class=field>(.*)<\/span><\/td>
Please note, though, that regex is a very poor way to parse HTML. There are HTML parsers in most languages that do a much better job. And your regex is very detailed and, therefore, brittle; an additional space would cause it to fail.
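To illustrate the parser suggestion, here is a minimal sketch using Python's standard `html.parser` module rather than a .NET library (an assumption made only so the example is self-contained). The names `FieldExtractor` and `extract_fields` are hypothetical, and the sample adds extra whitespace to show that the parser tolerates it where the literal regex would not.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text of every <span class=field> element (no nested spans)."""

    def __init__(self):
        super().__init__()
        self.in_field = False
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; unquoted values are handled too.
        if tag == "span" and dict(attrs).get("class") == "field":
            self.in_field = True
            self.fields.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_field = False

    def handle_data(self, data):
        if self.in_field:
            self.fields[-1] += data


def extract_fields(html):
    """Return the text content of each span whose class is 'field'."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return parser.fields


# Sample from the question, with extra spaces and a Windows line ending added.
html = (
    "<td class=abc><span class=label>XXX</span></td>\r\n"
    "<td  class=def>  <span class=field>YYY</span></td>"
)
print(extract_fields(html))
```

Because the extraction keys on the element and its class instead of the exact surrounding text, neither the line ending nor the added spaces matter here.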
Note that in Windows newlines are typically created with a carriage-return and newline combination \r\n.
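The effect of the line ending can be checked directly. The sketch below uses Python's `re` module as a stand-in for the .NET engine (an assumption; the two engines treat `\r` and `\n` in a pattern the same way as far as this example goes). The helper `match_field` is hypothetical. It tries the pattern from the question with different line-break sub-patterns, including `\r?\n`, which accepts both Windows and Unix line endings.

```python
import re

# The two rows from the question; only the line ending between them varies.
ROW_1 = "<td class=abc><span class=label>XXX</span></td>"
ROW_2 = "<td class=def><span class=field>YYY</span></td>"

# The pattern from the question, split around the line break.
PREFIX = r"<td class=abc><span class=label>XXX</span></td>"
SUFFIX = r"<td class=def><span class=field>(.*)</span></td>"


def match_field(text, line_break):
    """Return the captured field text, or None if the pattern does not match."""
    m = re.search(PREFIX + line_break + SUFFIX, text)
    return m.group(1) if m else None


windows_text = ROW_1 + "\r\n" + ROW_2
unix_text = ROW_1 + "\n" + ROW_2

print(match_field(windows_text, r"\n"))     # no match: the \r is in the way
print(match_field(windows_text, r"\r\n"))   # matches the Windows line ending
print(match_field(windows_text, r"\r?\n"))  # matches either style
print(match_field(unix_text, r"\r?\n"))
```

Using `\r?\n` avoids depending on where the HTML was produced, which is one less way for a pattern like this to break.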
Here is an example supplying the Multiline option:
' Requires Imports System.Text.RegularExpressions; Multiline changes how ^ and $ match, not \n.
Dim rex As New Regex("\bsomething\b", RegexOptions.Multiline)
Regex Options: MSDN
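It may help to see what a multiline flag does and does not change. The sketch below uses Python's `re.MULTILINE` as a stand-in for `RegexOptions.Multiline` (an assumption; both flags alter the `^` and `$` anchors only). The sample text and variable names are hypothetical.

```python
import re

text = "first line\r\nsecond line"

# Without the flag, ^ matches only at the very start of the string.
plain_starts = re.findall(r"^\w+", text)
# With the flag, ^ also matches right after each \n.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# The flag does not change what \n matches: the \r before it still blocks the match.
lf_only = re.search(r"line\nsecond", text, re.MULTILINE)
crlf = re.search(r"line\r\nsecond", text, re.MULTILINE)

print(plain_starts)
print(multiline_starts)
print(lf_only is None, crlf is not None)
```

In this sketch the flag changes where `^` can match, while the literal `\n` in the pattern still fails until the `\r` is accounted for, which is consistent with the asker's edit.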
Here is my Perl code. In the if condition, the \n newline character in the pattern does not match:
#!/usr/bin/perl
use strict;
use warnings;
use Cwd;
use File::Basename;
use File::Copy;

# Collect every .xml file in the current directory.
my $path = getcwd;
opendir(my $dir, $path) or die "Cannot open directory $path: $!";
my @out = grep { /\.xml$/ } readdir($dir);
closedir($dir);

open(my $log, '>', 'Log.txt') or die "Cannot open Log.txt: $!";
foreach my $f1 (@out)
{
    open(my $fh, '<', "$path/$f1") or die "Cannot open file $f1: $!";
    my $data1 = join("", <$fh>);
    close($fh);
    # One array element per line; split removes the \n separators.
    my @FILE_KA_ARRAY = split(/\n/, $data1);
    my $file_ka_len = @FILE_KA_ARRAY;
    print $log $f1."\n";
    # Starts at index 1, so the first line of each file is skipped.
    for (my $x = 1; $x < $file_ka_len; $x++)
    {
        my $y = $x + 1;   # 1-based line number written to the log
        my $temp1 = $FILE_KA_ARRAY[$x];
        if ($temp1 =~ m#(<list .*? depth="(\d+)">)\n?(<list .*? depth="(\d+)">)#gs)
        {
            my $list3 = $1;
            print $log "\t\t\t\t\t\t\t\t".$y."\t\t".$list3."\n";
        }
    }
}
close($log);
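One likely reason the pattern never matches: the file contents are split on \n before the loop, so each string tested holds a single line and no newline, and two `<list>` tags on consecutive lines are never in the same string. The sketch below reproduces this in Python as a stand-in for the Perl code (an assumption), with `re.DOTALL` playing the role of the `/s` flag. The two-line sample is hypothetical, since the real XML files are not shown.

```python
import re

# Hypothetical sample: two <list> tags on consecutive lines.
data = '<list type="a" depth="1">\n<list type="b" depth="2">\n<item/>'

# Same pattern as in the Perl code above.
PAIR = re.compile(
    r'(<list .*? depth="(\d+)">)\n?(<list .*? depth="(\d+)">)',
    re.DOTALL,
)

# Line by line, as in the post: each element holds at most one <list> tag.
per_line = [PAIR.search(line) for line in data.split("\n")]
print(any(per_line))

# Against the whole contents, the \n between the tags is still present.
whole = PAIR.search(data)
print(whole.group(1), whole.group(2), whole.group(4))
```

If this is the cause, matching against the unsplit contents (`$data1` in the Perl code) would let the pattern see both tags. That is a suggestion based on the code shown, not something confirmed in the thread.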