I wrote a regex to fetch a string from HTML, but it seems the multiline flag doesn't work.
This is my pattern; I want to get the text in the h1 tag.
var pattern= /<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/mi
m = html.search(pattern);
return m[1];
I created a string to test it. When the string contains "\n", the result is always null. If I remove all the "\n"s, it gives me the right result, with or without the /m flag.
What's wrong with my regex?
You are looking for the /.../s modifier, also known as the dotall modifier. It forces the dot . to also match newlines, which it does not do by default.
The bad news is that it does not exist in JavaScript (it does as of ES2018, see below). The good news is that you can work around it by using a character class (e.g. \s) and its negation (\S) together, like this:
[\s\S]
So in your case the regex would become:
/<div class="box-content-5">[\s\S]*<h1>([^<]+?)<\/h1>/i
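As a rough check of the workaround, the sketch below runs both patterns against a small multi-line string. The html value is made up for illustration, and the code uses match() rather than search(), because search() returns an index instead of the captured groups.

```javascript
// Hypothetical test input; the real page markup may differ.
const html = [
  '<div class="box-content-5">',
  '  <p>intro</p>',
  '  <h1>Hello world</h1>',
  '</div>'
].join('\n');

// Original pattern: "." stops at the first newline, so nothing matches.
const dotPattern = /<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/mi;
// Workaround: [\s\S] matches any character, newlines included.
const anyCharPattern = /<div class="box-content-5">[\s\S]*<h1>([^<]+?)<\/h1>/i;

const dotResult = html.match(dotPattern);          // no match
const anyCharResult = html.match(anyCharPattern);  // match array with one group
const title = anyCharResult ? anyCharResult[1] : null;

console.log(dotResult, title);
```

Guarding against a null result before reading index 1 avoids a TypeError when the markup is not what the pattern expects.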
As of ES2018, JavaScript supports the s (dotAll) flag, so in a modern environment your regular expression could be as you wrote it, but with an s flag at the end (rather than m; m changes how ^ and $ work, not .):
/<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/is
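A minimal sketch of that pattern in use, assuming an engine with ES2018 support (a recent Node.js or browser). The page string and variable names are invented for the example.

```javascript
// Hypothetical input; requires an engine that supports the s (dotAll) flag.
const page = '<div class="box-content-5">\n<p>intro</p>\n<h1>Title</h1>\n</div>';

// Same pattern as in the question, with s instead of m.
const withDotAll = /<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/is;

const found = withDotAll.exec(page);
const heading = found ? found[1] : null;

console.log(heading);
```

Because .* is greedy, a page with several h1 elements would yield the last one; writing .*? instead would yield the first.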
You want the s (dotall) modifier, which apparently doesn't exist in JavaScript - you can replace . with [\s\S] as suggested by @molf. The m (multiline) modifier makes ^ and $ match lines rather than the whole string.
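The difference can be seen in a short sketch; the sample text is arbitrary.

```javascript
const text = 'first line\nsecond line';

// Without m, ^ only matches at the start of the whole string.
const withoutM = /^second/.test(text);

// With m, ^ also matches right after a line break.
const withM = /^second/m.test(text);

// m does nothing for the dot: it still refuses to cross the newline.
const dotAcrossLines = /first.*second/m.test(text);

console.log(withoutM, withM, dotAcrossLines);
```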
[\s\S] did not work for me in Node.js 6.11.3. The RegExp documentation says to use [^], which does work for me.
(The dot, the decimal point) matches any single character except line terminators: \n, \r, \u2028 or \u2029.
Inside a character set, the dot loses its special meaning and matches a literal dot.
Note that the m multiline flag doesn't change the dot behavior. So to match a pattern across multiple lines, the character set [^] can be used (unless you need to support an old version of IE, of course); it matches any character, including newlines.
For example:
/This is on line 1[^]*?This is on line 3/m
where the *? is the non-greedy grab of 0 or more occurrences of [^].
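To illustrate the lazy versus greedy behavior, here is a small sketch with made-up input in which the closing text appears twice. Note that [^] as "any character" is specific to JavaScript; other regex flavors may treat it differently.

```javascript
// Hypothetical input: the end marker occurs on line 3 and again on line 5.
const log = 'This is on line 1\nline 2\nThis is on line 3\nline 4\nThis is on line 3';

// Lazy: stops at the first "This is on line 3".
const lazyMatch = log.match(/This is on line 1[^]*?This is on line 3/);
// Greedy: runs on to the last "This is on line 3".
const greedyMatch = log.match(/This is on line 1[^]*This is on line 3/);

const lazy = lazyMatch ? lazyMatch[0] : null;
const greedy = greedyMatch ? greedyMatch[0] : null;

console.log(JSON.stringify(lazy));
console.log(JSON.stringify(greedy));
```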
The dotall modifier actually made it into JavaScript in June 2018, as part of ECMAScript 2018.
https://github.com/tc39/proposal-regexp-dotall-flag
const re = /foo.bar/s; // Or, `const re = new RegExp('foo.bar', 's');`.
re.test('foo\nbar');
// → true
re.dotAll
// → true
re.flags
// → 's'
My suggestion is to split the multi-line string on "\n" and concatenate the pieces into a single line, which is easier to match against.
<textarea class="form-control" name="Body" rows="12" data-rule="required"
title='@("Your feedback ".Label())'
placeholder='@("Your Feedback here!".Label())' data-val-required='@("Feedback is required".Label())'
pattern="^[0-9a-zA-Z ,;/?.\s_-]{3,600}$" data-val="true" required></textarea>
$( document ).ready( function() {
var errorMessage = "Please match the requested format.";
var firstVisit = false;
$( this ).find( "textarea" ).on( "input change propertychange", function() {
var pattern = $(this).attr( "pattern" );
var element = $( this );
if(typeof pattern !== typeof undefined && pattern !== false)
{
// Strip any leading ^ and trailing $ so the anchors are not doubled below.
var ptr = pattern.replace(/^\^|\$$/g, '');
var patternRegex = new RegExp('^' + ptr + '$', 'gm');
var ks = "";
$.each($( this ).val().split("\n"), function( index, value ){
console.log(index + "-" + value);
ks += " " + value;
});
//console.log(ks);
var hasError = !ks.match( patternRegex );
//debugger;
if ( typeof this.setCustomValidity === "function")
{
this.setCustomValidity( hasError ? errorMessage : "" );
}
else
{
$( this ).toggleClass( "invalid", !!hasError );
$( this ).toggleClass( "valid", !hasError );
if ( hasError )
{
$( this ).attr( "title", errorMessage );
}
else
{
$( this ).removeAttr( "title" );
}
}
}
});
});
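Stripped of the jQuery and DOM handling, the split-and-join idea above might look like the sketch below. The function name matchesAsSingleLine and the sample inputs are hypothetical; only the pattern string is taken from the textarea markup. Unlike the handler above, it joins the lines with a single space rather than prefixing each one.

```javascript
// Hypothetical helper, not part of the code above.
function matchesAsSingleLine(value, pattern) {
  // Drop a leading ^ and trailing $ so the anchors are not doubled.
  var body = pattern.replace(/^\^|\$$/g, '');
  var regex = new RegExp('^' + body + '$');
  // Join the lines with spaces so the pattern sees a single line.
  var singleLine = value.split('\n').join(' ');
  return regex.test(singleLine);
}

// Pattern copied from the textarea's pattern attribute.
var feedbackPattern = '^[0-9a-zA-Z ,;/?.\\s_-]{3,600}$';

var ok = matchesAsSingleLine('Great site\nThanks, really helpful', feedbackPattern);
var bad = matchesAsSingleLine('<script>\nalert(1)', feedbackPattern);

console.log(ok, bad);
```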