I have the following regex (long, I know):
(?-mix:((?-mix:(?-mix:\{\%).*?(?-mix:\%\})|(?-mix:\{\{).*?(?-mix:\}\}?))|(?-mix:\{\{|\{\%)))
that I'm using to split a string. It matches correctly in C#, but when I moved the code to Java, it doesn't match. Is there any particular feature of this regex that is C#-only?
The source is produced as:
String source = Pattern.quote("{% assign foo = values %}.{{ foo[0] }}.");
While in C# it's:
string source = @"{% assign foo = values %}.{{ foo[0] }}.";
The C# version is like this:
string[] split = Regex.Split(source, regex);
In Java I tried both:
String[] split = source.split(regex);
and also
Pattern p = Pattern.compile(regex);
String[] split = p.split(source);
Here is a sample program with your code: http://ideone.com/hk3uy
There is a major difference here between Java and .NET: Java's split does not add captured groups as tokens in the result array. That means all delimiters are dropped from the result in Java, whereas .NET includes them because your regex wraps the delimiter in a capturing group.
The only alternative I know of is not to use split, but to get a list of matches and split manually.
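To make the "split manually" idea concrete, here is a minimal sketch that walks the matches with Matcher.find() and emits both the text between delimiters and the delimiters themselves. The class and method names (SplitKeepingDelimiters, splitKeepingDelimiters) are made up for illustration, and I'm using a simplified form of the regex; it only roughly mimics what .NET's Regex.Split does with a capturing group, so check the edge cases you care about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitKeepingDelimiters {
    // Hypothetical helper: like split(), but each matched delimiter is kept as a token.
    public static List<String> splitKeepingDelimiters(Pattern pattern, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        int last = 0;
        while (m.find()) {
            tokens.add(input.substring(last, m.start())); // text before the delimiter
            tokens.add(m.group());                         // the delimiter itself
            last = m.end();
        }
        tokens.add(input.substring(last)); // whatever follows the last delimiter
        return tokens;
    }

    public static void main(String[] args) {
        // Simplified version of the regex from the question (an assumption on my part).
        Pattern p = Pattern.compile("\\{%.*?%\\}|\\{\\{.*?\\}\\}?|\\{\\{|\\{%");
        String source = "{% assign foo = values %}.{{ foo[0] }}.";
        for (String token : splitKeepingDelimiters(p, source)) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Note that the first token is an empty string here because the input starts with a delimiter; filter empty tokens out if you don't want them.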
I think the problem is with how you're defining source. On my system, this:
String source = Pattern.quote("{% assign foo = values %}.{{ foo[0] }}.");
is equivalent to this:
String source = "\\Q{% assign foo = values %}.{{ foo[0] }}.\\E";
(that is, it adds a stray \\Q and \\E), but the way the method is defined, your Java implementation could treat it as equivalent to this:
String source = "\\{% assign foo = values %\\}\\.\\{\\{ foo\\[0\\] \\}\\}\\.";
(that is, inserting lots of backslashes).
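If you want to see what Pattern.quote does to the input on your own JVM, a small sketch like the following prints the quoted string and splits both versions. The class name QuoteDemo and the helper splitTemplate are hypothetical, and the delimiter regex is a simplified form of the one in the question. I'm deliberately not claiming what the quoted output looks like, since that depends on the implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteDemo {
    // Simplified delimiter regex (an assumption; the original should behave the same way).
    static final Pattern DELIMITER =
            Pattern.compile("\\{%.*?%\\}|\\{\\{.*?\\}\\}?|\\{\\{|\\{%");

    // Hypothetical helper: split a template string on its tag delimiters.
    public static String[] splitTemplate(String source) {
        return DELIMITER.split(source);
    }

    public static void main(String[] args) {
        String raw = "{% assign foo = values %}.{{ foo[0] }}.";
        // Pattern.quote is meant for text that will become part of a pattern,
        // not for the input string being matched.
        String quoted = Pattern.quote(raw);

        System.out.println("quoted source: " + quoted);
        System.out.println("split(raw):    " + Arrays.toString(splitTemplate(raw)));
        System.out.println("split(quoted): " + Arrays.toString(splitTemplate(quoted)));
    }
}
```

The raw string splits into an empty leading token followed by two "." tokens; any extra characters that show up in the split of the quoted string come from Pattern.quote, not from the template.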
Your regex itself seems fine. This program:
import java.util.regex.Pattern;

public class Main {
    public static void main(final String... args) {
        // Same regex as in the question, escaped for a Java string literal.
        final Pattern p = Pattern.compile("(?-mix:((?-mix:(?-mix:\\{\\%).*?(?-mix:\\%\\})|(?-mix:\\{\\{).*?(?-mix:\\}\\}?))|(?-mix:\\{\\{|\\{\\%)))");
        for (final String s : p.split("a{%b%}c{{d}}e{%f%}g{{h}}i{{j{%k"))
            System.out.println(s);
    }
}
prints
a
c
e
g
i
j
k
that is, it successfully treats {%b%}, {{d}}, {%f%}, {{h}}, {{, and {% as split-points, with all the non-greediness you'd expect. But for the record, it also works if I strip p down to just
Pattern.compile("\\{%.*?%\\}|\\{\\{.*?\\}\\}?|\\{\\{|\\{%");
;-)
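If you'd like some reassurance before swapping the long regex for the short one, a quick sketch like this compares what the two patterns produce on a few inputs. The class name ComparePatterns and the method sameSplit are made up for illustration, and agreeing on a handful of samples is a sanity check, not a proof of equivalence.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class ComparePatterns {
    // The regex from the question, escaped for a Java string literal.
    static final Pattern ORIGINAL = Pattern.compile(
            "(?-mix:((?-mix:(?-mix:\\{\\%).*?(?-mix:\\%\\})|(?-mix:\\{\\{).*?(?-mix:\\}\\}?))|(?-mix:\\{\\{|\\{\\%)))");
    // The stripped-down form.
    static final Pattern SIMPLIFIED =
            Pattern.compile("\\{%.*?%\\}|\\{\\{.*?\\}\\}?|\\{\\{|\\{%");

    // Hypothetical helper: true if both patterns split the input into the same tokens.
    public static boolean sameSplit(String input) {
        return Arrays.equals(ORIGINAL.split(input), SIMPLIFIED.split(input));
    }

    public static void main(String[] args) {
        String[] samples = {
            "a{%b%}c{{d}}e{%f%}g{{h}}i{{j{%k",
            "{% assign foo = values %}.{{ foo[0] }}.",
            "no delimiters here"
        };
        for (String sample : samples) {
            System.out.println(sameSplit(sample) + " <- " + sample);
        }
    }
}
```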
Use \\\\{ instead of \\{, and do the same for the other symbols.