How to account for special non-ASCII characters in regex
I don't know if this is the issue, but I can't seem to get this to match.
// Requires java.util.Map, java.util.regex.Matcher and java.util.regex.Pattern.
String[] seTab3_HighRes = null;

public Map<String, String> tab3HighResRegex(String x, Map<String, String> map) {
    Pattern Tab3_HighRes_pattern = Pattern.compile(
            "High Resolution Parameters:(.*?Intrabolus pressure)", Pattern.DOTALL);
    Matcher matcherTab3_HighRes_pattern = Tab3_HighRes_pattern.matcher(x);
    while (matcherTab3_HighRes_pattern.find()) {
        System.out.println("Anything here? Nope");
        // Split the captured block into lines.
        seTab3_HighRes = matcherTab3_HighRes_pattern.group(1).split("\\n|\\r");
    }
    return map; // the snippet as posted had no return statement
}
The text is:
High Resolution Parameters:
Intrabolus pressure (@LESR)(mmHg):-3.7 <8.4
Some other stff: 123
Intrabolus pressure (avg max)(mmHg):8.3 <17.0
I looked a bit more into the text and noticed there's a ^G character at the end of High Resolution Parameters: when I paste the text into TextPad. What is it, and is that the reason I'm not getting a match (and how do I get rid of it)?
You could simply match the ^G (Control-G) character with \cG in the regex, written as \\cG in a Java string literal.
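To make the \cG escape concrete, here is a minimal sketch. It assumes the ^G shown by TextPad is the BEL control character (code point 7); the class and method names are hypothetical and not part of the original answer. It checks whether a string contains that character and shows one way to remove it before matching.

```java
import java.util.regex.Pattern;

class BellCharDemo {
    // Assumption: ^G is the BEL control character, code point 7 (U+0007).
    static final char BELL = '\u0007';

    // True if the text contains a BEL; the regex \cG is written "\\cG" in Java source.
    static boolean containsBell(String text) {
        return Pattern.compile("\\cG").matcher(text).find();
    }

    // Removes every BEL character and leaves everything else (including newlines) intact.
    static String stripBell(String text) {
        return text.replace(String.valueOf(BELL), "");
    }

    public static void main(String[] args) {
        String header = "High Resolution Parameters:" + BELL + "\n";
        System.out.println(containsBell(header)); // true
        System.out.println(stripBell(header).equals("High Resolution Parameters:\n")); // true
    }
}
```

Stripping the character up front keeps the pattern simpler, while matching it with \cG leaves the input untouched; which one fits better depends on whether the rest of the code needs the original text.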
This regex does the following: it finds High Resolution Parameters:, skips any ^G and whitespace characters that follow, finds the first Intrabolus pressure, and pulls out the substring after Intrabolus pressure ... :
The regex
High\sResolution\sParameters:(?:\cG|[\n\r\s])*(?:Intrabolus\spressure)[^:]*:([^\n]*)
https://regex101.com/r/pE5aI0/1
Capture group 1 holds the Intrabolus pressure value.
Expanded
NODE                     EXPLANATION
----------------------------------------------------------------------
  High                     'High'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  Resolution               'Resolution'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  Parameters:              'Parameters:'
----------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    \cG                      ^G
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    [\n\r\s]                 any character of: '\n' (newline), '\r'
                             (carriage return), whitespace (\n, \r,
                             \t, \f, and " ")
----------------------------------------------------------------------
  )*                       end of grouping
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    Intrabolus               'Intrabolus'
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    pressure                 'pressure'
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
  [^:]*                    any character except: ':' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [^\n]*                   any character except: '\n' (newline) (0
                             or more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
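
As a rough illustration of how the regex above could be used from Java, the sketch below compiles it with each backslash doubled for the string literal and reads capture group 1. The class name, the method name, and the BEL character placed after the header in the sample string are assumptions made for this example; the sample lines themselves come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntrabolusExtractor {
    // Same regex as above, with every backslash doubled for a Java string literal.
    static final Pattern FIRST_VALUE = Pattern.compile(
            "High\\sResolution\\sParameters:(?:\\cG|[\\n\\r\\s])*"
            + "(?:Intrabolus\\spressure)[^:]*:([^\\n]*)");

    // Returns the text after the first "Intrabolus pressure ... :", or null if nothing matches.
    static String firstValue(String text) {
        Matcher m = FIRST_VALUE.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample from the question, with an assumed BEL (U+0007) right after the header.
        String text = "High Resolution Parameters:\u0007\n"
                + "Intrabolus pressure (@LESR)(mmHg):-3.7 <8.4\n"
                + "Some other stff: 123\n"
                + "Intrabolus pressure (avg max)(mmHg):8.3 <17.0\n";
        System.out.println(firstValue(text)); // prints: -3.7 <8.4
    }
}
```

Because group 1 is [^\n]*, a trailing \r would stay in the captured value if the input uses \r\n line endings, so calling trim() on the result may be worthwhile.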