Need Regex Expression Advice

<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>

I know this regular expression is used to retrieve the value of src. Can anyone teach me how I should interpret it? I'm stuck on it.

Explanation:

  • <img matches exactly the string "<img"
  • [^>]+ matches one or more characters other than >, so the match cannot run past the end of the tag
  • src matches exactly the string "src"
  • \\s* matches any number of whitespace characters (including none)
  • = matches exactly the string "="
  • \\s* matches any number of whitespace characters (including none)
  • ['\"] matches either a single or a double quote. The double quote is escaped because otherwise it would terminate the Java string literal that holds the regex
  • ([^'\"]+) matches one or more characters that are not quotes. It is wrapped in parentheses so that it forms a capturing group whose contents can be retrieved later
  • ['\"] matches either a single or a double quote, escaped for the same reason as above
  • [^>]* matches any remaining characters other than ">"
  • > matches exactly the string ">", the closing bracket of the tag.

I would not agree that this expression is crap; it is just a bit complex.
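One way to check the breakdown above is to build the pattern from named pieces and try it on a few tags. This is only a sketch: the class name RegexPieces, the constant names and the helper srcOf are made up for illustration, and the sample tags are invented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexPieces {
    // Each constant is one piece of the pattern from the question (names are hypothetical).
    static final String TAG_OPEN   = "<img";         // literal "<img"
    static final String BEFORE_SRC = "[^>]+";        // one or more characters other than '>'
    static final String SRC_EQUALS = "src\\s*=\\s*"; // "src", "=", optional whitespace around "="
    static final String QUOTE      = "['\"]";        // a single or a double quote
    static final String VALUE      = "([^'\"]+)";    // capturing group 1: the src value
    static final String REST       = "[^>]*>";       // rest of the tag, then the closing '>'

    // Concatenating the pieces gives exactly the pattern from the question.
    static final Pattern IMG_SRC = Pattern.compile(
            TAG_OPEN + BEFORE_SRC + SRC_EQUALS + QUOTE + VALUE + QUOTE + REST);

    // Returns the captured src value, or null when the whole tag does not match.
    static String srcOf(String tag) {
        Matcher m = IMG_SRC.matcher(tag);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(srcOf("<img src = 'a.png'>"));          // a.png
        System.out.println(srcOf("<img alt=\"x\" src=\"b.jpg\"/>")); // b.jpg
        System.out.println(srcOf("<img alt=\"x\">"));              // null (no src attribute)
    }
}
```

Changing one piece at a time (for example replacing \\s* with nothing) and rerunning is a quick way to see what each part contributes.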

EDIT: Here is some example code:

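import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The regex as a Java string literal: backslashes and double quotes are escaped.
String str = "<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>";
String text = "<img alt=\"booo\" src=\"image.jpg\"/>";
Pattern pattern = Pattern.compile (str);
Matcher matcher = pattern.matcher (text);

// matches () succeeds only if the whole text matches the pattern.
if (matcher.matches ())
{
      // Group 0 is the entire match; group 1 is the captured src value.
      int n = matcher.groupCount ();
      for (int i = 0; i <= n; ++i)
          System.out.println (matcher.group (i));
}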

The output is:

<img alt="booo" src="image.jpg"/>
image.jpg

So matcher.group(1) returns what you want. Experiment a bit with this code.
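As one possible experiment: matches() requires the whole input to be a single img tag, so for a longer piece of HTML it may be more convenient to loop with find(). The sketch below assumes that use case; the class name ImgSrcFinder, the method findSources and the sample HTML are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImgSrcFinder {
    // Same pattern as in the question.
    static final Pattern IMG_SRC =
            Pattern.compile("<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");

    // Collects group 1 (the src value) of every img tag found in the input.
    static List<String> findSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {          // find() scans for the next match anywhere in the input
            sources.add(matcher.group(1));
        }
        return sources;
    }

    public static void main(String[] args) {
        String html = "<p>Hi</p><img src='a.png'><div><img alt=\"b\" src = \"b.jpg\"/></div>";
        System.out.println(findSources(html)); // [a.png, b.jpg]
    }
}
```

Because both [^>]+ and [^>]* exclude ">", a match stays inside a single tag even when several tags appear on one line.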

Hi, check one of the tutorials available on the net, e.g. http://www.vogella.com/articles/JavaRegularExpressions/article.html. Sections 3.1 and 3.2 (common matching symbols) briefly explain each symbol and what it stands for, as well as the metacharacters. Break what you have here into smaller chunks to understand it more easily. For example, you have \\s in two places; it is the metacharacter for a whitespace character. Backslash is an escape character in Java, so you have to write \\s instead of \s. After each of them you have a *. Section 3.3 explains the quantifiers; this particular one means the preceding element occurs 0 or more times. Thus \\s* means "match a whitespace character occurring 0 or more times". You do the same with the other chunks.
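The escaping and the quantifier can be seen in isolation with a tiny chunk of the pattern. This is a minimal sketch; the class name WhitespaceQuantifier, the helper matchesSrcEquals and the test inputs are made up for illustration.

```java
import java.util.regex.Pattern;

class WhitespaceQuantifier {
    // In Java source the backslash is doubled; the string itself contains src\s*=
    static final String REGEX = "src\\s*=";

    // True when the whole input is "src", optional whitespace, then "=".
    static boolean matchesSrcEquals(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(REGEX);                        // src\s*=
        System.out.println(matchesSrcEquals("src="));     // true: zero whitespace characters
        System.out.println(matchesSrcEquals("src   =")); // true: several spaces
        System.out.println(matchesSrcEquals("src\t="));  // true: a tab is whitespace too
        System.out.println(matchesSrcEquals("src-="));   // false: '-' is not whitespace
    }
}
```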

Hope it helps.
