I have a simple regular expression (below) that should pull out the value in a string that is surrounded by end**end; see the example. Although it is stupidly simple, I'm struggling to get the result I need. Is there something obvious I'm missing? Many thanks as always.
var str = "endhelloend";
var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
if(match.Success)
{
result = match.Groups[0].Value; // should return 'hello'
}
Your pattern correctly contains the group you want to extract. A regular expression match will contain a collection of groups for you to access. In your example, try the following:
var str = "endhelloend";
var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
if(match.Success)
{
var hello = match.Groups[1].Value; // "hello"
}
match.Groups[0] returns the entire match, "endhelloend", so you want group 1, the first capture group within the match.
match.Groups[0] holds the match for the entire regular expression; look at match.Groups[1] instead.
I think this line should read as follows: result = match.Groups[1].Value;
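To make the difference between the two group indexes concrete, here is a minimal sketch using the pattern and input from the question. The class name GroupDemo and the helper GroupValue are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns the text of the requested group,
    // or null when the pattern does not match at all.
    public static string GroupValue(string input, int groupIndex)
    {
        Match match = Regex.Match(input, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[groupIndex].Value : null;
    }

    public static void Main()
    {
        // Group 0 is always the whole match.
        Console.WriteLine(GroupValue("endhelloend", 0)); // endhelloend
        // Group 1 is the first pair of parentheses in the pattern.
        Console.WriteLine(GroupValue("endhelloend", 1)); // hello
    }
}
```

Groups are numbered by the position of their opening parenthesis, starting at 1, which is why the only capture group in this pattern is group 1.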
I see you're struggling with this so I will offer a little insight.
The regex end([a-z]+)end$ will match the string "endhelloend". The inner text will be in capture group 1.
It will not match the same text when it's a substring, as in "endhelloend of the world". The reason is that you have an end-of-string metacharacter (assertion), $, as part of the regex just after 'end'.
So you could just take $ out of the regex and it should work fine.
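The effect of the $ assertion can be checked with a short sketch. It runs the question's pattern with and without $ against both strings mentioned in this answer; AnchorDemo and Extract are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical helper: returns capture group 1, or null when there is no match.
    public static string Extract(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        const string anchored = @"end([a-z]+)end$";
        const string unanchored = @"end([a-z]+)end";

        // The whole string is end...end, so the anchored pattern matches.
        Console.WriteLine(Extract("endhelloend", anchored) ?? "(no match)"); // hello
        // The second 'end' is not at the end of the string, so $ fails.
        Console.WriteLine(Extract("endhelloend of the world", anchored) ?? "(no match)"); // (no match)
        // Without $, the same substring is found.
        Console.WriteLine(Extract("endhelloend of the world", unanchored) ?? "(no match)"); // hello
    }
}
```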
There are other things to take into account, though. I'll comment on them in your regex.
end // find a literal 'end'
( // Capture group 1 open
[a-z]+ // Find as many characters a-z as possible (including 'e' 'n' 'd' in sequence)
) // Capture group 1 close
end // find a literal 'end'
$ // End of string assertion (the last 'end' must be the last word in the string)
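The note about [a-z]+ also consuming 'e' 'n' 'd' matters once the input contains more than two occurrences of 'end'. The sketch below is an illustration under assumptions: the input "endfooendbarend" is made up, the $ is dropped so the difference is visible, and the lazy variant [a-z]+? is shown only for comparison. GreedyDemo and Capture are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical helper: returns capture group 1, or null when there is no match.
    public static string Capture(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        const string input = "endfooendbarend"; // made-up input with three 'end's

        // Greedy: [a-z]+ grabs as much as it can, then backs off to the last 'end'.
        Console.WriteLine(Capture(input, @"end([a-z]+)end"));  // fooendbar
        // Lazy: [a-z]+? stops at the first 'end' that lets the match succeed.
        Console.WriteLine(Capture(input, @"end([a-z]+?)end")); // foo
    }
}
```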
Use solution 1 to extract the text content from the .html file, then use solution 2 to filter the text you want out of it.
To strip the HTML elements from an .htm file, try this:
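string CleanXml(string DirtyXml)
{
    int startloc = -1; // index of the most recent unmatched '<', or -1 if none
    for (int x = 0; x < DirtyXml.Length; x++)
    {
        if (DirtyXml[x] == '<')
        {
            startloc = x;
        }
        else if (DirtyXml[x] == '>' && startloc >= 0)
        {
            // Remove the tag from '<' through '>' and resume at the character that followed it.
            DirtyXml = DirtyXml.Remove(startloc, (x - startloc) + 1);
            x = startloc - 1; // the loop increment moves x back to startloc
            startloc = -1;
        }
    }
    return DirtyXml;
}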
Regex to filter the text "endhelloend" to obtain "hello":
string result = "";
var str = "endhelloend";
var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
if (match.Success)
{
    result = match.Groups[1].Value; // Returns 'hello'
}
Console.WriteLine(result);
Console.ReadLine();
Try this. It will give you any alphabetic characters between the words end, but will not capture the actual word end:
(?<=end)[a-z]+?(?=end)
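As a rough sketch of how this lookaround pattern behaves in .NET: because the lookbehind and lookahead do not consume the surrounding 'end', the wanted text is the match itself (match.Value), with no capture group needed. The class LookaroundDemo, the helper Between, the second input string, and the use of RegexOptions.IgnoreCase (carried over from the question) are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: collects every run of letters that sits between two 'end's.
    public static List<string> Between(string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?<=end)[a-z]+?(?=end)", RegexOptions.IgnoreCase))
        {
            values.Add(m.Value); // the whole match, since the delimiters are not consumed
        }
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Between("endhelloend")));     // hello
        // A middle 'end' can close one value and open the next.
        Console.WriteLine(string.Join(",", Between("endfooendbarend"))); // foo,bar
    }
}
```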