Regex for HTML parsing: how do I grab a specific string?
I'm trying to get the string after charactername= and before " >. How can I use a regex to capture only the player name?
This is what I have so far, and it's not working: it doesn't actually print anything. client.DownloadString returns a string like this:
<a href="https://my.examplegame.com/charactername=Atro+Roter" >
So I know it actually gets the string; I'm just stuck on the regex.
using (var client = new WebClient())
{
//Example of what the string looks like on Console when I Console.WriteLine(html)
//<a href="https://my.examplegame.com/charactername=Atro+Roter" >
// I want the "Atro+Roter"
string html = client.DownloadString(worldDest + world + inOrderName);
string playerName = "https://my.examplegame.com/charactername=(.+?)\" >";
MatchCollection m1 = Regex.Matches(html, playerName);
foreach (Match m in m1)
{
Console.WriteLine(m.Groups[1].Value);
}
}
I'm trying to specifically get the string after charactername= and before " >.
So, you just need a lookbehind and a lookahead, then use LINQ to collect all the match values into a list:
var input = "your input string";
var rx = new Regex(@"(?<=charactername=)[^""]+(?="")");
var res = rx.Matches(input).Cast<Match>().Select(p => p.Value).ToList();
The res variable should now hold all your character names.
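For reference, the lookaround approach above can be sketched as a complete console program. This is only an illustration: the class name CharacterNames, the method Extract, and the second link ("Bela+Kiss") are made up for the example and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CharacterNames
{
    // The lookbehind anchors the match right after "charactername=";
    // [^"]+ consumes everything up to (but not including) the closing quote,
    // and the lookahead checks that a quote follows without consuming it.
    private static readonly Regex NamePattern =
        new Regex(@"(?<=charactername=)[^""]+(?="")");

    // Hypothetical helper: returns every character name found in the HTML.
    public static List<string> Extract(string html)
    {
        return NamePattern.Matches(html)
                          .Cast<Match>()
                          .Select(m => m.Value)
                          .ToList();
    }

    public static void Main()
    {
        // Assumed sample input; the second link is invented for illustration.
        string html =
            "<a href=\"https://my.examplegame.com/charactername=Atro+Roter\" >\n" +
            "<a href=\"https://my.examplegame.com/charactername=Bela+Kiss\" >";

        foreach (string name in Extract(html))
        {
            Console.WriteLine(name);
        }
    }
}
```

Because the lookarounds do not consume any characters, m.Value is already the bare name and no capture group is needed.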
I assume your issue is trying to parse the URL. Don't. Use what .NET gives you instead:
var playerName = "https://my.examplegame.com/?charactername=NAME_HERE";
var uri = new Uri(playerName);
var queryString = HttpUtility.ParseQueryString(uri.Query);
Console.WriteLine("Name is: " + queryString["charactername"]);
This is much easier to read and no doubt more performant.
Working sample here: https://dotnetfiddle.net/iJlBKW
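One caveat: the link in the question has no "?" before charactername=, so uri.Query would be empty for that exact URL and ParseQueryString would find nothing. A minimal sketch of the same "let .NET do it" idea for that shape of URL might look like the following. The class CharacterLink and the method NameFromPath are hypothetical names, and decoding "+" as a space assumes the site uses form-style encoding for names.

```csharp
using System;
using System.Net;

public static class CharacterLink
{
    // Hypothetical helper: returns the decoded character name, or null when
    // the URL path does not contain "charactername=".
    public static string NameFromPath(string url)
    {
        const string key = "charactername=";
        var uri = new Uri(url);

        // For the URL in the question this is "/charactername=Atro+Roter".
        string path = uri.AbsolutePath;

        int start = path.IndexOf(key, StringComparison.Ordinal);
        if (start < 0)
        {
            return null;
        }

        string raw = path.Substring(start + key.Length);

        // Assumption: "+" stands for a space, as in form-encoded values.
        return WebUtility.UrlDecode(raw);
    }

    public static void Main()
    {
        Console.WriteLine(
            NameFromPath("https://my.examplegame.com/charactername=Atro+Roter"));
    }
}
```

If the raw "Atro+Roter" form is what you need, skip the UrlDecode call and return raw directly.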
Escape the forward slashes with backslashes, like this: \/ (.NET does not strictly require it, since / is not a regex metacharacter there, but it is harmless):
string input = @"<a href=""https://my.examplegame.com/charactername=Atro+Roter"" >";
string playerName = @"https:\/\/my.examplegame.com\/charactername=(.+?)""";
Match match = Regex.Match(input, playerName);
string result = match.Groups[1].Value;
Result = Atro+Roter