I have the following:
<option value="Abercrombie">Abercrombie</option>
My file has about 2000 rows, and each row has a different location. I'm trying to learn regex, but it isn't sinking in, and I'm not sure whether this is even possible.
What I want to do is run a regex that strips the HTML above, leaving the following:
Abercrombie
I then want to prefix a particular number to the result, for example:
2,Abercrombie
Is this possible?
Don't use a regular expression, since HTML is not a regular language. You can use LINQ to XML instead (this works as long as the markup is well-formed XML). If you want to process the entire file, you can replace the elements in place:
using System.IO;
using System.Linq;
using System.Xml.Linq;

int myNumber = 2;
var html = @"<html><body><option value=""Abercrombie"">Abercrombie</option><div><option value=""Forever21"">Forever21</option></div></body></html>";
var doc = XDocument.Load(new StringReader(html));
// Materialize the list first, since the loop below modifies the tree.
var options = doc.Descendants("option").ToList();
foreach (var element in options)
{
    // Swap each <option> element for a text node such as "2,Abercrombie".
    element.ReplaceWith(string.Format("{0},{1}", myNumber, element.Value));
}
var result = doc.ToString();
This gives:
<html>
  <body>2,Abercrombie<div>2,Forever21</div></body>
</html>
If you just want the value from a single tag, you can read it directly (this example reads the value attribute, which here matches the element's text):
int myNumber = 2;
var html = @"<option value=""Abercrombie"">Abercrombie</option>";
var doc = XDocument.Load(new StringReader(html));
var element = doc.Descendants().FirstOrDefault(o => o.Name == "option");
var attribute = element.Attribute("value").Value;
var result = string.Format("{0},{1}", myNumber, attribute);
//result == "2,Abercrombie"
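For the file described in the question, where each row holds one option tag, the same idea can be applied row by row. The following is only a sketch under some assumptions: every non-blank row is a single well-formed <option> element, and the class name OptionRows and the method PrefixOptions are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class OptionRows
{
    // Hypothetical helper: turns each <option> row into "number,text".
    // Blank rows are skipped; XElement.Parse throws on malformed markup.
    public static IEnumerable<string> PrefixOptions(IEnumerable<string> lines, int number)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => XElement.Parse(line.Trim()))
            .Select(option => string.Format("{0},{1}", number, option.Value));
    }

    public static void Main()
    {
        // Sample rows standing in for the real file contents.
        var lines = new[]
        {
            "<option value=\"Abercrombie\">Abercrombie</option>",
            "<option value=\"Forever21\">Forever21</option>"
        };

        foreach (var row in PrefixOptions(lines, 2))
        {
            Console.WriteLine(row); // 2,Abercrombie then 2,Forever21
        }
    }
}
```

To run this against an actual file, the sample array could be swapped for File.ReadLines with the path to your file, and the rows written back out with File.WriteAllLines. Rows containing HTML-only entities such as &nbsp; are not valid XML and would need separate handling.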