
VB.net basic RegEx problems

Hello, I am trying to extract a value from an input tag in some HTML source code. The tag looks like this:

<input name="user_status" value="3" />

I have the page source in a variable (pageSourceCode), and need to work out some regex to get the value (3 in this example). I have this so far:

Dim sCapture As String = System.Text.RegularExpressions.Regex.Match(pageSourceCode, "\<input\sname\=\""user_status\""\svalue\=\""(.*)?\""\>").Groups(1).Value

This works fine most of the time. However, the code is used to process source from multiple sites (all on the same platform), and sometimes the input tag includes other attributes, or the attributes appear in a different order, e.g.:

<input class="someclass" type="hidden" value="3" name="user_status" />

I just don't understand regex well enough to cope with these situations.

Any help very much appreciated.

PS: Although I am looking for a specific answer to this question if at all possible, a pointer to a good regex tutorial would be great as well.

Thanks

You can search for <input[^>]*\bvalue="([^"]+)" if your input tags never contain angle brackets.

[^>]* matches any number of characters except >, which keeps the regex from accidentally matching across tags.

\b ensures that we only match value and not something like x_value.
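
To make the effect of the word boundary concrete, here is a minimal sketch in VB.NET. The helper name FirstInputValue and the sample tag (including its x_value attribute) are made up for illustration and are not from the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ValueAttributeDemo
    ' Hypothetical helper: returns capture group 1 of the first match,
    ' or an empty string when the pattern does not match at all.
    Function FirstInputValue(html As String, pattern As String) As String
        Return Regex.Match(html, pattern).Groups(1).Value
    End Function

    Sub Main()
        ' Same pattern with and without the \b word boundary.
        Const WithBoundary As String = "<input[^>]*\bvalue=""([^""]+)"""
        Const WithoutBoundary As String = "<input[^>]*value=""([^""]+)"""

        ' Illustrative tag: a real value attribute followed by x_value.
        Dim html As String = "<input value=""3"" x_value=""9"" />"

        Console.WriteLine(FirstInputValue(html, WithBoundary))     ' prints 3
        Console.WriteLine(FirstInputValue(html, WithoutBoundary))  ' prints 9
    End Sub
End Module
```

Because [^>]* is greedy, the version without \b backtracks onto the last value=" in the tag, which here is the tail of x_value. With \b, that position is rejected (the underscore counts as a word character), so the match falls back to the real value attribute.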

EDIT:

If you only want to look at input tags where name="user_status" , then you can do this with an additional lookahead assertion :

<input(?=[^>]*name="user_status")[^>]*\bvalue="([^"]+)"

In VB.NET:

ResultString = Regex.Match(SubjectString, "<input(?=[^>]*name=""user_status"")[^>]*\bvalue=""([^""]+)""").Groups(1).Value
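
As a rough check that the lookahead version copes with both attribute orders from the question, something like the following could be used. The module name, the GetUserStatus helper, and the extra "other" tag are assumptions made for the sketch.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UserStatusDemo
    ' The lookahead requires name="user_status" somewhere inside the same tag;
    ' the value attribute is then captured regardless of attribute order.
    Private Const Pattern As String = "<input(?=[^>]*name=""user_status"")[^>]*\bvalue=""([^""]+)"""

    ' Hypothetical helper: returns the captured value, or "" if no tag matches.
    Function GetUserStatus(html As String) As String
        Return Regex.Match(html, Pattern).Groups(1).Value
    End Function

    Sub Main()
        ' Attribute order from the first example in the question.
        Console.WriteLine(GetUserStatus("<input name=""user_status"" value=""3"" />"))

        ' Extra attributes, with value before name.
        Console.WriteLine(GetUserStatus("<input class=""someclass"" type=""hidden"" value=""3"" name=""user_status"" />"))

        ' An unrelated input tag earlier in the page is skipped, because
        ' [^>]* in the lookahead cannot cross that tag's closing bracket.
        Console.WriteLine(GetUserStatus("<input name=""other"" value=""7"" /><input value=""3"" name=""user_status"" />"))
    End Sub
End Module
```

All three calls should print 3. Note that [^"]+ needs at least one character, so a tag with value="" yields an empty result rather than a match.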

A good tutorial can be found at http://www.regular-expressions.info

Assuming this is an ASP.NET page and not some external HTML you can't control, the better solution would be to simply access the control.

Add an ID attribute and runat="server" to your input control, like this:

<input id="user_status" runat="server" class="someclass" type="hidden" value="3" name="user_status" />

You can probably get rid of the name attribute. It's typically the same as the ID, and ID is the better choice. You can keep both an ID and a name if you want, and they can have the same value.

In your code behind you can then access the value by the ID with no need for a regex.

Me.user_status.Value


 