I have a program that extracts the URLs from a web page (WebSource) whose address contains a specific string (/articles/):
Dim links As New List(Of String)()
Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
htmlDoc.LoadHtml(WebSource)
For Each link As HtmlNode In htmlDoc.DocumentNode.SelectNodes("//a[@href]")
    Dim att As HtmlAttribute = link.Attributes("href")
    If att.Value.Contains("/articles/") Then
        links.Add(att.Value)
    End If
Next
Is it possible to search the URLs and filter them by two values? For example, on a tech site I want to find all URLs that contain both /articles/ and LG.
Also, the extracted URLs are not complete HTTP addresses. For example, one of my results is
/articles/car
instead of the complete address, which would be
http://website.com/articles/car
How can I fix this?
You are checking only ONE value now. To check several values in your HtmlAgilityPack loop, you can nest multiple If statements as follows:
If att.Value.Contains("content1") Then
    If att.Value.Contains("content2") Then
        If att.Value.Contains("content3") Then
            links.Add(att.Value)
        End If
    End If
End If
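The nested checks above can also be folded into a single condition, and the relative links from the question can be turned into complete addresses with System.Uri. The following is only a minimal sketch under some assumptions: the ExtractLinks function, its baseUrl parameter, and the sample HTML in Main are hypothetical names introduced for illustration, and the base address (http://website.com/ here) has to be supplied by you, because the HTML source alone does not say where the page came from.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports HtmlAgilityPack

Module LinkFilter
    ' Hypothetical helper: returns the absolute URL of every link whose
    ' href contains ALL of the given terms (case-sensitive, like Contains).
    Function ExtractLinks(webSource As String, baseUrl As String,
                          ParamArray terms As String()) As List(Of String)
        Dim links As New List(Of String)()
        Dim htmlDoc As New HtmlDocument()
        htmlDoc.LoadHtml(webSource)

        ' SelectNodes returns Nothing when no node matches the XPath.
        Dim nodes As HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//a[@href]")
        If nodes Is Nothing Then Return links

        Dim baseUri As New Uri(baseUrl)
        For Each link As HtmlNode In nodes
            Dim href As String = link.GetAttributeValue("href", "")

            ' Same effect as the nested If statements: every term must be present.
            Dim matchesAll As Boolean = terms.All(Function(t) href.Contains(t))

            ' Resolve relative links such as /articles/car against the base address.
            Dim absolute As Uri = Nothing
            If matchesAll AndAlso Uri.TryCreate(baseUri, href, absolute) Then
                links.Add(absolute.AbsoluteUri)
            End If
        Next
        Return links
    End Function

    Sub Main()
        ' Sample input, assumed for illustration only.
        Dim html As String = "<a href=""/articles/LG-tv"">a</a><a href=""/articles/car"">b</a>"
        For Each url As String In ExtractLinks(html, "http://website.com/", "/articles/", "LG")
            Console.WriteLine(url)
        Next
    End Sub
End Module
```

With the sample input, only the first link contains both /articles/ and LG, so only that one is kept and expanded to a full address. Links that are already absolute should pass through Uri.TryCreate unchanged, since an absolute second argument takes precedence over the base.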