I am trying to write a PowerShell script to track the latest Adobe Reader releases. The URL below lists the latest releases, but I did not get very far scraping the page with PowerShell.
$t = "https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC/"
$r = Invoke-WebRequest -uri $t
$r.ParsedHtml.body.getElementsByTagName('Div')
You could target the raw HTML with regular expressions to extract the update information. One way would be to split the content on <li> and iterate over the pieces using a switch -Regex statement.
$baseuri = 'https://www.adobe.com/devnet-docs/acrobatetk/tools/ReleaseNotesDC'
$response = Invoke-WebRequest -Uri $baseuri -UseBasicParsing
# Split the raw HTML into one chunk per list item and test each chunk against both patterns.
$updatelist = switch -Regex ($response.Content -split '<li>') {
    # Entry with a single version number
    'href="(?<URL>.+?)".+?(?<Version>\d{2}\.\d+?\.[\d\w]+?) (?<Type>[\w\s]+?), (?<Date>\w+? \d+, \d+)' {
        [PSCustomObject]@{
            Version = $matches.Version
            Type    = $matches.Type
            Date    = $matches.Date
            URL     = "$baseuri/{0}" -f $matches.URL
        }
    }
    # Entry with separate Windows and Mac versions: emit one object per platform
    'href="(?<URL>.+?)".+?(?<Win>\d{2}\.\d+?\.[\d\w]+? \(Win\)), (?<Mac>\d{2}\.\d+?\.[\d\w]+? \(Mac\)) (?<Type>[\w\s]+?), (?<Date>\w+? \d+, \d+)' {
        $ht = [ordered]@{
            Version = $matches.Win
            Type    = $matches.Type
            Date    = $matches.Date
            URL     = "$baseuri/{0}" -f $matches.URL
        }
        [PSCustomObject]$ht
        $ht.Version = $matches.Mac
        [PSCustomObject]$ht
    }
}
The list will be stored in the variable $updatelist, one object per release with Version, Type, Date, and URL properties.
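Since the goal is to track the latest release, one possible follow-up is to sort the list by date and take the first entry. The sketch below is only an illustration: the rows are hypothetical sample data standing in for the real $updatelist (the versions, types, and URLs are made up), and it assumes the Date strings look like "Feb 9, 2021" so that [datetime]::Parse can read them with the invariant culture.

```powershell
# Hypothetical sample rows with the same shape as $updatelist (not real release data).
$updatelist = @(
    [PSCustomObject]@{ Version = '21.001.20135'; Type = 'Planned update';  Date = 'Jan 12, 2021'; URL = 'https://example.com/a.html' }
    [PSCustomObject]@{ Version = '21.001.20138'; Type = 'Optional update'; Date = 'Feb 9, 2021';  URL = 'https://example.com/b.html' }
    [PSCustomObject]@{ Version = '20.013.20074'; Type = 'Planned update';  Date = 'Dec 9, 2020';  URL = 'https://example.com/c.html' }
)

# Parse with the invariant culture so English month abbreviations work on any locale.
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Sort newest first by the parsed date and keep only the top entry.
$latest = $updatelist |
    Sort-Object -Property { [datetime]::Parse($_.Date, $culture) } -Descending |
    Select-Object -First 1

$latest | Format-List
```

Sorting on the parsed date rather than the raw string avoids ordering "Dec" before "Feb" alphabetically. If the page ever uses a different date format, the Parse call would need to be adjusted.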