Java - JSOUP: selecting a specific part of a website
I am trying to read the Office 365 website so that I can compare it against a proxy configuration. However, I can't get the select right: I want it to return only one specific section of the URLs and IP addresses listed on that page.
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Office365WebsiteParser {
    Document doc = null;
    String WebseitenInhalt;

    public void Parser() {
        // Route HTTP and HTTPS traffic through the proxy
        System.setProperty("http.proxyHost", "xxx");
        System.setProperty("http.proxyPort", "8081");
        System.setProperty("https.proxyHost", "xxx");
        System.setProperty("https.proxyPort", "8081");

        // Try to fetch the page up to five times
        for (int i = 1; i <= 5; i++) {
            try {
                doc = Jsoup.connect("https://technet.microsoft.com/de-de/library/hh373144.aspx").userAgent("Mozilla").get();
                break; // Break immediately if successful
            } catch (IOException e) {
                // Swallow exception and try again
                System.out.println("jsoup Timeout occurred " + i + " time(s)");
            }
        }

        if (doc == null) {
            System.out.println("Connection timeout after 5 tries");
        } else { // If everything worked, evaluate the page
            // Select by div class; this matches every code snippet container on the page
            Elements urls_Office365_URLs = doc.select("div.codeSnippetContainerCode");
            // urls_Office365_URLs_global = urls_Office365_URLs;
            WebseitenInhalt = urls_Office365_URLs.text();
        }
    }

    public void Print() {
        System.out.println(WebseitenInhalt);
    }

    public String get() {
        return WebseitenInhalt;
    }
}
I just want to select containers like this one:
<div class="codeSnippetContainerCodeContainer">
  <div class="codeSnippetToolBar">
    <div class="codeSnippetToolBarText">
      <a name="CodeSnippetCopyLink" style="display: none;" title="In Zwischenablage kopieren" href="javascript:if (window.epx.codeSnippet)window.epx.codeSnippet.copyCode('CodeSnippetContainerCode_0f6f9acf-6aa4-471f-8600-f8d059f95493');">Kopieren</a>
    </div>
  </div>
  <div id="CodeSnippetContainerCode_0f6f9acf-6aa4-471f-8600-f8d059f95493" class="codeSnippetContainerCode" dir="ltr">
    <div style="color:Black;"><pre>
*.live.com
*.officeapps.live.com
*.microsoft.com
*.glbdns.microsoft.com
*.microsoftonline.com
*.office365.com
*.office.com
Portal.Office.com
*.onmicrosoft.com
*.microsoftonline-p.com^
*.microsoftonline-p.net^
*.microsoftonlineimages.com^
*.microsoftonlinesupport.net^
*.msecnd.net^
*.msocdn.com^
*.msn.com^
*.msn.co.jp^
*.msn.co.uk^
*.office.net^
*.aadrm.com^^
*.cloudapp.net^^
*.activedirectory.windowsazure.com^^^
*.phonefactor.net^^^
</pre></div>
  </div>
</div>
</div>
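One possible way to narrow the selection, sketched under a few assumptions: `doc.select("div.codeSnippetContainerCode")` returns every matching container, and calling `text()` on the whole `Elements` result concatenates them all. Picking a single element from the result, either by position or by a string it is known to contain, yields just one section. The class and method names below (`SnippetSelector`, `snippetText`, `snippetContaining`, `hosts`) are hypothetical, the inline HTML is a shortened stand-in for the real page, and stripping the trailing `^` characters assumes they are footnote markers rather than part of the host names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class SnippetSelector {

    /** Text of the snippet at the given 0-based position, or null if there is no such snippet. */
    public static String snippetText(Document doc, int index) {
        Elements blocks = doc.select("div.codeSnippetContainerCode pre");
        return (index >= 0 && index < blocks.size()) ? blocks.get(index).text() : null;
    }

    /** Text of the first snippet that contains the given marker string, or null if none does. */
    public static String snippetContaining(Document doc, String marker) {
        for (Element pre : doc.select("div.codeSnippetContainerCode pre")) {
            if (pre.text().contains(marker)) {
                return pre.text();
            }
        }
        return null;
    }

    /** Splits snippet text on whitespace and drops '^' characters (assumed to be footnote markers). */
    public static List<String> hosts(String snippetText) {
        List<String> result = new ArrayList<>();
        for (String token : snippetText.trim().split("\\s+")) {
            String host = token.replace("^", "");
            if (!host.isEmpty()) {
                result.add(host);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample with two snippet containers; the real page has more.
        String html =
              "<div class=\"codeSnippetContainerCode\"><pre>*.live.com\n*.office365.com\n*.msn.com^</pre></div>"
            + "<div class=\"codeSnippetContainerCode\"><pre>*.lync.com\n*.outlook.com</pre></div>";
        Document doc = Jsoup.parse(html);

        System.out.println(hosts(snippetText(doc, 0)));                  // first container only
        System.out.println(hosts(snippetContaining(doc, "*.lync.com"))); // container found by content
    }
}
```

Selecting by position is the simplest option but breaks if the page gains or loses a snippet. Selecting by a known entry is somewhat more robust. The `id` attributes on the real page look auto-generated, so relying on a fixed `id` value is probably fragile.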