
Java - JSOUP: selecting a specific part of a website

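I am trying to read the Office 365 website so that I can compare it with a proxy configuration. However, I can't get the selection right, so I am unable to extract only the specific parts that contain these URLs and IP addresses.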

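import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class Office365WebsiteParser {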

    Document doc = null;


    String WebseitenInhalt;

    public void Parser() {
        System.setProperty("http.proxyHost", "xxx");
        System.setProperty("http.proxyPort", "8081");
        System.setProperty("https.proxyHost", "xxx");
        System.setProperty("https.proxyPort", "8081");

        for (int i = 1; i <= 5; i++) {
            try {
                doc = Jsoup.connect("https://technet.microsoft.com/de-de/library/hh373144.aspx").userAgent("Mozilla").get();
                break; // Break immediately if successful
            } catch (IOException e) {
                // Swallow exception and try again
                System.out.println("jsoup Timeout occurred " + i + " time(s)");
            }
        }

        if (doc == null) {
            System.out.println("Connection timeout after 5 tries");
        } else { // If everything worked, evaluate the page

            Elements urls_Office365_URLs = doc.select("div.codeSnippetContainerCode");


            // Select the HTML of the page by div class and div id
            // urls_Office365_URLs_global = urls_Office365_URLs;

            WebseitenInhalt=urls_Office365_URLs.text();
        }

    }

    public void Print() {
        System.out.println(WebseitenInhalt);
    }

    public String get() {
        return WebseitenInhalt;
    }
}

I only want to select containers like this one:

<div class="codeSnippetContainerCodeContainer">
  <div class="codeSnippetToolBar">
    <div class="codeSnippetToolBarText">
      <a name="CodeSnippetCopyLink" style="display: none;" title="In Zwischenablage kopieren" href="javascript:if (window.epx.codeSnippet)window.epx.codeSnippet.copyCode('CodeSnippetContainerCode_0f6f9acf-6aa4-471f-8600-f8d059f95493');">Kopieren</a>
    </div>
  </div>
  <div id="CodeSnippetContainerCode_0f6f9acf-6aa4-471f-8600-f8d059f95493" class="codeSnippetContainerCode" dir="ltr">
    <div style="color:Black;"><pre>
*.live.com
*.officeapps.live.com
*.microsoft.com
*.glbdns.microsoft.com
*.microsoftonline.com
*.office365.com
*.office.com
Portal.Office.com
*.onmicrosoft.com
*.microsoftonline-p.com^
*.microsoftonline-p.net^
*.microsoftonlineimages.com^
*.microsoftonlinesupport.net^
*.msecnd.net^
*.msocdn.com^
*.msn.com^
*.msn.co.jp^
*.msn.co.uk^
*.office.net^
*.aadrm.com^^
*.cloudapp.net^^
*.activedirectory.windowsazure.com^^^
*.phonefactor.net^^^
</pre></div>
  </div>
</div>
</div>
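Since the goal is to compare these entries with a proxy configuration, the text of such a pre block probably needs to be split into individual entries once it has been selected. The following is only a sketch using the standard library: the class name HostListSketch and the method parseHosts are hypothetical, and it assumes that the trailing ^ characters are footnote markers from the page rather than part of the host names.

```java
import java.util.ArrayList;
import java.util.List;

class HostListSketch {

    // Hypothetical helper: turns the whitespace-separated text of one
    // <pre> block into a list of host patterns.
    static List<String> parseHosts(String preText) {
        List<String> hosts = new ArrayList<>();
        for (String token : preText.trim().split("\\s+")) {
            // Assumption: trailing '^' characters are footnote markers
            // on the page and do not belong to the host name.
            String host = token.replaceAll("\\^+$", "");
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        String sample = " *.live.com *.msecnd.net^ *.aadrm.com^^ Portal.Office.com ";
        for (String host : parseHosts(sample)) {
            System.out.println(host);
        }
    }
}
```

Each resulting entry can then be compared against the proxy configuration one at a time instead of matching against one long string.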

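Try this CSS selector. It picks the pre element in the first cell of each row, but only in tables that have a header cell whose text matches the regular expression .+-URLs?: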

table:has(th:matches(.+-URLs?)) td:first-of-type pre
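To illustrate how the selector could be used from Java, here is a minimal sketch that runs it against a small inline HTML string instead of the live page. The class name SelectorSketch, the method selectUrlBlocks and the sample table contents are made up for illustration; the real page layout may differ, so treat the sample markup as an assumption.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class SelectorSketch {

    static final String SELECTOR =
            "table:has(th:matches(.+-URLs?)) td:first-of-type pre";

    // Hypothetical helper: returns the text of every <pre> matched by the selector.
    static List<String> selectUrlBlocks(String html) {
        Document doc = Jsoup.parse(html);
        Elements blocks = doc.select(SELECTOR);
        return blocks.eachText();
    }

    public static void main(String[] args) {
        // Made-up sample markup: one table with a "...-URLs" header and one without.
        String html =
                "<table>"
              + "<tr><th>Office 365-URLs</th><th>IP-Adressbereiche</th></tr>"
              + "<tr><td><pre>*.live.com\n*.office.com</pre></td>"
              + "<td><pre>192.0.2.0/24</pre></td></tr>"
              + "</table>"
              + "<table>"
              + "<tr><th>Sonstiges</th></tr>"
              + "<tr><td><pre>ignored.example</pre></td></tr>"
              + "</table>";

        for (String block : selectUrlBlocks(html)) {
            System.out.println(block);
        }
    }
}
```

In the question's class, the same selector string could be passed to doc.select(...) in place of "div.codeSnippetContainerCode".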

