
How to get text from specific href with jsoup?

I fetch text from http://m.wol.jw.org/en/wol/dt/r1/lp-e/2014/6/26 with jsoup in my Android app. The code looks like this:

public static void refreshFromNetwork(Context context) {
    Document document;
    Elements dateElement;
    Elements textElement;
    Elements commentElement;
    try {
        Calendar calendar = Calendar.getInstance();
        int year = calendar.get(Calendar.YEAR);
        int month = calendar.get(Calendar.MONTH) + 1;
        int day = calendar.get(Calendar.DAY_OF_MONTH);
        sDayURL = sURL + "/" + year + "/" + month + "/" + day;

        document = Jsoup.connect(sDayURL).get();
        if (document.hasText()) {
            dateElement = document.select(".ss");
            textElement = document.select(".sa");
            commentElement = document.select(".sb");

            sDate = dateElement.text();
            sText = textElement.text();
            sComment = commentElement.html();
            sSavedForCheckingDate = sLocalDate;
            savePrefs(context);
            sDayURL = null;
        } else {
            // Use the Context passed to this static method.
            Toast.makeText(context,
                    context.getString(R.string.warning_unstable_connection),
                    Toast.LENGTH_SHORT).show();
        }
    } catch (IOException e) {
        System.out.println("error");
        e.printStackTrace();
    }
}

But the text contains some links (hrefs). When the cursor hovers over one of them, a popup with text appears. I can't post images, so see it here: http://habrastorage.org/files/45e/b09/17f/45eb0917f3644bbd9e5ea2b79d98363d.png

But when I try to get the text behind that href (I take the link from sComment, which holds the HTML), I get the entire text of the linked page (what is displayed when I click the href), not just the part shown in the popup. I'm not a web developer, so I don't understand how to get only the desired text. How can I do it?

Use sComment = commentElement.text(); instead of sComment = commentElement.html();.
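To make the difference between the two calls concrete, here is a minimal sketch that runs offline. It assumes jsoup is on the classpath; the class name TextVsHtml and the sample markup are made up for illustration and only stand in for the real ".sb" block.

```java
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

public class TextVsHtml {
    // Hypothetical markup standing in for the ".sb" comment block of the page.
    static final String SAMPLE =
        "<div class=\"sb\"><p>Read <a href=\"/en/wol/bc/1\">Ps. 1:1</a> today.</p></div>";

    // Parse the sample and select the comment block, as the question does with ".sb".
    static Elements comment() {
        return Jsoup.parse(SAMPLE).select(".sb");
    }

    // text() drops all tags and returns only the visible text.
    static String commentText() {
        return comment().text();
    }

    // html() keeps the inner markup, including the <a href="..."> links.
    static String commentHtml() {
        return comment().html();
    }

    public static void main(String[] args) {
        System.out.println(commentText()); // Read Ps. 1:1 today.
        System.out.println(commentHtml());
    }
}
```

Note that text() also discards the href attributes, so if the links are still needed later, keep the Elements object (or the HTML) around and call text() only for display.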

To get only the text shown in the popup, follow these steps.

Click the href that triggers the popup.

The popup text also appears on the page that opens. To extract only the text shown in the popup, select that element and display its contents.

When you click the href, a new page opens containing the same text, highlighted in a red font. That is the text you need, since it is the popup text. Now you just have to use:

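// sComment is a String, so read the link from the selected elements instead.
Element link = commentElement.select("a[href]").first();
// absUrl resolves a relative href against the page URL.
String href = link.absUrl("href");
Document doc = Jsoup.connect(href).get();
// "p101" is the id of the highlighted paragraph on this example page.
Element element = doc.getElementById("p101");
String dialogText = element.text();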
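The same two steps (resolve the link, then pick one paragraph by id) can be sketched with null checks so that a missing link or id does not crash the app. This is only an illustration under stated assumptions: the class PopupText, its helper methods, the sample markup, and the link path are hypothetical, and the HTML is parsed from strings rather than fetched, so it runs without a network connection.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PopupText {
    /** Absolute URL of the first link inside the selected block, or null if there is none. */
    static String firstLinkUrl(Document page, String blockSelector) {
        Element link = page.select(blockSelector + " a[href]").first();
        return link == null ? null : link.absUrl("href");
    }

    /** Text of the element with the given id, or null if the page has no such element. */
    static String textById(Document page, String id) {
        Element element = page.getElementById(id);
        return element == null ? null : element.text();
    }

    public static void main(String[] args) {
        // Made-up daily page; the base URI lets absUrl resolve the relative href.
        Document day = Jsoup.parse(
            "<div class=\"sb\"><a href=\"/en/wol/bc/x\">Ps. 1:1</a></div>",
            "http://m.wol.jw.org/");
        System.out.println(firstLinkUrl(day, ".sb")); // http://m.wol.jw.org/en/wol/bc/x

        // Made-up linked page; only the paragraph with id "p101" is wanted.
        Document linked = Jsoup.parse(
            "<p id=\"p100\">Other verse</p><p id=\"p101\">Popup verse</p>");
        System.out.println(textById(linked, "p101")); // Popup verse
    }
}
```

In the app, the second document would come from Jsoup.connect(url).get() on a background thread, and the id to look up would have to be taken from the page itself rather than hard-coded, since it presumably differs from link to link.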

This is the solution to your question. Hope it helps.
