Parsing elements within elements in jsoup?

I recently started programming Android in Java (Eclipse), and I'm trying to make a simple reader app using jsoup.

I've got HTML like this:

<article id="id" class="artikel">
<h1>Title</h1>
<p>paragraph 1</p>
<p>paragraph 2</p>
<p>paragraph 3</p>
</article>

<article id="id">
<p>comment1</p>
</article>

<article id="id">
<p>comment2</p>
</article>

The number of paragraphs varies, and so does the number of comments. I want to get all the paragraphs within the article and none of the comments. The real article is always the first article tag, so I'm using first() in combination with a wildcard to get it.

Here is the method I'm using:

public String GetArticleBody(Document adoc)
{
    //Document totalbody = (Document)adoc.select("article *").first();
    //Element totalbody = adoc.select("article *").first();
    //Elements paragraphs = adoc.select("article * > p);
    Elements paragraphs = adoc.select(".article* p");
    String body = "test";
    for (Element p : paragraphs)
    {
        body = StringAttacher(body, p.text());
    }
    System.out.println(body);
    return body;
}

As you can see, I've been experimenting with the methods from the jsoup cookbook and a few I found on Stack Overflow. All I've ever gotten back from any of them is the word "test" or nothing at all.

Could someone point me in the right direction to get those paragraphs?

The issue is that the selector in your first statement is wrong.

. is the class selector, so either you have "article" spelled wrong (the class in your HTML is "artikel"), or you have a . where you shouldn't.

Try this instead:

public String GetArticleBody(Document adoc)
{
    // Take the first <article> element only, then collect the <p> elements inside it.
    Elements paragraphs = adoc.select("article").first().select("p");
    String body = "test";
    for (Element p : paragraphs)
    {
        body = StringAttacher(body, p.text());
    }
    System.out.println(body);
    return body;
}

This will get you the paragraphs in the first article.
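For reference, here is a minimal self-contained sketch of the same approach that can be run outside Android. The class name FirstArticleParagraphs and the newline separator are assumptions made for illustration. Since the StringAttacher helper from the question is not shown, the sketch uses a StringBuilder instead, and it returns an empty string when the page has no article element rather than failing on a null from first().

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical class name, used only for this illustration.
class FirstArticleParagraphs {

    // Returns the text of every <p> inside the first <article>, one per line.
    static String getArticleBody(Document doc) {
        Element firstArticle = doc.select("article").first();
        if (firstArticle == null) {
            return ""; // no <article> in the document
        }
        StringBuilder body = new StringBuilder();
        for (Element p : firstArticle.select("p")) {
            if (body.length() > 0) {
                body.append('\n'); // assumed separator between paragraphs
            }
            body.append(p.text());
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Sample markup modelled on the HTML from the question.
        String html =
              "<article id=\"id\" class=\"artikel\">"
            + "<h1>Title</h1><p>paragraph 1</p><p>paragraph 2</p><p>paragraph 3</p>"
            + "</article>"
            + "<article id=\"id\"><p>comment1</p></article>"
            + "<article id=\"id\"><p>comment2</p></article>";

        Document doc = Jsoup.parse(html);
        // Prints the three article paragraphs and neither comment.
        System.out.println(getArticleBody(doc));
    }
}
```

The comments are skipped because the second select("p") runs only against the first article element, not against the whole document.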

Also, it often helps to remember that jsoup selectors follow the same syntax as CSS selectors (and a subset of jQuery selectors). Any knowledge you have from those areas is directly usable with jsoup.
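To make the tag-versus-class distinction concrete, the sketch below counts the matches for a few selectors against markup modelled on the question's HTML. The class name SelectorComparison and the count helper are hypothetical and exist only for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical class name, used only for this illustration.
class SelectorComparison {

    // Sample markup modelled on the question: one real article, one comment.
    static final String HTML =
          "<article id=\"id\" class=\"artikel\">"
        + "<h1>Title</h1><p>paragraph 1</p><p>paragraph 2</p>"
        + "</article>"
        + "<article id=\"id\"><p>comment1</p></article>";

    // Returns how many elements the given CSS query matches in the sample markup.
    static int count(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Tag selector: every <p> inside any <article>, comments included.
        System.out.println(count("article p"));           // 3
        // Class selector: only <p> inside elements with class "artikel".
        System.out.println(count(".artikel p"));          // 2
        // Tag and class combined, direct children only.
        System.out.println(count("article.artikel > p")); // 2
        // No element has the class "article", so nothing matches.
        System.out.println(count(".article p"));          // 0
    }
}
```

If the class attribute on the real article is reliable, ".artikel p" is an alternative to selecting the first article and then its paragraphs.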
