Parsing table data with jsoup

Question

I am using jsoup in my Android app to parse HTML, but now I need to parse table data and cannot get it to work. I have tried many approaches without success, so I am asking here in case anyone has experience with this.

Here is part of my HTML:

<div id="editacia_jedla">
    <h2>My header</h2>
    <h3>My sub header</h3>

    <table border="0" class="jedalny_listok_tabulka" cellpadding="2" cellspacing="1">
    <tr>
        <td width="100" class="menu_nazov neparna" align="left">Food Menu 1</td>
        <td class="jedlo neparna" align="left">vegetable and beef
        <div class="jedlo_box_alergeny">Allergens: <a href="#" class="alergen_1">1</a>, <a href="#" class="alergen_3">3</a></div>
        </td>
    </tr>
    <tr>
        <td width="100" class="menu_nazov parna" align="left">Food Menu 2</td>
        <td class="jedlo parna" align="left">Potato salad and pork
        <div class="jedlo_box_alergeny">Allergens: <a href="#" class="alergen_6">6</a></div>
        </td>
    </tr>
    </table>  
    etc
</div>

My Java/Android code:

try {
            String tableHtmlCode="";
            Document fullHtmlDocument = Jsoup.connect(urlOfFoodDay).get();
            Element elm1 = fullHtmlDocument.select("#editacia_jedla").first();
            for( Element element : elm1.children() )
            {
                tableHtmlCode+=element.getElementsByIndexEquals(2); //this set table content because 0=h2, 1=h3
            }
            Document parsedTableDocument = Jsoup.parse(tableHtmlCode);
            //Element th = parsedTableDocument.select("td[class=jedlo neparna]").first();  THIS IS BAD
            String foodContent="";
            String foodAllergens="";
        }

Now I want to extract the text "vegetable and beef" into the string foodContent, and the numbers 1, 3 (together) from the div with class jedlo_box_alergeny into the string foodAllergens. Can someone help? I would be very grateful for any ideas.

Answer 1

Select the table with class jedalny_listok_tabulka and loop over its td elements.

Each td with class jedlo is the parent of the a elements that hold the allergen numbers. So you loop over the a elements inside that td to collect your numbers, something like:

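// doc is the parsed Document (fullHtmlDocument in the question)
Elements myElements = doc.getElementsByClass("jedalny_listok_tabulka")
        .first().getElementsByTag("td");
for (Element element : myElements) {
    // Only the cells with class "jedlo" hold the food text
    if (element.hasClass("jedlo")) {
        // ownText() skips the text of the nested allergen div
        String foodContent = element.ownText();
        String foodAllergen = "";

        // Each allergen number is the text of an <a> inside the cell
        for (Element href : element.getElementsByTag("a")) {
            foodAllergen += " " + href.text();
        }

        System.out.println(foodContent + " : " + foodAllergen);
    }
}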

Output:

vegetable and beef :  1 3
Potato salad and pork :  6
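
The same extraction can also be written with jsoup's CSS selectors, which may be easier to read than the class-name check above. The following is only a sketch under some assumptions: the class name FoodTableParser and the method parseFoods are made up for illustration, the HTML is passed in as a string instead of being fetched with Jsoup.connect, and jsoup has to be on the classpath. It joins the allergen numbers with ", " so that the first row gives "1, 3" as asked in the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, not part of jsoup or of the original answer.
class FoodTableParser {

    // Returns one "food : allergens" line per food cell in the menu table.
    static List<String> parseFoods(String html) {
        Document doc = Jsoup.parse(html);
        List<String> rows = new ArrayList<>();
        // Match only <td> cells that carry the class "jedlo" inside the menu table.
        for (Element cell : doc.select("table.jedalny_listok_tabulka td.jedlo")) {
            // Text of the cell itself, without the text of the nested <div>.
            String foodContent = cell.ownText();
            // Collect the text of every allergen link in this cell.
            List<String> numbers = new ArrayList<>();
            for (Element link : cell.select("div.jedlo_box_alergeny a")) {
                numbers.add(link.text());
            }
            String foodAllergens = String.join(", ", numbers);
            rows.add(foodContent + " : " + foodAllergens);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Shortened copy of the first table row from the question.
        String html = "<div id=\"editacia_jedla\"><table class=\"jedalny_listok_tabulka\">"
                + "<tr><td class=\"menu_nazov neparna\">Food Menu 1</td>"
                + "<td class=\"jedlo neparna\">vegetable and beef"
                + "<div class=\"jedlo_box_alergeny\">Allergens: "
                + "<a href=\"#\" class=\"alergen_1\">1</a>, <a href=\"#\" class=\"alergen_3\">3</a>"
                + "</div></td></tr></table></div>";
        for (String row : parseFoods(html)) {
            System.out.println(row); // prints "vegetable and beef : 1, 3"
        }
    }
}
```

In the Android app you would pass the fetched page instead, for example parseFoods(fullHtmlDocument.html()), or change the method to accept the Document directly. String.join needs Java 8; on older setups a StringBuilder loop does the same job.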

Answer 1: 2 votes, accepted, 2014-02-05 13:48:38
