
Parsing HTML table td contents using jsoup in Android

I have some HTML table contents, and for my application I want to parse them using jsoup on Android. I am new to jsoup, though, and I can't get the HTML to parse properly.

HTML data:

<table id="box-table-a" summary="Tracking Result">
   <thead>
     <tr>
        <th width="20%">AWB / Ref. No.</th>
        <th width="30%">Status</th>
        <th width="30%">Date Time</th>
        <th width="20%">Location</th>
     </tr>
     </thead>
      <tbody>           
        <tr>
          <td width="20%" nowrap="nowrap" class="click"><a href="Javascript:void(0);" onclick="Javascript:   document.frm_Z45681583.submit();">Z45681583</a></td>
                <td width="30%" nowrap="nowrap" class="click">
                IN TRANSIT<div id='ntfylink' style='display:block; text-decoration:blink'><a href='#' class='topopup' name='modal' style='text-decoration:none'><font face='Verdana' color='#DF0000'><blink>Notify Me</blink></font></a></div>                  
                </td>
                <td width="30%">
              Sat, Jan, 31, 2015 07:09 PM                   
                </td>
                <td width="20%">DELHI</td>
              </tr>

            </tbody>
          </table>

From this table I need the "td" contents.

Any help would be greatly appreciated.

The comments in the source code below describe each step.

import java.io.File;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Entities.EscapeMode;
import org.jsoup.select.Elements;

private static String test(String htmlFile) {
    StringBuilder tdContents = new StringBuilder();

    try {
        File input = new File(htmlFile);
        Document doc = Jsoup.parse(input, "ASCII", "");
        doc.outputSettings().charset("ASCII");
        doc.outputSettings().escapeMode(EscapeMode.base);

        // Get the table with id = box-table-a
        Element table = doc.getElementById("box-table-a");

        if (table != null) {
            // Get all td elements inside the table
            Elements tdEles = table.getElementsByTag("td");

            // ownText() returns only the text directly inside each td, so text
            // nested in child elements (such as the <a> in the first cell or the
            // "Notify Me" link) is skipped; use text() if that text is needed
            for (Element e : tdEles) {
                String ownText = e.ownText();

                // Join the non-empty cells with "||" as the delimiter
                if (!ownText.isEmpty()) {
                    if (tdContents.length() > 0) {
                        tdContents.append("||");
                    }
                    tdContents.append(ownText);
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    return tdContents.toString();
}

You can then work with the returned String in your TextView. The td contents are delimited by ||. Use String.split() to get each value if you want.

String[] data = tdContents.split("\\|\\|");
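As a rough sketch of the delimiter handling described above, the join and split steps could be wrapped in a small helper. The class name CellFields and the sample string are assumptions made for illustration (they are not part of the original answer), and the sample only approximates what test() might return for the table in the question. Pattern.quote() is used here as an alternative to escaping each | by hand, since split() treats its argument as a regular expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original answer.
final class CellFields {
    // Same delimiter as used when building the td contents string.
    static final String DELIMITER = "||";

    // String.join puts the delimiter only between items, so there is
    // no trailing "||" to strip off afterwards.
    static String join(List<String> cells) {
        return String.join(DELIMITER, cells);
    }

    // split() takes a regex and '|' is a regex metacharacter, so the
    // delimiter is quoted to be matched literally.
    static List<String> split(String joined) {
        List<String> fields = new ArrayList<>();
        if (joined.isEmpty()) {
            return fields; // nothing to split
        }
        for (String part : joined.split(Pattern.quote(DELIMITER))) {
            fields.add(part.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample value, shaped like the delimited td contents.
        String tdContents = "IN TRANSIT||Sat, Jan, 31, 2015 07:09 PM||DELHI";
        for (String field : split(tdContents)) {
            System.out.println(field);
        }
    }
}
```

Each element of the resulting list can then be assigned to its own view or model field instead of showing the raw delimited string.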
