
Limiting what jsoup retrieves

I'm having fun learning to use jsoup and have successfully retrieved and displayed data from a website, but I would now like some further guidance if anyone can help.

The code below returns all of the table rows (30+). How can I retrieve only, say, the first 10 of those rows?

Also, when those rows and their data are returned, there are gaps/spaces between the data within each row. The spaces between rows are fine; it is the spaces within a row that I want to get rid of. How can I omit those spaces/gaps?

My code so far:

package com.example.shiftzer;

import java.io.IOException;
import java.util.ArrayList;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.content.SharedPreferences;
import android.os.AsyncTask;
import android.os.Bundle;
import android.widget.ArrayAdapter;
import android.widget.ListView;
import android.widget.TextView;

public class MainActivity extends Activity{

TextView textView1;
ListView shippingList; 

  public static final String APP_PREFERENCES = "AppPrefs";
    SharedPreferences settings; 
    SharedPreferences.Editor prefEditor;

   @Override
     public void onCreate(Bundle savedInstanceState) {         
        super.onCreate(savedInstanceState);    
        setContentView(R.layout.main_activity);
        //rest of the code

       textView1 = (TextView)findViewById(R.id.textView1);
       shippingList = (ListView) findViewById(R.id.listView1);

       settings = getSharedPreferences(APP_PREFERENCES, MODE_PRIVATE);
       prefEditor = settings.edit();

       new VTSTask().execute(); // starts the AsyncTask in private class VTSTask to get shipping info
    }

   private class VTSTask extends AsyncTask<Void, Void, ArrayList<String>> {
       ArrayList<String> arr_shipping=new ArrayList<String>();
        /**
         * @param args
         */
        @Override
        protected ArrayList<String>  doInBackground(Void... params) {

            Document doc;
            String shippingList;

            try {
                doc =   Jsoup.connect("https://vts.mhpa.co.uk/main_movelistb.asp").get(); 
                Elements tableRows = doc.select("table.dynlist tr   td");

                 for (Element element : tableRows) {
                      shippingList = element.text();
                      arr_shipping.add(shippingList);// add value to  ArrayList
                    } 
                 } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }      

            return arr_shipping;//<< Return ArrayList from here
        }

         @Override
         protected void onPostExecute(ArrayList<String> result) {        
             //TextView tVShipping= (TextView)findViewById(R.id.textView2);

             shippingList = (ListView) findViewById(R.id.listView1);
             ArrayAdapter<String> adapter = 
                 new ArrayAdapter<String>(MainActivity.this, 
                                           android.R.layout.simple_list_item_1, 
                                          android.R.id.text1);

             for (String shipping_result : result)
             {
                adapter.add(shipping_result);
             }

             // Assign adapter to ListView
             shippingList.setAdapter(adapter); 

          }
    }


}

Thank you.

EDIT:

            try {
                doc = Jsoup.connect("https://vts.mhpa.co.uk/main_movelistb.asp").get();
                Elements tableRows = doc.select("table.dynlist tr td");

                // Take at most 10 elements; the page may return fewer
                int limit = Math.min(10, tableRows.size());
                for (int i = 0; i < limit; i++) {
                    shippingList = tableRows.get(i).text() + "\n";
                    arr_shipping.add(shippingList); // add value to ArrayList
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }

            return arr_shipping; // return ArrayList from here
        }

Rather than iterating with for (Element element : tableRows), use an indexed loop. Elements has a size() method.

So you should be able to validate against the size first, and then simply

int limit = Math.min(10, tableRows.size());
for (int i = 0; i < limit; i++) {
  tableRows.get(i);
}

to get the first 10 of them (or fewer, if fewer were returned).
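The bounds check described above can also be written without an explicit loop. This is only a sketch in plain Java: the class and method names (FirstRows, firstN) are made up for illustration and are not part of jsoup. It relies on the fact that jsoup's Elements is an ArrayList of Element, so the same helper should accept it like any other List.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jsoup: keeps at most the first n items of a list.
class FirstRows {

    // Returns a copy of the first n items, or of the whole list when it holds fewer than n.
    static <T> List<T> firstN(List<T> items, int n) {
        int end = Math.min(Math.max(n, 0), items.size());
        return new ArrayList<>(items.subList(0, end));
    }

    public static void main(String[] args) {
        // Sample data standing in for table rows
        List<String> rows = List.of("row 1", "row 2", "row 3");
        System.out.println(firstN(rows, 2));   // the first two rows
        System.out.println(firstN(rows, 10));  // all three rows, no exception
    }
}
```

Clamping the end index is what avoids the IndexOutOfBoundsException that a fixed i < 10 loop would throw on a short table.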

As for the spaces: before you store the values in your ArrayList, use a regular expression to remove them.

http://www.vogella.com/articles/JavaRegularExpressions/article.html
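A minimal sketch of that regular-expression cleanup, in plain Java. It assumes the gaps come from runs of whitespace, non-breaking spaces, or empty cells; that has not been checked against the actual page. The class and method names (RowText, clean, joinRow) and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for tidying the text of one table row before storing it.
class RowText {

    // Collapses any run of whitespace (including non-breaking spaces) to one space, then trims.
    static String clean(String cell) {
        return cell.replaceAll("[\\s\\u00A0]+", " ").trim();
    }

    // Cleans each cell, drops cells that end up empty, and joins the rest with single spaces.
    static String joinRow(List<String> cells) {
        List<String> kept = new ArrayList<>();
        for (String cell : cells) {
            String cleaned = clean(cell);
            if (!cleaned.isEmpty()) {
                kept.add(cleaned);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Made-up cell values standing in for the text of each td in one row
        List<String> cells = List.of("12:30", "", " \u00A0", "MV  Example");
        System.out.println(joinRow(cells));
    }
}
```

With jsoup, the cell texts for a row would typically come from selecting the td elements of that row and calling text() on each one; the result of joinRow is what would then be added to the ArrayList.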

Try this:

    import java.io.IOException;
    import java.util.ArrayList;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.select.Elements;

    public class test
    {
        static ArrayList<String> arr_shipping = new ArrayList<String>();

        public static void main(String args[])
        {
            try {
                Document doc = Jsoup.connect("https://vts.mhpa.co.uk/main_movelistb.asp").timeout(600000).get();
                // Select whole rows, skipping the first row (sibling index 0)
                Elements tableRows = doc.select("table.dynlist tr:not(:eq(0))");

                // Print at most 10 rows; the table may contain fewer
                int limit = Math.min(10, tableRows.size());
                for (int i = 0; i < limit; i++) {
                    String shippingList = tableRows.get(i).text() + "\n";
                    arr_shipping.add(shippingList); // add value to ArrayList
                    System.out.println(shippingList);
                }
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }

Try this:

doc.select("table.dynlist tr:lt(10)");

to limit the results to rows whose sibling index is less than 10.

Reference
