jsoup select doesn't use the whole html?

What am I doing wrong?

Android code:

ArrayList<String> plan_table = new ArrayList<>();
Element table = doc.select("table").get(1); // First table: Untis banner and school data (address, etc.); second table: the plan, so load index 1
Elements rows = table.select("tr");
Log.i("SchollgymPlanThread", "These are the rows: " + rows.toString());

for (int i = 1; i < rows.size(); i++) { // the first row holds the column names, so skip it
    Element row = rows.get(i);
    Elements cols = row.select("td");
    if (cols.isEmpty()) { continue; } // nothing to read in a row without <td> cells
    String firstCell = cols.get(0).text();
    //Log.i("SchollgymPlanThread", firstCell);
    plan_table.add(firstCell);
    if (Pattern.matches("^Klasse .*", firstCell)) {
        // A new class section starts: create its map and remember it as the current class
        PlanParsed.put(firstCell, new LinkedHashMap<String, List>());
        current_class = firstCell;
        continue;
    }
    if (current_class != null) {
        if (cols.size() < 3) { continue; } // cols.get(2) below needs at least three cells
        List<String> tmpList = new ArrayList<String>();
        for (int i2 = 1; i2 < cols.size(); i2++) {
            if (i2 == 2) { continue; } // the lesson hour becomes the key, so it is not added to the list
            tmpList.add(cols.get(i2).text());
        }
        Log.i("SchollgymPlanThread", tmpList.toString());
        PlanParsed.get(current_class).put(cols.get(2).text(), tmpList); // PlanParsed[current_class] = {lesson_hour: lesson_attributes}
    }

    //if (row.className().equals("list odd")) {Log.i("SchollgymPlanThread","This is a class: "+firstCell);}
    //if (cols.get(7).text().equals("down")) {
    //    plan_table.add(cols.get(5).text());
    //}
}

I didn't include the whole Java code, but this is where the problem occurs. At line 4 (the Log.i call) it prints the HTML with the td and tr elements, but the output stops suddenly. The last line of the output is:

<td cla

Is there anything wrong? I already checked the source website.

How do you read in the HTML with Jsoup? I ask because you may be hitting the size limit for the loaded document. Jsoup limits the body to 1 MB (at the time of writing) unless told otherwise via the maxBodySize() method; a value of 0 removes the limit. So you may want to do this:

Document doc = Jsoup.connect("YOUR_URL").maxBodySize(0).get(); 
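
To illustrate why a size cap can make the markup appear to stop in the middle of a tag, here is a minimal stand-alone sketch using only the Java standard library. It is not Jsoup's actual implementation: BodyCapDemo and readCapped are hypothetical names made up for this example, and the sample HTML is invented. The only behaviour borrowed from the call above is that a limit of 0 means "no limit".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BodyCapDemo {

    // Hypothetical helper: reads at most maxBytes bytes from the stream.
    // A limit of 0 is treated as "no limit", like maxBodySize(0) above.
    static String readCapped(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8];
        int remaining = maxBytes;
        while (maxBytes == 0 || remaining > 0) {
            // Never ask for more bytes than the cap still allows
            int want = (maxBytes == 0) ? buffer.length : Math.min(buffer.length, remaining);
            int read = in.read(buffer, 0, want);
            if (read == -1) {
                break; // end of input reached before the cap
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Invented sample markup, standing in for the downloaded page
        byte[] html = "<table><tr><td class=\"list\">Klasse 5a</td></tr></table>"
                .getBytes(StandardCharsets.UTF_8);

        // With a cap of 18 bytes the input ends in the middle of a tag
        System.out.println(readCapped(new ByteArrayInputStream(html), 18)); // prints <table><tr><td cla

        // With a cap of 0 the whole input is read
        System.out.println(readCapped(new ByteArrayInputStream(html), 0));
    }
}
```

If the rows are still incomplete after setting maxBodySize(0), the limit was probably not the cause and it is worth checking how the document is fetched in the first place.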
