Thank you in advance for your time. The code is supposed to connect to the website and scrape the OS model from the table row that contains a word entered by the user. It should search for the word, find the matching row, and read the OS attribute from that row. I don't see why my code is not working and would appreciate some help.
Here is the website: http://www.tabletpccomparison.net/
Here is the code:
import java.io.IOException;
import java.util.Iterator;
import java.util.Scanner;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ExtraPart1 {
    public static void main(String args[]) throws IOException{
        Scanner input = new Scanner(System.in);
        String word = "";
        System.out.println("Type in what you are trying to search for.");
        word = input.nextLine();
        System.out.println("This program will find a quality from a website for it");
        String URL = "http://www.tabletpccomparison.net/";
        Document doc = Jsoup.connect(URL).get();
        Elements elements = doc.select("a");
        for(Element e : elements){
            if(e.equals(word)){
                String next_word = e.getElementsByClass("tableJX2ope_sis").text();
                System.out.print(next_word);
            }
        }
    }
}
The problem lies here:
if(e.equals(word)){
    String next_word = e.getElementsByClass("tableJX2ope_sis").text();
    System.out.print(next_word);
}
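Here e is an Element, but it is compared to a String. Objects of different types are never equal, so the condition is always false and nothing is printed. Compare the element's text instead: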
if(e.text().equals(word)) {
// ...
}
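The type mismatch can be reproduced without Jsoup. The following is a minimal sketch, not the library's code: Node is a hypothetical stand-in for Element, and matchesByObject and matchesByText are illustrative names that do not appear in the original post.

```java
public class EqualsTypeMismatch {
    // Hypothetical stand-in for a parsed node that wraps some text,
    // playing the role that Element plays in the code above.
    static final class Node {
        private final String text;

        Node(String text) {
            this.text = text;
        }

        String text() {
            return text;
        }
    }

    // Compares the node object itself to a String. The two are of
    // different types, so this returns false for every input.
    static boolean matchesByObject(Node node, String word) {
        return node.equals(word);
    }

    // Compares the node's text to the String, which is what was intended.
    static boolean matchesByText(Node node, String word) {
        return node.text().equals(word);
    }

    public static void main(String[] args) {
        Node link = new Node("iPad");
        System.out.println(matchesByObject(link, "iPad")); // prints false
        System.out.println(matchesByText(link, "iPad"));   // prints true
    }
}
```

Note that String.equals is an exact, case-sensitive comparison, so with this fix the user has to type the link text exactly as it appears on the page.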
You may simplify the for loop like this:
String cssQuery = String.format("a:containsOwn(%s)", word);
Elements elements = doc.select(cssQuery);
for(Element e : elements){
    String nextWord = e.getElementsByClass("tableJX2ope_sis").text();
    System.out.print(nextWord);
}
Your CSS selector should directly target the table you are trying to scrape. By selecting only on "a", you have to iterate over every link in the document.
String selector = String.format(
    "table.tableJX tr:contains(%s) > td.tableJX2ope_sis > span.field", word);
for (Element os : doc.select(selector))
    System.out.println(os.ownText());
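The row-scoped selector can be tried offline against an inline HTML fragment. This is only a sketch: the markup in SAMPLE_HTML is an assumption modelled on the class names used in the selector (the real page may be structured differently), and the class name, the findOs helper, and the tablet names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RowScopedSelectorSketch {
    // Assumed markup: one table row per tablet, with the OS held in a
    // span.field inside td.tableJX2ope_sis. Not copied from the real site.
    static final String SAMPLE_HTML =
        "<table class=\"tableJX\">"
        + "<tr><td><a href=\"#\">Alpha Tab</a></td>"
        + "<td class=\"tableJX2ope_sis\"><span class=\"field\">Android</span></td></tr>"
        + "<tr><td><a href=\"#\">Beta Pad</a></td>"
        + "<td class=\"tableJX2ope_sis\"><span class=\"field\">Windows</span></td></tr>"
        + "</table>";

    // Returns the OS text of every row whose text contains the given word.
    static List<String> findOs(Document doc, String word) {
        String selector = String.format(
            "table.tableJX tr:contains(%s) > td.tableJX2ope_sis > span.field", word);
        List<String> result = new ArrayList<>();
        for (Element os : doc.select(selector)) {
            result.add(os.ownText());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(findOs(doc, "Alpha")); // prints [Android]
        System.out.println(findOs(doc, "Beta"));  // prints [Windows]
    }
}
```

Because the word is pasted straight into the selector string, input containing selector syntax such as parentheses may fail to parse, so it is worth validating the input before building the query. Also, :contains matches case-insensitively and on substrings, unlike String.equals.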