Author parsing with JSOUP

Here is my HTML input:

  <!-- left panel --> 
  <div class="left-panel"> 
    <p class="article-published"> 1. júl 2015 o 17:35 &nbsp;&nbsp; Marek Hudec, Dávid Tvrdoň </p>
  </div>

and the code:

// Run the selector once and reuse the result.
Elements description = doc.select("p[class=article-published]");
if (!description.isEmpty()) {
    for (Element link : description) {
        author4 = link.text();
    }
    System.out.println("AUTHORS :" + author4);
}

I would like the output to contain only the authors' names: Marek Hudec, Dávid Tvrdoň. So far I have not been able to get that. Can someone help me, please? Thank you.

All you have to do is parse the text you get from Jsoup and cut out the part you want. In the code below, I modified your code to take the data at a specific index.

import java.util.Arrays;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class KolosParsor {
    public static void main(String[] args) {
        String author4 = null;
        Document doc = Jsoup.parse("<div class=\"left-panel\">"
                + "<p class=\"article-published\"> 1. júl 2015 o 17:35 &nbsp;&nbsp; Marek Hudec, Dávid Tvrdoň </p>"
                + "</div>");
        // Run the selector once and reuse the result.
        Elements description = doc.select("p[class=article-published]");
        if (!description.isEmpty()) {
            for (Element link : description) {
                author4 = link.text();
            }
            // &nbsp; may come back as U+00A0, which \s does not match, so turn it into a plain space first.
            // The limit of 6 keeps everything after the five date/time tokens together in parts[5].
            // This assumes the text always starts with exactly five date/time tokens.
            String[] parts = author4.replace('\u00a0', ' ').trim().split("\\s+", 6);
            System.out.println("DATA :" + Arrays.asList(parts));
            System.out.println("AUTHORS :" + parts[5]);
        }
    }
}
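
Picking a fixed index depends on the date always having the same number of words. As an alternative, here is a minimal sketch that works on the string returned by `link.text()` and keeps everything after the time stamp. It uses only the standard library. The names `AuthorExtractor`, `extractAuthors` and `splitAuthors` are made up for this example, and it assumes the paragraph always has the form "date, time, authors" and that `&nbsp;` may arrive as the character U+00A0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: it only post-processes the extracted text.
class AuthorExtractor {
    // Matches a time such as "17:35". Assumption: the author list follows the time.
    private static final Pattern TIME = Pattern.compile("\\b\\d{1,2}:\\d{2}\\b");

    // Returns the text after the first time stamp, or "" if no time stamp is found.
    static String extractAuthors(String text) {
        // Treat non-breaking spaces (U+00A0, from &nbsp;) like ordinary spaces.
        String normalized = text.replace('\u00a0', ' ');
        Matcher m = TIME.matcher(normalized);
        if (!m.find()) {
            return "";
        }
        return normalized.substring(m.end()).trim();
    }

    // Splits a comma-separated author string into trimmed, non-empty names.
    static List<String> splitAuthors(String authors) {
        List<String> names = new ArrayList<>();
        for (String part : authors.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for the value of link.text() in the code above.
        String text = "1. júl 2015 o 17:35 \u00a0\u00a0 Marek Hudec, Dávid Tvrdoň";
        String authors = extractAuthors(text);
        System.out.println("AUTHORS :" + authors);
        System.out.println(splitAuthors(authors));
    }
}
```

In the original loop you would call `AuthorExtractor.extractAuthors(link.text())` instead of assigning `link.text()` directly. If the site ever prints an article without a time, the method returns an empty string, so check for that before using the result.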
