
How to get the text from a tag that does not have an ID or class

I want to extract the "Movie" text from this snippet using jsoup:

As you can see, the second span tag has neither an ID nor a class, and neither does the first span. How can I retrieve that text?

Thank you.

<span>                                                             
</span><span><span class="contentTitle">
Program Type:</span>
<span style="font-size: 14px;">
Movie</span>
<br />
</span><span id="MainContent_trProgramCategories"><span class="contentTitle">
 Categories:</span>&nbsp; 
<span style="font-size: 14px;">Horror, Thriller
</span>

Try this:

Element element = doc.select("#MainContent_trProgramCategories  .contentTitle").get(0).nextElementSibling();
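Note that the selector above is anchored at #MainContent_trProgramCategories, so nextElementSibling() returns the span holding "Horror, Thriller" rather than "Movie". A minimal sketch of one way to reach "Movie" instead is to find the label span by its own text with jsoup's :containsOwn pseudo-selector and then step to the next element sibling. The class name ProgramTypeExtractor, the helper valueForLabel, and the HTML constant (a trimmed copy of the snippet from the question) are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ProgramTypeExtractor {

    // Trimmed copy of the markup from the question (hypothetical test fixture).
    static final String HTML =
          "<span></span>"
        + "<span><span class=\"contentTitle\">\nProgram Type:</span>\n"
        + "<span style=\"font-size: 14px;\">\nMovie</span>\n<br /></span>"
        + "<span id=\"MainContent_trProgramCategories\">"
        + "<span class=\"contentTitle\">\n Categories:</span>&nbsp;\n"
        + "<span style=\"font-size: 14px;\">Horror, Thriller\n</span></span>";

    // Returns the text of the span that follows the label span,
    // or null when the label or its sibling is missing.
    static String valueForLabel(Document doc, String label) {
        Element title = doc.select("span.contentTitle:containsOwn(" + label + ")").first();
        if (title == null) {
            return null;
        }
        Element value = title.nextElementSibling(); // skips text nodes such as &nbsp;
        return value == null ? null : value.text(); // text() normalizes and trims whitespace
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(valueForLabel(doc, "Program Type")); // prints "Movie"
        System.out.println(valueForLabel(doc, "Categories"));   // prints "Horror, Thriller"
    }
}
```

Matching on the label text avoids depending on an id or class on the value span, at the cost of breaking if the page changes the label wording.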

You need to keep whittling down the data by playing with the select(...) method. For instance, simply doing:

Elements myEles = doc.select("div[id=MainContent_UpdatePanel2] td");
String text = myEles.text();

System.out.println(text);

Will get you most of the stuff you're likely interested in.
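As a rough illustration of that narrowing process on the snippet from the question, the sketch below starts with a broad structural selector and then tightens it. It assumes only the markup shown in the question; the class name WhittleDown, its two helper methods, and the HTML constant are hypothetical.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class WhittleDown {

    // Trimmed copy of the markup from the question (hypothetical test fixture).
    static final String HTML =
          "<span></span>"
        + "<span><span class=\"contentTitle\">\nProgram Type:</span>\n"
        + "<span style=\"font-size: 14px;\">\nMovie</span>\n<br /></span>"
        + "<span id=\"MainContent_trProgramCategories\">"
        + "<span class=\"contentTitle\">\n Categories:</span>&nbsp;\n"
        + "<span style=\"font-size: 14px;\">Horror, Thriller\n</span></span>";

    // Step 1 (broad): every span that directly follows a label span.
    static List<String> allValues(Document doc) {
        return doc.select("span.contentTitle + span").eachText();
    }

    // Step 2 (narrower): the same, but only inside a wrapper span without an id.
    static List<String> valuesInWrapperWithoutId(Document doc) {
        return doc.select("span:not([id]) > span.contentTitle + span").eachText();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        System.out.println(allValues(doc));
        System.out.println(valuesInWrapperWithoutId(doc));
    }
}
```

On this snippet the first selector yields two values ("Movie" and "Horror, Thriller") and the second yields only "Movie", because the Categories wrapper carries an id. Printing the intermediate results at each step is a quick way to see how far a selector still needs tightening.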

You can use what "Hovercraft Full Of Eels" suggested.

For future use cases, though, the easiest way to get the CSS path or XPath for an element is to use the Firebug extension.


Click the inspect icon (the mouse pointer next to the bug icon), then click the element on the page whose value you want to retrieve. The XPath/CSS text box in the next row will then show the path that you can use.

Simply copy that text and paste it into the code. Note that select(...) expects a CSS selector, so copy the CSS path rather than the XPath:

doc.select("HERE PASTE THE CSS PATH THAT YOU COPIED FROM FIREBUG").text();

If you are using Chrome, you can:

  1. right-click on an element that you want to retrieve the text value from
  2. choose "Inspect Element"
  3. right-click again on the highlighted element in the debugger
  4. choose "Copy XPath"
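A copied XPath is not a CSS selector, so it cannot be passed to select(...). The sketch below shows one way to evaluate it, assuming jsoup 1.14.3 or newer, where Element.selectXpath is available. The class name XpathLookup, the helper textAtXpath, the HTML constant, and both XPath strings are illustrative; the path copied from a real page will look different.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class XpathLookup {

    // Trimmed copy of the markup from the question (hypothetical test fixture).
    static final String HTML =
          "<span></span>"
        + "<span><span class=\"contentTitle\">\nProgram Type:</span>\n"
        + "<span style=\"font-size: 14px;\">\nMovie</span>\n<br /></span>"
        + "<span id=\"MainContent_trProgramCategories\">"
        + "<span class=\"contentTitle\">\n Categories:</span>&nbsp;\n"
        + "<span style=\"font-size: 14px;\">Horror, Thriller\n</span></span>";

    // Evaluates an XPath 1.0 expression and returns the combined text of the matches
    // (an empty string when nothing matches). Requires jsoup 1.14.3+.
    static String textAtXpath(Document doc, String xpath) {
        return doc.selectXpath(xpath).text();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);

        // Positional path, in the style a browser's "Copy XPath" tends to produce.
        System.out.println(textAtXpath(doc, "/html/body/span[2]/span[2]"));

        // Label-based path: less sensitive to elements being added or reordered.
        System.out.println(textAtXpath(doc,
            "//span[@class='contentTitle'][contains(., 'Program Type')]"
            + "/following-sibling::span[1]"));
    }
}
```

Keep in mind that the browser shows the DOM after scripts have run, while jsoup parses the raw HTML response, so a positional path copied from the browser may not match what jsoup sees.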
