
Get text from an HTML tag using a single class name when the tag has multiple classes

I have a line of HTML with tags nested inside tags, and a single tag may contain multiple classes. I need to extract the text using a single class name (I only know one of the class names).

<p class="Body1"><span class="style3"></span><span class="style1">W</span><span class="Allsmall style5">extract this text </span><span class="style5">unwanted text </span></p>

I only know the class name Allsmall, and I want to extract the text "extract this text" from this HTML line using Jsoup in Java.

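You can use the selector syntax to retrieve a specific element based on its CSS class attribute. A class selector such as span.Allsmall matches any <span> whose class attribute includes Allsmall, even when other classes are present: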

Document doc = Jsoup.parse(
  new File("input.html"),
  "UTF-8",
  "http://sample.com/");

// Retrieve the first <span> element that has the "Allsmall" class
Element allSmallSpan = doc.select("span.Allsmall").first();
// text() returns the element's text content, here "extract this text"
String text = allSmallSpan.text();
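For a self-contained version that works directly on the HTML line from the question, something like the following sketch should do. The class name AllsmallExtractor, the constant SAMPLE_HTML, and the helper textOfFirstSpanWithClass are made up for illustration; only Jsoup.parse, select, first, and text are Jsoup calls. It assumes Jsoup is on the classpath and returns null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AllsmallExtractor {

    // The HTML line from the question, kept as a string for the demo.
    static final String SAMPLE_HTML =
        "<p class=\"Body1\"><span class=\"style3\"></span>"
        + "<span class=\"style1\">W</span>"
        + "<span class=\"Allsmall style5\">extract this text </span>"
        + "<span class=\"style5\">unwanted text </span></p>";

    // Hypothetical helper: text of the first <span> that has the given class,
    // or null if no such element exists.
    static String textOfFirstSpanWithClass(String html, String className) {
        Document doc = Jsoup.parse(html);
        // "span.Allsmall" also matches class="Allsmall style5"
        Element match = doc.select("span." + className).first();
        // text() normalizes and trims whitespace in the element's text
        return match == null ? null : match.text();
    }

    public static void main(String[] args) {
        System.out.println(textOfFirstSpanWithClass(SAMPLE_HTML, "Allsmall"));
    }
}
```

The null check matters because first() returns null when the selector matches nothing, so calling text() on the result directly would throw a NullPointerException.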
