Parsing String with key value pair using Jsoup Library
I am stuck on how to parse this data into key-value pairs. Please guide me.
<div class="content">
<div class="label">Company Name: </div>
Cartell Chemical Co., Ltd.
<br/>
<div class="label">Business Owner: </div>
Michael Chen
<br/>
<div class="label">Employees: </div>
210
<br/>
<div class="label">Main markets: </div>
North America, Europe, China, South Asia
<br/>
<div class="label">Business Type: </div>
Manufacturer
<br/>
</div>
I need the output in this format. Please guide me using Java with the Jsoup library.
Company Name:Cartell Chemical Co., Ltd.
Business Owner:Michael Chen
Employees:210
Main markets:North America, Europe, China, South Asia
Business Type:Manufacturer
Have a look at the documentation.
Here's a working example:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class StackOverflow20973268 {
    private static String input = "<div class=\"content\">" +
            "<div class=\"label\">Company Name: </div>" +
            "Cartell Chemical Co., Ltd." +
            "<br/>" +
            "<div class=\"label\">Business Owner: </div>" +
            "Michael Chen" +
            "<br/>" +
            "<div class=\"label\">Employees: </div>" +
            "210" +
            "<br/>" +
            "<div class=\"label\">Main markets: </div>" +
            "North America, Europe, China, South Asia" +
            "<br/>" +
            "<div class=\"label\">Business Type: </div>" +
            "Manufacturer" +
            "<br/>" +
            "</div>";

    public static void main(String[] args) {
        Document doc = Jsoup.parse(input);
        // Select every label div inside the content div.
        Elements labels = doc.select("div.content div.label");
        for (Element label : labels) {
            // The label text already ends with ':', so no extra separator is added.
            // The value is the text node that directly follows the label div.
            System.out.println(String.format("%s%s", label.text().trim(),
                    label.nextSibling().outerHtml()));
        }
    }
}
Output:
Company Name:Cartell Chemical Co., Ltd.
Business Owner:Michael Chen
Employees:210
Main markets:North America, Europe, China, South Asia
Business Type:Manufacturer
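If the pairs are needed as data rather than as printed lines, one option is to collect them into a map. The following is only a minimal sketch, assuming every value is a plain text node directly after its label div; the class name LabelMapSketch and the helper toMap are hypothetical names chosen for illustration, and the sample HTML is a shortened copy of the question's markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.LinkedHashMap;
import java.util.Map;

class LabelMapSketch {

    // Hypothetical helper: maps each label (without its trailing colon)
    // to the text that follows it in the document.
    static Map<String, String> toMap(String html) {
        Map<String, String> pairs = new LinkedHashMap<>(); // preserves document order
        Document doc = Jsoup.parse(html);
        for (Element label : doc.select("div.content div.label")) {
            String key = label.text().trim();
            if (key.endsWith(":")) {
                key = key.substring(0, key.length() - 1).trim();
            }
            Node next = label.nextSibling();
            // Assumption: the value is a text node; anything else is stored as empty.
            String value = (next instanceof TextNode) ? ((TextNode) next).text().trim() : "";
            pairs.put(key, value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String html = "<div class=\"content\">"
                + "<div class=\"label\">Company Name: </div>Cartell Chemical Co., Ltd.<br/>"
                + "<div class=\"label\">Employees: </div>210<br/>"
                + "</div>";
        // Print each pair as key:value, the format asked for in the question.
        toMap(html).forEach((k, v) -> System.out.println(k + ":" + v));
    }
}
```

A LinkedHashMap is used here so the pairs come back in the same order as in the HTML; a plain HashMap would also work if order does not matter.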
The Jsoup library is well suited to parsing HTML. It lets you extract values by class or id name, or by traversing the DOM tree. You get a div element and look at its children, each of which is either a text node (containing the text to be parsed) or another element with children of its own. For example, you could do something like this (untested, partly pseudocode):
Document doc = Jsoup.parse(info); // info holds the HTML string
Elements divs = doc.body().getElementsByTag("div");
for (Element divElement : divs) {
    // ownText() returns only the text directly inside this div
    String text = divElement.ownText();
    // nextSibling() returns the node that follows this div (may be null)
    Node next = divElement.nextSibling();
}
In short: find the elements, then either step into them for their text or move to the next or previous sibling.
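The traversal idea can be made concrete by walking the direct children of the content div in order. This is a rough sketch rather than a definitive implementation: it assumes labels are div elements with class "label" and that each value is the first non-blank text node after its label. ChildNodeWalkSketch and walk are hypothetical names, and the sample HTML is a shortened copy of the question's markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

class ChildNodeWalkSketch {

    // Hypothetical helper: walks the children of div.content and pairs
    // each label element with the text node that follows it.
    static List<String> walk(String html) {
        List<String> lines = new ArrayList<>();
        Element content = Jsoup.parse(html).select("div.content").first();
        if (content == null) {
            return lines; // no content div found
        }
        String currentLabel = null;
        for (Node child : content.childNodes()) {
            if (child instanceof Element && ((Element) child).hasClass("label")) {
                // Step into the element to read its text, e.g. "Company Name:".
                currentLabel = ((Element) child).text().trim();
            } else if (child instanceof TextNode && currentLabel != null) {
                String value = ((TextNode) child).text().trim();
                if (!value.isEmpty()) {
                    lines.add(currentLabel + value);
                    currentLabel = null; // wait for the next label
                }
            }
            // Other nodes, such as <br>, are skipped.
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<div class=\"content\">"
                + "<div class=\"label\">Company Name: </div>Cartell Chemical Co., Ltd.<br/>"
                + "<div class=\"label\">Employees: </div>210<br/>"
                + "</div>";
        walk(html).forEach(System.out::println);
    }
}
```

Compared with calling nextSibling() on each label, this variant tolerates whitespace-only text nodes between a label and its value, at the cost of a little more bookkeeping.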