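Extract text in order using jsoup
I want to extract the text in the "jobtitle" class and the text in the "summary" class. Many elements share these class names, so I want the first job title followed by its summary, then the next job title followed by its summary, and so on, in that order.
The code below works, but it prints all the titles first and then all the text from every summary element. I want the first title with the first summary, then the second title with the second summary, and so on. How can I modify the code to do this? Please help.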
<div class=" row result" id="p_64c5268586001bd2" data-jk="64c5268586001bd2" itemscope="" itemtype="http://schema.org/JobPosting" data-tn-component="organicJob">
<h2 id="jl_64c5268586001bd2" class="jobtitle">
<a rel="nofollow" href="/rc/clk?jk=64c5268586001bd2" target="_blank" onmousedown="return rclk(this,jobmap[0],0);" onclick="return rclk(this,jobmap[0],true,0);" itemprop="title" title="Fashion Assistant" class="turnstileLink" data-tn-element="jobTitle"><b>Fashion</b> Assistant</a>
</h2>
<span class="company" itemprop="hiringOrganization" itemtype="http://schema.org/Organization">
<span itemprop="name">
<a href="/cmp/Itv?from=SERP&campaignid=serp-linkcompanyname&fromjk=64c5268586001bd2&jcid=3bf3e8a57da58ff5" target="_blank">
ITV Jobs</a></span>
</span>
<a data-tn-element="reviewStars" data-tn-variant="cmplinktst2" class="turnstileLink " href="/cmp/Itv/reviews?jcid=3bf3e8a57da58ff5" title="Itv Jobs reviews" onmousedown="this.href = appendParamsOnce(this.href, '?campaignid=cmplinktst2&from=SERP&jt=Fashion+Assistant&fromjk=64c5268586001bd2');" target="_blank">
<span class="ratings"><span class="rating" style="width:49.5px;"><!-- -> </span></span><span class="slNoUnderline">28 reviews</span></a>
<span itemprop="jobLocation" itemscope="" itemtype="http://schema.org/Place"> <span class="location" itemprop="address" itemscope="" itemtype="http://schema.org/Postaladdress"><span itemprop="addressLocality">London</span></span></span>
<table cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<td class="snip">
<div>
<span class="summary" itemprop="description">
Do you have a passion for <b>Fashion</b>? You will be responsible for running our <b>fashion</b> cupboard, managing a team of interns and liaising with press officers to...</span>
</div>
Document doc = Jsoup.connect("http://www.indeed.co.uk/jobs?q=fashion&l=England").timeout(5000).get();
Elements f = doc.select(".jobtitle");
Elements e = doc.select(".summary");
System.out.println("Title: " + f.text());
System.out.println("Details: "+ e.text());
Iterate over the titles, then look up the summary for each title:
for (Element title : doc.select(".jobtitle")) {
    // The summary sits in the same result <div> as the title, so search from the title's parent.
    Element summary = title.parent().select(".summary").first();
    System.out.format("Title: %s. Summary: %s%n", title.text(), summary.text());
}
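One caveat with the loop above: `first()` returns null when a posting has no ".summary" element, so `summary.text()` would throw. As a rough sketch of a variant that guards against this, the code below iterates over each result container instead of over the titles. The class name `JobPairs`, the method `extractPairs`, the "(no summary)" placeholder, and the sample markup are all made up for illustration. The `div.result` selector assumes every posting is wrapped like the one in the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class JobPairs {

    // Made-up sample markup that reuses the class names from the question.
    // The second posting deliberately has no summary.
    static final String SAMPLE_HTML =
        "<div class=\"row result\">"
        + "<h2 class=\"jobtitle\"><a><b>Fashion</b> Assistant</a></h2>"
        + "<span class=\"summary\">Run the <b>fashion</b> cupboard.</span>"
        + "</div>"
        + "<div class=\"row result\">"
        + "<h2 class=\"jobtitle\"><a><b>Fashion</b> Buyer</a></h2>"
        + "</div>";

    // Returns one formatted line per posting, in document order.
    static List<String> extractPairs(String html) {
        Document doc = Jsoup.parse(html);
        List<String> rows = new ArrayList<>();
        for (Element result : doc.select("div.result")) {
            // selectFirst returns null when nothing matches.
            Element title = result.selectFirst(".jobtitle");
            Element summary = result.selectFirst(".summary");
            if (title == null) {
                continue; // not a job posting, skip it
            }
            String summaryText = (summary == null) ? "(no summary)" : summary.text();
            rows.add(String.format("Title: %s. Summary: %s", title.text(), summaryText));
        }
        return rows;
    }

    public static void main(String[] args) {
        extractPairs(SAMPLE_HTML).forEach(System.out::println);
    }
}
```

To run it against the live page, pass the fetched document's HTML instead of `SAMPLE_HTML`, or change the method to accept a `Document` from `Jsoup.connect(...).get()`.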