
XPath for extracting HTML tags

I want to extract the cities and states from HTML of this form:

<table class="wikitable sortable">
<tr>
<th>Name of City/Town</th>
<th>Name of State</th>
<th>Classification</th>
<th>Population (2001)</th>
<th>Population (2011)</th>
</tr>
<tr>
<td><a href="/wiki/Abhayapuri" title="Abhayapuri">**Abhayapuri**</a></td>
<td><a href="/wiki/Assam" title="Assam">**Assam**</a></td>
<td>TC</td>
<td style="text-align:right;">14,673</td>
<td style="text-align:right;"></td>
</tr>

I tried this: $x('//table/tbody/tr/td/a')

but it returns an undesired result (i.e. a list of elements with childNodes, children, classList, innerHTML and other metadata). I don't know what I'm doing wrong.

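Your expression selects the a elements themselves, which is why you see all of their properties. Append text() to select their text nodes instead. This XPath: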

$x('//table/tbody/tr/td/a/text()')

will get you the city and state:

["**Abhayapuri**", "**Assam**"]

This XPath will get you the city:

$x('//table/tbody/tr/td[1]/a/text()')

["**Abhayapuri**"]

And this XPath will get you the state:

$x('//table/tbody/tr/td[2]/a/text()')

["**Assam**"]
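
If you need the same result outside the browser console, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset (positional predicates such as td[1], but not text()). Some assumptions: the fragment from the question is closed with </table> so that it parses as well-formed XML, and city_state_pairs is a hypothetical helper name. Note that the path has no tbody step, because tbody is inserted by the browser's HTML parser and is not present in the raw markup. Real pages are often not well-formed XML, so a proper HTML parser would likely be needed there.

```python
import xml.etree.ElementTree as ET

# Fragment from the question, with an assumed closing </table>
# so that it is well-formed XML.
HTML = """
<table class="wikitable sortable">
<tr>
<th>Name of City/Town</th>
<th>Name of State</th>
<th>Classification</th>
<th>Population (2001)</th>
<th>Population (2011)</th>
</tr>
<tr>
<td><a href="/wiki/Abhayapuri" title="Abhayapuri">**Abhayapuri**</a></td>
<td><a href="/wiki/Assam" title="Assam">**Assam**</a></td>
<td>TC</td>
<td style="text-align:right;">14,673</td>
<td style="text-align:right;"></td>
</tr>
</table>
"""


def city_state_pairs(markup):
    """Return (city, state) link texts for each data row of the table."""
    table = ET.fromstring(markup.strip())
    pairs = []
    for row in table.findall("./tr"):
        # td[1] and td[2] are the first and second cells of the row.
        city = row.find("./td[1]/a")
        state = row.find("./td[2]/a")
        # The header row has <th> cells only, so both lookups return None.
        if city is not None and state is not None:
            pairs.append((city.text, state.text))
    return pairs


print(city_state_pairs(HTML))
```

Reading .text from the matched a elements plays the role of the text() step in the console expressions above, and pairing the two cells per row keeps each city next to its state.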
