Google Sheets IMPORTXML fails
This one works:
=importxml("https://discgolfmetrix.com/?u=scorecard&ID=900113&view=result", "//table[@class='data data-hover']/tr/td[2]")
This one fails:
=importxml("https://discgolfmetrix.com/?u=scorecard&ID=1172639&view=result", "//table[@class='data data-hover']/tr/td[2]")
If it were the other way around I could understand it, since the first one has 2 tbody tags.
Google Sheets parses the page in its own way, so the parent/child structure is not exactly the same as the one you see in your browser. Use //tr in your XPath to circumvent the parsing differences:
=IMPORTXML("https://discgolfmetrix.com/?u=scorecard&ID=1172639&view=result","//table[@class='data data-hover']//tr/td[2]")
Or use IMPORTHTML and QUERY:
=QUERY(IMPORTHTML("https://discgolfmetrix.com/?u=scorecard&ID=1172639&view=result","table",1),"select Col2 OFFSET 1")
EDIT: More details:
For the first link, the parsed HTML structure is the following:
<table>
<tr>
<td></td>
<td>your_data</td>
...
</tr>
<tr>
<td></td>
<td>your_data</td>
...
</tr>
...
</table>
And your XPath works.
For the second link, there is an intermediate tbody element that contains the tr elements. The structure is:
<table>
<tbody>
<tr>
<td></td>
<td>your_data</td>
...
</tr>
<tr>
<td></td>
<td>your_data</td>
...
</tr>
...
</tbody>
</table>
And your XPath fails, because /tr only matches tr elements that are direct children of table. That's why you have to use // or declare the tbody element in your expression:
=IMPORTXML("https://discgolfmetrix.com/?u=scorecard&ID=1172639&view=result","//table[@class='data data-hover']/tbody/tr/td[2]")
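To see the difference between /tr and //tr outside of Sheets, here is a minimal Python sketch using the standard library's ElementTree. The two markup strings and the names in them (FLAT, NESTED, Alice, Bob) are made up for illustration and only mimic the two parsed structures shown above; this is not how Google Sheets parses pages internally, and ElementTree needs a leading "." on the path.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mimicking the first parsed structure (no tbody).
FLAT = """<html><body><table class="data data-hover">
<tr><td>1</td><td>Alice</td></tr>
<tr><td>2</td><td>Bob</td></tr>
</table></body></html>"""

# Hypothetical markup mimicking the second parsed structure (rows inside tbody).
NESTED = """<html><body><table class="data data-hover">
<tbody>
<tr><td>1</td><td>Alice</td></tr>
<tr><td>2</td><td>Bob</td></tr>
</tbody>
</table></body></html>"""

# "/tr" only matches rows that are direct children of the table.
CHILD = ".//table[@class='data data-hover']/tr/td[2]"
# "//tr" matches rows at any depth below the table.
DESCENDANT = ".//table[@class='data data-hover']//tr/td[2]"


def second_cells(markup, xpath):
    """Return the text of every cell selected by xpath."""
    root = ET.fromstring(markup)
    return [td.text for td in root.findall(xpath)]


for name, markup in (("flat", FLAT), ("nested", NESTED)):
    print(name, "child:     ", second_cells(markup, CHILD))
    print(name, "descendant:", second_cells(markup, DESCENDANT))
```

With the child path the nested table yields no cells, while the descendant path returns the second column for both structures, which is why //tr is the safer choice when you cannot predict how the page will be parsed.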