Extracting specific values from HTML using XPath and Scrapy

Question

I have the following HTML code:

 <tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1"> <td class="table-matches__tt"><span class="table-matches__time" data-live-cell="time">19:00</span><a href="/soccer/germany/oberliga-bremen/oberneuland-habenhauser/COumykPG/" data-live-cell="matchlink"><span>Oberneuland</span> - <span>Habenhauser</span></a></td> <td class="livebet" data-live-cell="livebet">&nbsp;</td> <td class="table-matches__streams" data-live-cell="score"> </td> <td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v"><a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv464x0x6ev9v&amp;otheroutcomes=2p2k5xv498x0x0,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">1.10</a></td> <td class="table-matches__odds" data-oid="2p2k5xv498x0x0"><a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv498x0x0&amp;otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">7.44</a></td> <td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0"><a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv464x0x6eva0&amp;otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv498x0x0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">12.40</a></td> </tr>

I am trying to scrape the three float values from this code: 1.10, 7.44 and 12.40. The expression I tried to use to get them was the following:

response.xpath('//a/@target').extract()

The output I get is 'mySelections'.

I want to get the value next to it. What is the right expression for that?

Thank you in advance.

Answer 1

What's wrong

response.xpath('//a/@target').extract()

Why?

If you format your HTML, the error is obvious.

You want to extract the text from the a tag, not its target attribute.

    <tr data-live="COumykPG" data-dt="10,11,2017,19,00" data-def="1">
      <td class="table-matches__tt">
        <span class="table-matches__time" data-live-cell="time">19:00</span>
        <a href="/soccer/germany/oberliga-bremen/oberneuland-habenhauser/COumykPG/" data-live-cell="matchlink">
          <span>Oberneuland</span> - <span>Habenhauser</span>
        </a>
      </td>
      <td class="livebet" data-live-cell="livebet">&nbsp;</td>
      <td class="table-matches__streams" data-live-cell="score"></td>
      <td class="table-matches__odds" data-oid="2p2k5xv464x0x6ev9v">
        <a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv464x0x6ev9v&amp;otheroutcomes=2p2k5xv498x0x0,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">1.10</a>
      </td>
      <td class="table-matches__odds" data-oid="2p2k5xv498x0x0">
        <a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv498x0x0&amp;otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv464x0x6eva0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">7.44</a>
      </td>
      <td class="table-matches__odds" data-oid="2p2k5xv464x0x6eva0">
        <a href="/myselections.php?action=3&amp;matchid=COumykPG&amp;outcomeid=2p2k5xv464x0x6eva0&amp;otheroutcomes=2p2k5xv464x0x6ev9v,2p2k5xv498x0x0" onclick="return my_selections_click('1x2', 'soccer');" title="Add to My Selections" target="mySelections">12.40</a>
      </td>
    </tr>

How to fix it

Use one of the following:
- response.xpath('//a/text()').extract()
- Some developers report that response.xpath occasionally misbehaves. If you run into that, build a Scrapy Selector from the response text instead:
```
from scrapy.selector import Selector

# response.text is the decoded body; Selector(text=...) expects str, not bytes
result_array = Selector(text=response.text).xpath('//a/text()').extract()
```
