XPath: get all the text, not just the first line

Question

I have this HTML:

    <td colspan="2" align="justify" class="inPage">
                <p>
                    2 bedroom + maids +balcony in Tiara Residence - Diamond type
                    <br>1700 sq.ft, furnished with kitchen equipment
                    <br>Sea view/ Atlantis view
                    <br>Selling Price: 4 million
                </p>
    </td>

My XPath is:

    normalize-space(.//div[@class='section']/table/tr[7]/td/p/text())

The result is just "2 bedroom + maids +balcony in Tiara Residence - Diamond type".

I need the other text inside the p tag as well.

I am using Scrapy 0.20 with Python 2.7.

Answer 1

You can simply use

    normalize-space(.//div[@class='section']/table/tr[7]/td/p)

but this concatenates all the text nodes into one string, without any newline characters.
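If the line breaks matter, one option is to read each text node on its own and normalize it separately. The following is only a minimal sketch using Python's standard library rather than Scrapy: the snippet is the p element from the question with each <br> written as <br/> so the XML parser accepts it, and text_lines is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# The <p> from the question, with <br> closed as <br/> so that the
# standard-library XML parser accepts it (an assumption of this sketch).
SNIPPET = """
<td colspan="2" align="justify" class="inPage">
    <p>
        2 bedroom + maids +balcony in Tiara Residence - Diamond type
        <br/>1700 sq.ft, furnished with kitchen equipment
        <br/>Sea view/ Atlantis view
        <br/>Selling Price: 4 million
    </p>
</td>
""".strip()


def text_lines(element):
    """Hypothetical helper: return every non-empty text node under
    `element`, each one stripped and with inner whitespace collapsed."""
    lines = []
    for chunk in element.itertext():
        cleaned = " ".join(chunk.split())  # roughly what normalize-space() does
        if cleaned:
            lines.append(cleaned)
    return lines


td = ET.fromstring(SNIPPET)
lines = text_lines(td.find("p"))

# One entry per original line, so the newlines can be restored on output.
print("\n".join(lines))
```

In a Scrapy spider the same idea should apply to the list of strings obtained by selecting the p/text() nodes and extracting them, cleaning each item before joining.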

normalize-space(), like other XPath string functions that expect a string argument, converts the input node p to its string-value. In the original expression, p/text() selects a node-set of several text nodes, and converting a node-set to a string uses only the node that is first in document order, which is why only the first line came back. Quoting the XPath 1.0 specification:

For every type of node, there is a way of determining a string-value for a node of that type. For some types of node, the string-value is part of the node; for other types of node, the string-value is computed from the string-value of descendant nodes.
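To make the difference concrete, here is a rough standard-library emulation of the two expressions. It is a sketch under assumptions, not what Scrapy runs internally: the markup is the p element from the question with <br> closed as <br/>, and normalize_space is a hypothetical Python stand-in for the XPath function.

```python
import xml.etree.ElementTree as ET

# The <p> from the question, made well-formed (<br/>) for the XML parser.
SNIPPET = """<p>
    2 bedroom + maids +balcony in Tiara Residence - Diamond type
    <br/>1700 sq.ft, furnished with kitchen equipment
    <br/>Sea view/ Atlantis view
    <br/>Selling Price: 4 million
</p>"""


def normalize_space(value):
    """Hypothetical stand-in for XPath normalize-space(): strip the ends
    and collapse each run of whitespace into a single space."""
    return " ".join(value.split())


p = ET.fromstring(SNIPPET)

# String-value of an element node: all descendant text nodes concatenated
# in document order. This mirrors normalize-space(.../p).
whole = normalize_space("".join(p.itertext()))

# p/text() is a node-set of four text nodes; converting it to a string
# keeps only the first one, which ElementTree exposes as p.text.
# This mirrors normalize-space(.../p/text()).
first_only = normalize_space(p.text)

print(first_only)
print(whole)
```

The first printed line matches the result reported in the question; the second contains all four lines joined by single spaces.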

Answer 1 is dated 2014-03-18 17:56:42.
