Excluding HTML tags with Nokogiri

Question

I am trying to get all the text in a TD tag except what is inside <strong> tags (there might be any number of them).

In this example I want to get " graavis ● diakriitik (`) ↝ " and "acute accent":

<tr class="level2">
    <td> 
        <strong> grave accent </strong> 
         <strong> (=backquote character) </strong>
         graavis ● diakriitik (`) ↝ 
         <a href="?word=sv82">acute accent</a>
    </td>
</tr>

I'm trying to use the code below, but it doesn't work:

desc = page.css('tr td:not(strong)').text

Answer 1

Consider:

page.search("strong").remove
page.css(".level2 > td").text.strip

Answer 1 was accepted on 2015-03-30 09:51:52.

用Nokogiri排除HTML标签

问题描述

1 个解决方案

解决方案1 1 已采纳 2015-03-30 09:51:52

解决方案1
1 已采纳 2015-03-30 09:51:52