Get the line that contains a string

Question

I'm trying to get a line from a text file that contains a certain sequence of characters:

My input:

    <tr><td>lucas.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> </tr>
<tr><td>jeanpierre.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span title="Cannot connect to 193.191.187.25:22345." style="color:red;font-weight:bold">X</span></td> <td><span title="No response from DNS at 193.191.187.25." style="color:red;font-weight:bold">X</span></td> </tr>
<tr><td>sofie.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span title="Cannot connect to 193.191.187.26:22345." style="color:red;font-weight:bold">X</span></td> <td><span title="No response from DNS at 193.191.187.26." style="color:red;font-weight:bold">X</span></td> </tr>
<tr><td>thomas.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> </tr>

Now I need to get the line that contains lucas. I tried this with BeautifulSoup, but it is meant to get the content of HTML tags, not a whole line, so I tried a plain in operator:

def soupParserToTable(self,input):
    global header

    soup = self.BeautifulSoup(input)
    header = soup.first('tr')
    tableInput='0'

    for line in input:
        if 'lucas' in line:
            tableInput = line
    print tableInput

However, it keeps returning 0 instead of

<tr><td>lucas.vlan77.be</td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> <td><span style="color:green;font-weight:bold">V</span></td> </tr>

Answer 1

If input is just a string, then for line in input doesn't iterate over lines; it iterates over characters. So 'lucas' is never found in a one-character string, and tableInput is never reassigned from its initial '0'. Line-based iteration only happens when the object is a file.

If you wanted to loop through each line of a string, you'd have to do:

for line in input.split('\n'):
    ...
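To make the difference concrete, here is a minimal sketch in Python 3 (the question's code is Python 2). The helper name find_line and the shortened sample rows are made up for illustration; they are not from the question.

```python
def find_line(text, needle):
    """Return the first line of `text` that contains `needle`, or None."""
    # splitlines() copes with both '\n' and '\r\n' line endings;
    # split('\n') works as well for plain '\n'-separated text.
    for line in text.splitlines():
        if needle in line:
            return line
    return None


# Shortened, made-up sample resembling the table rows in the question.
sample = (
    "<tr><td>lucas.vlan77.be</td> <td>V</td> </tr>\n"
    "<tr><td>sofie.vlan77.be</td> <td>X</td> </tr>\n"
)

# Iterating the string directly yields single characters,
# so a five-character needle can never match.
chars = [c for c in sample if 'lucas' in c]

# Iterating over its lines finds the row.
row = find_line(sample, 'lucas')
```

Returning None when nothing matches, rather than a placeholder such as '0', makes the "not found" case easier to tell apart from real data.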

Since you have BeautifulSoup available, I'd say it would be much better to use it to read the value from the first cell in each row, rather than rely on crude and fragile string searching.
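A rough sketch of that approach, assuming Python 3 and the bs4 package (the question appears to use an older BeautifulSoup release, where the method names differ). The function name find_row_by_host and the shortened sample table are hypothetical.

```python
from bs4 import BeautifulSoup  # assumption: the bs4 package is installed


def find_row_by_host(html, name):
    """Return the first <tr> whose first <td> text contains `name`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tr"):
        first_cell = row.find("td")  # first <td> inside this row
        if first_cell is not None and name in first_cell.get_text():
            return row
    return None


# Shortened, made-up sample resembling the rows in the question.
html = (
    "<table>"
    "<tr><td>lucas.vlan77.be</td> <td><span>V</span></td></tr>"
    "<tr><td>sofie.vlan77.be</td> <td><span>X</span></td></tr>"
    "</table>"
)

row = find_row_by_host(html, "lucas")
host = row.find("td").get_text() if row is not None else None
```

Because only the first cell is compared, a match inside an attribute such as a title elsewhere in the row would not select the wrong row.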

ETA:

How would I get the table entry for the row that contains the string 'lucas'? Any hints?

Use td.parent to get the containing row, td.parent.parent to get the containing table/tbody, and so on.

If you wanted to get the V or X in the next column, you could say something like:

import re
# The matched text node's parent is the <td>; that cell's parent is the <tr>.
tr = soup.find(text=re.compile('lucas')).parent.parent
vorx = tr.findAll('td')[1].find('span').string
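The same navigation as a self-contained sketch, assuming Python 3 and a bs4 release recent enough to accept the string= argument (older releases use text=). The sample table is shortened and made up, and the name statuses is only illustrative.

```python
import re

from bs4 import BeautifulSoup  # assumption: the bs4 package is installed

# Shortened, made-up sample resembling the rows in the question.
html = (
    "<table>"
    "<tr><td>lucas.vlan77.be</td> <td><span>V</span></td> <td><span>V</span></td></tr>"
    "<tr><td>sofie.vlan77.be</td> <td><span>V</span></td> <td><span>X</span></td></tr>"
    "</table>"
)

soup = BeautifulSoup(html, "html.parser")

# find(string=...) returns the matching text node, or None if nothing matches.
text_node = soup.find(string=re.compile("sofie"))
if text_node is None:
    statuses = []
else:
    td = text_node.parent  # the <td> holding the host name
    tr = td.parent         # the <tr> holding that cell
    # Skip the host-name cell and collect the V/X marker from every other cell.
    statuses = [cell.find("span").string for cell in tr.find_all("td")[1:]]
```

Checking for None before walking up to the parents avoids an AttributeError when the host name is absent from the page.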
