How to limit the characters in one specific table row in a for loop (Python / BeautifulSoup)

Question

In the table I'm scraping, the 2nd row is very long, and I'd like to limit the characters in it, since I only want the information at the beginning of the string. I want to scrape the other rows as they are. My code is as follows:

table = soup.find(id="table3")
table_rows = table.findAll('tr')

for tr in table_rows:
    td = tr.findAll('td')
    row = [i.text.strip() for i in td]
    print(row)

How can I target only the second row?

The output looks like this:

["Computer price for Apple Inc. ,\n\n\nType\nForward\n\n\n\n\n\n\nBack\n\n\n\n\nDie\n\r\n...

So I only want to grab the "Computer price for Apple Inc." part. Maybe there is a better way than using a character limit as a heuristic. Is it possible to grab everything before ,\n\n\n?

Answer 1

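You can use the string split method to cut the text at a separator. I have used ",\n\n\n" as the separator, with maxsplit=1 so that only the first occurrence is split on. Note that the separator itself (including the comma) is not part of the result: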

>>> row = 'Computer price for Apple Inc. ,\n\n\nType\nForward\n\n\n\n\n\n\nBack\n\n\n\n\nDie\n\r\n'
>>> row.split(sep=",\n\n\n", maxsplit=1)[0]
'Computer price for Apple Inc. '
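
To apply this only to the second row inside the loop, one option is to track the row index with enumerate. The following is a minimal sketch, not a tested scraper: the rows list is a stand-in for the per-row lists that [i.text.strip() for i in td] produces, head_before is a hypothetical helper name, and it assumes the long row sits at index 1 and that ",\n\n\n" always marks the end of the part you want. If the table starts with a header row, the index may need adjusting.

```python
SEPARATOR = ",\n\n\n"  # assumption: taken from the sample output in the question


def head_before(text, sep=SEPARATOR):
    """Return the text before the first sep, stripped.

    If sep does not occur, the whole (stripped) text is returned.
    """
    return text.split(sep, 1)[0].strip()


# Stand-in data: each inner list plays the role of
# [i.text.strip() for i in td] for one <tr>.
rows = [
    ["Header A", "Header B"],
    ["Computer price for Apple Inc. ,\n\n\nType\nForward\n\n\nBack\n\n\nDie"],
    ["Other row", "kept as is"],
]

cleaned = []
for index, row in enumerate(rows):
    if index == 1:  # second row; indices start at 0
        row = [head_before(cell) for cell in row]
    cleaned.append(row)
    print(row)
```

In the original loop, the same idea would be to write for index, tr in enumerate(table_rows) and apply the helper to the cells only when index == 1, leaving all other rows untouched.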
