Collecting table data with Python and BeautifulSoup

Question

I can't figure out how to scrape only the first table cell (<td>) of each row instead of both cells.

<tr>
<td>WheelDust
</td>
<td>A large puff of barely visible brown dust
</td></tr>

I only want "WheelDust", but instead I get both "WheelDust" and "A large puff of barely visible brown dust".

import requests
from bs4 import BeautifulSoup


r = requests.get("https://wiki.garrysmod.com/page/Effects")

soup = BeautifulSoup(r.content, "html.parser")

for td in soup.findAll("table"):
    #--print(td)
    for a in td.findAll("tr"):
        print(a.text)

Answer 1

I'm still not sure what you're asking, but I believe you want to go through each <tr> and access only its first <td>, correct? If that's the case, wouldn't this work? I'd try it myself, but the site says I don't have access.

import requests
from bs4 import BeautifulSoup


r = requests.get("https://wiki.garrysmod.com/page/Effects")

soup = BeautifulSoup(r.content, "html.parser")

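for table in soup.find_all("table"):
    for row in table.find_all("tr"):
        # find() returns only the first <td> of the row, or None for header rows
        first_cell = row.find("td")
        if first_cell is not None:
            print(first_cell.get_text(strip=True))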
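
Since the wiki page may not be reachable, here is a minimal offline sketch of the same idea. It assumes an inline HTML string modelled on the row from the question; the header row, the second data row, and the helper name first_cells are made up for illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Inline sample modelled on the row in the question. The header row and
# the second data row are hypothetical, added only for illustration.
SAMPLE_HTML = """
<table>
<tr><th>Name</th><th>Description</th></tr>
<tr>
<td>WheelDust
</td>
<td>A large puff of barely visible brown dust
</td></tr>
<tr><td>ExampleEffect</td><td>Hypothetical second row</td></tr>
</table>
"""


def first_cells(html):
    """Return the stripped text of the first <td> of every row that has one."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for row in soup.find_all("tr"):
        cell = row.find("td")  # None for rows that only contain <th> cells
        if cell is not None:
            names.append(cell.get_text(strip=True))
    return names


print(first_cells(SAMPLE_HTML))  # ['WheelDust', 'ExampleEffect']
```

Using row.find("td") rather than row.text is what limits the output to the first cell; get_text(strip=True) also drops the trailing newline that sits inside each <td>.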

Answer 2

Try this as well. It collects every cell of the chosen table into a list of rows.

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get("https://wiki.garrysmod.com/page/Effects").text, "html.parser")

table = soup.findAll('table', attrs={'class':'wikitable'})[0] # Changing the index number will give you whichever table you like
list_of_rows = [[t_data.text for t_data in item.findAll('td')]
                for item in table.findAll('tr')]

for data in list_of_rows:
    print(data)
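
To get just the first column out of that list of rows, something along these lines might work. This is a sketch run against a made-up inline HTML string instead of the live page; the helper name table_rows and the sample markup (including the second table) are assumptions for illustration, and the "wikitable" class is taken from the snippet above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two tables, standing in for the real wiki page.
SAMPLE_HTML = """
<table class="wikitable">
<tr><th>Name</th><th>Description</th></tr>
<tr><td>WheelDust
</td><td>A large puff of barely visible brown dust
</td></tr>
</table>
<table class="wikitable">
<tr><td>OtherTable</td><td>Hypothetical row in a second table</td></tr>
</table>
"""


def table_rows(html, index=0):
    """Return the <td> texts of the index-th 'wikitable', one list per row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("table", class_="wikitable")[index]
    rows = [[td.get_text(strip=True) for td in tr.find_all("td")]
            for tr in table.find_all("tr")]
    # Header rows contain only <th> cells and come out empty, so drop them.
    return [row for row in rows if row]


rows = table_rows(SAMPLE_HTML)
first_column = [row[0] for row in rows]
print(rows)          # [['WheelDust', 'A large puff of barely visible brown dust']]
print(first_column)  # ['WheelDust']
```

Filtering out empty rows before indexing avoids an IndexError on header rows, which have no <td> cells.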

Answer 1
1, accepted, 2017-08-17 09:50:34

Answer 2
1, 2017-08-18 19:21:36
