Scraping tables from a web page into Python

Question

I am learning Spanish, and to help me learn the different verbs and their conjugations I am making some flash cards to use on my phone.

I am trying to scrape the data from a web page; here is an example page for one verb. There are several tables on the page, and I am interested in the first five (Present, Future, Imperfect, Preterite and Conditional) near the top.

I have heard that BeautifulSoup is good for this type of project. However, when I use the prettify method, I can't find the tables anywhere in the text. I think I'm missing something. How can I get these tables into Python?

 import requests
 from bs4 import BeautifulSoup
 import re

 URL = 'https://www.linguasorb.com/spanish/verbs/conjugation/tener.html'
 page = requests.get(URL)
 soup = BeautifulSoup(page.content, 'html.parser')
 txt = soup.prettify()

Answer 1

You're loading the wrong URL. Remove the ".html" from the URL variable and you will be able to find the tables (they're actually lists) in the output: soup.find_all('div', class_='vPos')
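
To make that concrete, here is a rough sketch of how the lookup could be wrapped up. The helper names (extract_conjugations, fetch_conjugations) and the sample markup are made up for illustration, and the assumption that each conjugation sits in its own <li> inside the 'vPos' div is only a guess based on "they're actually lists"; check the real page source and adjust the selectors if it differs.

```python
import requests
from bs4 import BeautifulSoup


def extract_conjugations(html):
    """Return one list of strings per <div class="vPos"> block in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    blocks = []
    for div in soup.find_all("div", class_="vPos"):
        # Assumption: every conjugation is an <li> somewhere inside the div.
        items = [li.get_text(" ", strip=True) for li in div.find_all("li")]
        blocks.append(items)
    return blocks


def fetch_conjugations(url):
    """Download a page and parse its vPos blocks (hypothetical helper)."""
    page = requests.get(url, timeout=10)
    page.raise_for_status()  # fail loudly on a 4xx/5xx response
    return extract_conjugations(page.text)


# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<div class="vPos">
  <ul>
    <li>yo tengo</li>
    <li>tú tienes</li>
  </ul>
</div>
<div class="other"><ul><li>ignored</li></ul></div>
"""

sample_blocks = extract_conjugations(SAMPLE_HTML)
```

With the corrected URL from above ('https://www.linguasorb.com/spanish/verbs/conjugation/tener'), something like fetch_conjugations(URL)[:5] should then give the first five blocks, assuming they appear on the page in the order you listed.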

Answer 1
1 2021-01-18 19:21:38
