
Read table from Web using Python

I'm new to Python and am trying to extract data from one particular table on the website https://www.screener.in/company/ABB/consolidated/ (the last table, Shareholding Pattern).

I'm using the BeautifulSoup library for this, but I don't know how to go about it.

Here is my code snippet so far. I'm failing to pick the right table because the page has multiple tables, and they all share common classes and IDs, which makes it difficult to filter for the one table I want.

import requests
import urllib.request
from bs4 import BeautifulSoup
    
url = "https://www.screener.in/company/ABB/consolidated/"

r = requests.get(url)
print(r.status_code)
html_content = r.text
soup = BeautifulSoup(html_content,"html.parser")
# print(soup)
#data_table = soup.find('table', class_ = "data-table")
# print(data_table)
table_needed = soup.find("<h2>ShareholdingPattern</h2>")
#sub = table_needed.contents[0]
print(table_needed)

Just use requests and pandas. Grab the last table and dump it to a .csv file.

Here's how:

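from io import StringIO

import pandas as pd
import requests

# Download the page; read_html returns a list with one DataFrame per <table>
html = requests.get("https://www.screener.in/company/ABB/consolidated/").text
# Wrap the HTML in StringIO: recent pandas versions deprecate passing literal HTML strings
tables = pd.read_html(StringIO(html), flavor="bs4")
# The last table on the page is the Shareholding Pattern table
tables[-1].to_csv("last_table.csv", index=False)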

Output from the .csv file:

[image: enter image description here]
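If relying on the table's position feels fragile, another option is to locate the table by the heading that precedes it. The sketch below is only an illustration: `table_rows_after_heading` is a hypothetical helper, the sample HTML is made up, and it assumes the live page puts the "Shareholding Pattern" title in a heading tag before its table, which I have not verified.

```python
from bs4 import BeautifulSoup


def table_rows_after_heading(html, heading_text):
    """Return the rows of the first <table> that follows a matching heading.

    Hypothetical helper: assumes the table's title sits in an <h1>-<h4> tag
    that appears before the table in the document.
    """
    soup = BeautifulSoup(html, "html.parser")
    wanted = heading_text.lower()
    for heading in soup.find_all(["h1", "h2", "h3", "h4"]):
        # Case-insensitive substring match on the heading's visible text
        if wanted in heading.get_text(" ", strip=True).lower():
            table = heading.find_next("table")
            if table is None:
                break
            # One list of cell texts per <tr>, header cells included
            return [
                [cell.get_text(" ", strip=True) for cell in row.find_all(["th", "td"])]
                for row in table.find_all("tr")
            ]
    raise ValueError(f"no table found after a heading containing {heading_text!r}")


# Made-up page: two tables that share the same class, as in the question
sample_html = """
<section><h2>Quarterly Results</h2>
<table class="data-table"><tr><th>Item</th><th>Q1</th></tr>
<tr><td>Sales</td><td>10</td></tr></table>
</section>
<section><h2>Shareholding Pattern</h2>
<table class="data-table"><tr><th>Holder</th><th>Share</th></tr>
<tr><td>Promoters</td><td>75%</td></tr></table>
</section>
"""

rows = table_rows_after_heading(sample_html, "Shareholding Pattern")
print(rows)
```

For the real page you would pass `requests.get(url).text` instead of `sample_html`. The first row holds the header cells, so `pandas.DataFrame(rows[1:], columns=rows[0])` turns the result into a DataFrame if you still want to write a .csv file.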

This post is licensed under CC BY-SA 4.0. If you republish it, please cite the original address.