
How do I export an HTML table to a CSV file using Python?

I scraped an HTML table from the Yahoo Finance website and tried to export it to a CSV file. However, the CSV file does not contain the correct output, even though the output printed to my terminal looks fine. What have I done wrong here?

import requests
from bs4 import BeautifulSoup
import csv
import pandas as pd

mystocks = ["XOM", "CVX", "COP", "EOG"]
stockdata = []

def getData(symbol): 
    headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"}
    url = f"https://finance.yahoo.com/quote/{symbol}/key-statistics"
    soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
    print("Ticker - "+symbol)
    for t in soup.select("table"):
        for tr in t.select("tr:has(td)"):
            for sup in tr.select("sup"):
                sup.extract()
            stockdata = [td.get_text(strip=True) for td in tr.select("td")]
            if len(stockdata) == 2:
                print("{:<50} {}".format(*stockdata))

for item in mystocks:
    stockdata.append(getData(item))

    df = pd.DataFrame(stockdata)
    df.to_csv('file_name.csv')

Your function prints the rows instead of returning them, so getData returns None and the DataFrame is built from a list of None values. If you want all the data in one table, it also helps to add a column with the symbol each row came from. You could use something like this:

import requests
from bs4 import BeautifulSoup
import csv
import pandas as pd

mystocks = ["XOM", "CVX", "COP", "EOG"]
stockdata = []

def getData(symbol): 
    headers = {"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0"}
    url = f"https://finance.yahoo.com/quote/{symbol}/key-statistics"
    soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
    print("Ticker - "+symbol)
    for t in soup.select("table"):
        for tr in t.select("tr:has(td)"):
            for sup in tr.select("sup"):
                sup.extract()
            stockdata = [td.get_text(strip=True) for td in tr.select("td")]
            if len(stockdata) == 2:
                # add a column with the symbol to help afterwards
                # (use the function's own parameter; `item` is not defined in this scope)
                yield [symbol] + stockdata

# this will concatenate the rows for all the symbols in mystocks
df = pd.DataFrame([r for item in mystocks for r in getData(item)])
df.to_csv('file_name.csv')
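
To see the difference between printing and yielding without hitting the network, here is a minimal offline sketch. It uses only the standard library; FAKE_PAGES, print_rows and iter_rows are hypothetical names, and the values are made-up placeholders standing in for the scraped table cells, not real Yahoo Finance data.

```python
import csv
import io

# Hypothetical stand-in for the scraped pages, so the sketch runs offline.
# The values are placeholders, not real statistics.
FAKE_PAGES = {
    "XOM": [("Market Cap", "value A"), ("Beta", "value B")],
    "CVX": [("Market Cap", "value C"), ("Beta", "value D")],
}


def print_rows(symbol):
    """Mimics the question's getData: prints each row and returns nothing."""
    for label, value in FAKE_PAGES[symbol]:
        print("{:<50} {}".format(label, value))


def iter_rows(symbol):
    """Mimics the answer's getData: yields each row, tagged with its symbol."""
    for label, value in FAKE_PAGES[symbol]:
        yield [symbol, label, value]


# A function without a return statement returns None,
# so collecting its results gives a list that holds only None.
printed = [print_rows(s) for s in FAKE_PAGES]

# Flattening the generators gives one row per statistic per symbol.
rows = [row for s in FAKE_PAGES for row in iter_rows(s)]

# Write a header line and the rows; StringIO stands in for a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["symbol", "statistic", "value"])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Here printed ends up as a list of None values, which is what the question's code was passing to pd.DataFrame, while rows holds the actual table. With pandas, the same rows could be passed to pd.DataFrame with a columns argument to get named columns in the CSV.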
