Here,
https://www.nhc.noaa.gov/gis/
There is a table under the "Data & Products" section. I want to extract the table and save it to a CSV file. I wrote this basic code:
from bs4 import BeautifulSoup
import requests
page = requests.get("https://www.nhc.noaa.gov/gis/")
soup = BeautifulSoup(page.content, 'html.parser')
print(soup)
I only know the basics of scraping . Please, guide me from here. Thanks!
You can use pandas
import pandas as pd
url = 'https://www.nhc.noaa.gov/gis/'
df = pd.read_html(url)[0]
# create csv file
df.to_csv("mycsv.csv")
It is hard to know but i guess that this is what you want:
from bs4 import BeautifulSoup
import requests
r = requests.get('https://www.nhc.noaa.gov/gis/')
soup = BeautifulSoup(r.content, 'html.parser')
for a in soup.find_all('a'):
if a.get('href'):
if '.' in a.get('href').split('/')[-1]\
and 'html' not in a.get('href')\
and '.php' not in a.get('href')\
and 'http' not in a.get('href')\
and 'mailto' not in a.get('href'):
print('https://www.nhc.noaa.gov' + a.get('href'))
prints:
https://www.nhc.noaa.gov/gis/examples/al112017_5day_020.zip
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_CONE.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_TRACK.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_020adv_WW.kmz
https://www.nhc.noaa.govforecast/archive/al092020_5day_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_CONE_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_TRACK_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_WW_latest.kmz
https://www.nhc.noaa.govforecast/archive/al102020_5day_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_CONE_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_TRACK_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL102020_WW_latest.kmz
https://www.nhc.noaa.gov/gis/examples/al112017_fcst_020.zip
https://www.nhc.noaa.gov/gis/examples/AL112017_initialradii_020adv.kmz
https://www.nhc.noaa.gov/gis/examples/AL112017_forecastradii_020adv.kmz
https://www.nhc.noaa.govforecast/archive/al092020_fcst_latest.zip
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_initialradii_latest.kmz
https://www.nhc.noaa.gov/storm_graphics/api/AL092020_forecastradii_latest.kmz
https://www.nhc.noaa.govforecast/archive/al102020_fcst_latest.zip
.. and so on...
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.