
Python Web Scraping - Save data in CSV

I am trying to save data scraped from URLs such as "https://www.holidify.com/places/shimla/mall-road-shimla-sightseeing-3502.html". When I save the data to a CSV file, only the data from the last URL in the range ends up in the file. I need the data from all of the URLs to be saved in the CSV file.

import requests
import pandas as pd
from bs4 import BeautifulSoup

pages = []
for i in range(1, 10, 1):
    url = "https://www.holidify.com/places/shimla/mall-road-shimla-sightseeing-350" + str(i) + '.html'
    pages.append(url)
    for item in pages:
        page = requests.get(item)
        soup = BeautifulSoup(page.text, 'html.parser')
        Place = list(soup.find(class_="col-md-10 col-xs-10 nopadding"))[1].get_text()
        City = list(soup.find_all(class_="smallerText"))[1].get_text()
        State = list(soup.find_all(class_="smallerText"))[2].get_text()
        Country = list(soup.find_all(class_="smallerText"))[3].get_text()
        About = list(soup.find_all(class_="biggerTextOverview"))[0].get_text()
        more_About = list(soup.find_all(class_="objHeading smallerText"))[0].get_text()
        Weather = soup.find(class_="currentWeather").get_text()
        demo = pd.DataFrame({ "Place": Place, "City": City, "State": State, "Country": Country, "About": About,"More About Places": more_About}, index=[0])
        demo.to_csv('demo.csv', index=False, encoding='utf-8')

You need to append the data to the file:

demo.to_csv('demo.csv', index=False, encoding='utf-8', mode = 'a')
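
One caveat with append mode: by default to_csv writes the column header on every call, so the header row would be repeated once per page. A minimal sketch of one way around that, assuming each scraped page has already been reduced to a dict of fields (the helper name append_row is hypothetical, not part of the original answer):

```python
import os

import pandas as pd


def append_row(csv_path, row):
    """Append one record (a dict of column -> value) to csv_path.

    Hypothetical helper: the header is written only when the file is
    missing or empty, so repeated calls do not duplicate it.
    """
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    pd.DataFrame([row]).to_csv(
        csv_path,
        mode="a",        # append instead of overwriting
        header=is_new,   # header only for the first write
        index=False,
        encoding="utf-8",
    )
```

Inside the scraping loop this would replace the to_csv call, for example append_row('demo.csv', {"Place": Place, "City": City}). Note that rows keep accumulating across runs unless demo.csv is deleted first.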

As @Umair suggested, append the data to a DataFrame and move the call demo.to_csv('demo.csv', index=False, encoding='utf-8') outside the loop.
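
A rough sketch of that approach, under a few assumptions: the names build_urls, scrape_to_csv and extract are hypothetical, and extract stands in for the requests and BeautifulSoup parsing from the question, returning one dict of fields per URL. It also uses a single loop over the URLs, since the nested "for item in pages" loop in the question re-fetches earlier pages on every iteration.

```python
import pandas as pd


def build_urls(start=1, stop=10):
    """Build the page URLs using the same pattern as the question."""
    base = "https://www.holidify.com/places/shimla/mall-road-shimla-sightseeing-350"
    return [f"{base}{i}.html" for i in range(start, stop)]


def scrape_to_csv(urls, extract, csv_path="demo.csv"):
    """Collect one record per URL, then write the CSV a single time.

    `extract` is a hypothetical callable: it takes a URL and returns a
    dict such as {"Place": ..., "City": ...} for that page.
    """
    rows = []
    for url in urls:
        rows.append(extract(url))  # accumulate in memory, no file I/O here

    demo = pd.DataFrame(rows)
    # Written once, outside the loop, so no page overwrites another.
    demo.to_csv(csv_path, index=False, encoding="utf-8")
    return demo
```

Because the file is written once, the header appears once and rerunning the script replaces the old file rather than growing it.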

