
Get daily news feed from Wikinews

I want to get the news titles from the Wikinews feed every 4 hours and store them in a database.

Code I tried

import requests
from bs4 import BeautifulSoup

url = "https://en.wikinews.org/wiki/Main_Page"
reqs = requests.get(url)
print(reqs.status_code)

soup = BeautifulSoup(reqs.text, 'html.parser')
# Headlines are assumed to be links inside this element; it contains no <title> tags.
latest = soup.find(id='MainPage_latest_news_text')
if latest is not None:
    for link in latest.find_all('a'):
        print(link.get_text())

I can already fetch the news feed from Google News and store it, and I want to do the same for Wikinews.

from GoogleNews import GoogleNews
import pandas as pd

# Start_date and End_date are defined elsewhere
googlenews = GoogleNews(start=Start_date, end=End_date)
googlenews.set_lang('en')
googlenews.set_encode('utf-8')
googlenews.get_news('Business')
googlenews.total_count()
result = googlenews.result()
df = pd.DataFrame(result)

You can do it with Selenium.

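from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Path to your ChromeDriver executable (Selenium 4 API; older versions
# used executable_path= and find_element_by_xpath instead)
driver = webdriver.Chrome(service=Service("drivers/chromedriver.exe"))
driver.get("https://en.wikinews.org/wiki/Main_Page")

results = []
for i in range(1, 4):
    # Absolute XPath to the i-th headline link; it breaks if the page layout changes
    heading = driver.find_element(
        By.XPATH,
        "/html/body/div[3]/div[3]/div[5]/div[1]/table/tbody/tr[3]/td[" + str(i) + "]/table/tbody/tr/td/span/a",
    ).text
    results.append(heading)

driver.quit()
print(results)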
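
The question also asks about storing the titles and repeating the fetch every 4 hours. Below is a minimal sketch of that part using only the standard library, assuming SQLite is an acceptable database. The table name news_titles, the file name wikinews.db, and the helper names init_db, store_titles and run_every_four_hours are all hypothetical; fetch_titles stands for any function that returns a list of headline strings, such as the scraping code above wrapped in a function.

```python
import sqlite3
import time
from datetime import datetime, timezone

FOUR_HOURS = 4 * 60 * 60  # polling interval in seconds


def init_db(conn):
    """Create the (hypothetical) news_titles table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news_titles ("
        "title TEXT PRIMARY KEY, "
        "fetched_at TEXT NOT NULL)"
    )
    conn.commit()


def store_titles(conn, titles):
    """Insert titles, skipping ones already stored; return the number added."""
    now = datetime.now(timezone.utc).isoformat()
    added = 0
    for title in titles:
        cur = conn.execute(
            "INSERT OR IGNORE INTO news_titles (title, fetched_at) VALUES (?, ?)",
            (title, now),
        )
        added += cur.rowcount  # 1 if the row was inserted, 0 if it was ignored
    conn.commit()
    return added


def run_every_four_hours(fetch_titles, db_path="wikinews.db"):
    """Call fetch_titles() every four hours and store the result. Never returns."""
    conn = sqlite3.connect(db_path)
    init_db(conn)
    while True:
        store_titles(conn, fetch_titles())
        time.sleep(FOUR_HOURS)
```

Using the title as the primary key means a headline that is still on the page at the next fetch is not stored twice. A long-running loop with time.sleep is the simplest option; running a one-shot script from cron or Windows Task Scheduler every 4 hours would work just as well.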
