
requests.get(url) in Python behaves differently when used in a loop

Original post: 2020-04-15 05:52:16. Tags: python, web-scraping, xml-parsing, python-requests, html-parsing

I'm new to Python programming and am trying to scrape every link listed in my Urls.txt file. The code I wrote is:

import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent
user_agent = UserAgent()
fp = open("Urls.txt", "r")
values = fp.readlines()
fin = open("soup.html", "a")
for link in values:
    print( link )
    page = requests.get(link, headers={"user-agent": user_agent.chrome})
    html = page.content
    soup = BeautifulSoup(html, "html.parser")
    fin.write(str(soup))

The code works fine when the links are passed directly as string literals instead of through a variable, but when run as written above, the output differs.

1 answer

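The string you read from the file probably ends with a line break: readlines() keeps the trailing "\n" on each line, so it becomes part of the URL passed to requests.get(). To remove it, use link.strip("\n"), or link.strip() to drop all surrounding whitespace.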
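A minimal sketch of that fix, assuming Urls.txt holds one URL per line. The helper name read_urls is hypothetical (it is not part of the question's code), and the block makes no network requests so it can be run on its own:

```python
def read_urls(path):
    """Return the URLs in a text file, one per line, without line breaks.

    read_urls is a hypothetical helper introduced for illustration.
    """
    urls = []
    with open(path, "r", encoding="utf-8") as fp:
        for line in fp:
            url = line.strip()  # drops the trailing "\n" and any surrounding whitespace
            if url:  # skip blank lines
                urls.append(url)
    return urls


# What a line looks like before and after stripping:
raw = "https://example.com/a\n"
print(repr(raw))          # 'https://example.com/a\n'
print(repr(raw.strip()))  # 'https://example.com/a'
```

In the question's loop, iterating over read_urls("Urls.txt") instead of fp.readlines() would hand requests.get() the same clean strings as the hard-coded links. Printing repr(link) rather than link is also a quick way to see hidden line breaks.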



The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint this post, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

