I have been trying to get this web scraping script working properly and am not sure what to try next. I'm hoping someone here knows what I should do.
I am using BS4. The problem is that whenever a URL takes a long time to load, the script skips over that URL, so the output file ends up with fewer entries when page load times are high. I have been trying to add a timer so that a URL is skipped only if it doesn't load within x seconds.
Can anyone point me in the right direction?
Thanks!
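The timer described above could look roughly like this. It is only a sketch: the original script is not shown, so it assumes the pages are fetched with the standard library's `urllib.request`, and the names `fetch_html`, `scrape`, and `TIMEOUT_SECONDS` (with its 10-second value) are hypothetical. The `opener` parameter is there only so the function can be tried without a network connection.

```python
import urllib.request

TIMEOUT_SECONDS = 10  # hypothetical stand-in for "x seconds"


def fetch_html(url, timeout=TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Return the page body as text, or None if the request fails or times out."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # URLError and socket timeouts are both subclasses of OSError.
        print(f"Skipping {url}: {exc}")
        return None


def scrape(urls, timeout=TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Map each URL to its HTML, or to None when the URL was skipped."""
    return {url: fetch_html(url, timeout, opener) for url in urls}
```

The returned HTML can then be passed to BeautifulSoup as before. Note that this timeout bounds each blocking socket operation (connecting, each read), not the total download time. If the script uses `requests` instead, `requests.get` also accepts a `timeout` argument.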
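Try using multithreading or multiprocessing to spawn workers. I think it would spawn a thread for each request, and it would not skip a URL even if that URL takes a long time.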
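One way the threading suggestion might be sketched is with `concurrent.futures.ThreadPoolExecutor`, which reuses a small pool of threads rather than creating one per request. This is an assumption about what the answer has in mind, not the asker's code: `fetch_all`, `download`, the 30-second timeout, and the pool size of 8 are all hypothetical.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def download(url):
    """Hypothetical fetch function: return the raw bytes of a page."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(urls, fetch=download, max_workers=8):
    """Run fetch(url) in worker threads; map each URL to its result or None."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        # as_completed yields each future as soon as its request finishes,
        # so one slow page does not hold up the others.
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                print(f"{url} failed: {exc}")
                results[url] = None
    return results
```

Because slow pages load in parallel with the rest, a generous timeout costs much less total time than it would in a sequential loop, and failed URLs are recorded as `None` instead of silently disappearing from the output.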