I want to get all the proxy IP addresses from https://free-proxy-list.net/
I decided it would be faster to get them from the page's source code.
The problem is that I see every address when I press CTRL + U, but when I use "page_source" I see only a few IPs instead of all of them.
Thanks for the help. For DebanjanB, here is the code. I didn't have to use selenium after all.
Here is the code:
import csv

import requests
import lxml.html

r = requests.get("https://free-proxy-list.net/")
html = lxml.html.fromstring(r.content)
# Text of the first and second cell of every table row
ip_list = html.xpath("//tr/td[1]/text()")
port_list = html.xpath("//tr/td[2]/text()")  # collected, but not written below

# Raw string so the backslash in the Windows path is taken literally
with open(r"E:\proxy_lista.csv", 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for ip in ip_list:
        spamwriter.writerow(ip.split())
# No explicit close() is needed: the with block closes the file
This is because only 20 table rows are currently displayed on the page.
If you just need to scrape those IP addresses, you can use python-requests + lxml.html instead of selenium:
import requests
import lxml.html
r = requests.get("https://free-proxy-list.net/")
html = lxml.html.fromstring(r.content)
ip_list = html.xpath("//tr/td[1]/text()")
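Note that "//tr/td[1]/text()" matches the first cell of every row in every table on the page, so the result may contain strings that are not IP addresses. A minimal sketch of one way to filter them, using only the standard library. The function name keep_valid_ips is hypothetical and not part of the original answer:

```python
import ipaddress


def keep_valid_ips(candidates):
    """Return only the strings that parse as IPv4 or IPv6 addresses."""
    valid = []
    for text in candidates:
        text = text.strip()  # cell text may carry surrounding whitespace
        try:
            ipaddress.ip_address(text)  # raises ValueError for non-addresses
        except ValueError:
            continue
        valid.append(text)
    return valid
```

It could be applied to the XPath result as ip_list = keep_valid_ips(ip_list).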
If you must use selenium, create an empty list, append() the required values to it, and click() the "Next" button. Do this in a while loop for as long as the "Next" button is enabled.
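The selenium approach could be sketched roughly as follows. Both locators are assumptions about the page's markup (a pager element with the id "proxylisttable_next" that gains a "disabled" class on the last page), and collect_ips and max_pages are hypothetical names, so check them against the live page before relying on this:

```python
# Assumed locators; the real page's markup may differ.
IP_CELLS_XPATH = "//table[@id='proxylisttable']/tbody/tr/td[1]"
NEXT_BUTTON_CSS = "#proxylisttable_next"


def collect_ips(driver, max_pages=100):
    """Walk the paginated table and return the first-column text of every row."""
    ips = []  # the empty list that values are appended to
    pages = 0
    while pages < max_pages:  # cap the loop so a wrong locator cannot spin forever
        # "xpath" and "css selector" are the values of selenium's
        # By.XPATH and By.CSS_SELECTOR constants.
        for cell in driver.find_elements("xpath", IP_CELLS_XPATH):
            ips.append(cell.text)
        next_button = driver.find_element("css selector", NEXT_BUTTON_CSS)
        # Assumption: a disabled "Next" button carries a "disabled" class.
        if "disabled" in (next_button.get_attribute("class") or ""):
            break
        next_button.click()
        pages += 1
    return ips


# Usage (requires selenium and a browser driver):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://free-proxy-list.net/")
#   print(collect_ips(driver))
#   driver.quit()
```

The cells are looked up again on every pass because the table rows are replaced each time the page changes.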