
Python 3.x, need help iterating through a proxy text file line by line

I'm relatively new to Python, and I'm trying to build a program that visits a website through each proxy listed in a text file, one after another, until they have all been used. I found some code online and tweaked it to my needs. When I run the program, the proxies work, but they aren't used in order. For whatever reason, the first proxy gets used twice in a row, then the second, then the first again, then the third, and so on. It doesn't go through them one by one.

The proxies in the text file are organized as such:

123.45.67.89:8080
987.65.43.21:8080

And so on. Here's the code I am using:

from fake_useragent import UserAgent
import pyautogui
import webbrowser
import time
import random
import sys
import requests
from selenium import webdriver
import os
import re

proxylisttext = 'proxylistlist.txt'
useragent = UserAgent()
profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)
profile.set_preference("network.proxy_type", 1)

def Visiter(proxy1):
    try:
        proxy = proxy1.split(":")
        print ('Visit using proxy :',proxy1)
        profile.set_preference("network.proxy.http", proxy[0])
        profile.set_preference("network.proxy.http_port", int(proxy[1]))
        profile.set_preference("network.proxy.ssl", proxy[0])
        profile.set_preference("network.proxy.ssl_port", int(proxy[1]))
        profile.set_preference("general.useragent.override", useragent.random)
        driver = webdriver.Firefox(firefox_profile=profile)
        driver.get('https://www.iplocation.net/find-ip-address')
        time.sleep(2)
        driver.close()
    except:
        print('Proxy failed')
        pass

def loadproxy():
    try:
        get_file = open(proxylisttext, "r+")
        proxylist = get_file.readlines()
        writeused = get_file.write('used')
        count = 0
        proxy = []
        while count < 10:
            proxy.append(proxylist[count].strip())
            count += 1
            for i in proxy:
                Visiter(i)
    except IOError:
        print ("\n[-] Error: Check your proxylist path\n")
        sys.exit(1)

def main():
    loadproxy()
if __name__ == '__main__':
    main()

As I said, this code successfully navigates to the IP checker site through a proxy, but it doesn't go through the file line by line in order: the same proxy gets used multiple times. More specifically, how can I ensure the program iterates through the proxies one by one, without repeating? I have searched extensively for a solution but haven't found one, so any help would be appreciated. Thank you.

Your problem is with these nested loops, which don't appear to be doing what you want:

    proxy = []
    while count < 10:
        proxy.append(proxylist[count].strip())
        count += 1
        for i in proxy:
            Visiter(i)

The outer loop builds up the proxy list, adding one value each time until there are ten. After each value has been added, the inner loop iterates over the proxy list that has been built so far, visiting each item.
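To make the effect concrete, here is a small sketch of the same loop structure with Visiter swapped for a stand-in that only records its argument. The names record_visit and visited and the sample addresses are hypothetical, and the loop bound is the list length rather than 10 so that it runs on three entries.

```python
# Stand-in for Visiter: records each call instead of opening a browser.
visited = []

def record_visit(proxy):
    visited.append(proxy)

# Hypothetical sample data in the same host:port format as the question.
proxylist = ["192.0.2.1:8080\n", "192.0.2.2:8080\n", "192.0.2.3:8080\n"]

proxy = []
count = 0
while count < len(proxylist):
    proxy.append(proxylist[count].strip())
    count += 1
    for i in proxy:  # nested: re-visits everything collected so far
        record_visit(i)

for entry in visited:
    print(entry)
```

With three proxies this makes six visits (first; first, second; first, second, third), which is consistent with the repeated, seemingly out-of-order visits described in the question.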

I suspect you want to unnest the loops. That way, the for loop will only run after the while loop has completed, and so it will only visit each proxy once. Try something like this:

    proxy = []
    while count < 10:
        proxy.append(proxylist[count].strip())
        count += 1
    for i in proxy:
        Visiter(i)

You could simplify that into a single loop, if you want. For instance, using itertools.islice to handle the bounds checking, you could do:

    import itertools

    for proxy in itertools.islice(proxylist, 10):
        Visiter(proxy.strip())
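As a rough illustration of what "bounds checking" buys you here: if the file has fewer than ten lines, indexing the list past its end raises IndexError, while islice simply stops early. The sample list below is hypothetical.

```python
import itertools

# Hypothetical list with fewer than 10 entries, as readlines() would return it.
proxylist = ["192.0.2.1:8080\n", "192.0.2.2:8080\n", "192.0.2.3:8080\n"]

# Indexing past the end of the list raises IndexError...
try:
    proxylist[len(proxylist)]
    index_error_raised = False
except IndexError:
    index_error_raised = True

# ...whereas islice yields at most 10 items and stops early if there are fewer.
first_ten = [line.strip() for line in itertools.islice(proxylist, 10)]

print(index_error_raised)
print(first_ten)
```

Note that the original loadproxy only catches IOError, so an IndexError from a short file would not be handled there.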

You could even run that directly on the file object (files are iterable) rather than calling readlines first to read it into a list. You might then need to add a seek call on the file before writing "used", but you may need that anyway: some OSs don't allow you to mix reads and writes without seeking in between.
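A minimal sketch of that idea, assuming the intent of the write is to append a marker at the end of the file (which is where the position ends up after readlines in the original). The function name visit_first_n, its parameters, and the demo file contents are hypothetical.

```python
import itertools
import os
import tempfile

def visit_first_n(path, n, visit):
    """Call visit() on up to n stripped lines of path, then append a marker."""
    with open(path, "r+") as get_file:
        # Iterate over the file object directly; islice stops after n lines.
        for line in itertools.islice(get_file, n):
            visit(line.strip())
        # Seek to the end explicitly before switching from reading to writing.
        get_file.seek(0, os.SEEK_END)
        get_file.write("used\n")

# Demo on a throwaway file with hypothetical proxies.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("192.0.2.1:8080\n192.0.2.2:8080\n192.0.2.3:8080\n")
    path = tmp.name

seen = []
visit_first_n(path, 2, seen.append)
with open(path) as check:
    contents = check.read()
os.remove(path)

print(seen)
```

In real use you would pass Visiter as the visit argument. The explicit seek matters because the text layer reads ahead in chunks, so the underlying file position after a partial read is not necessarily where the last yielded line ended.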

    while count < 10:
        proxy.append(proxylist[count].strip())
        count += 1
        for i in proxy:
            Visiter(i)

The for loop within the while loop means that every time you hit proxy.append you'll call Visiter for every item already in proxy. That might explain why you're getting multiple hits per proxy.

As for the out-of-order issue, readlines() returns lines in file order, so I suspect the nested loop is also what makes the order look wrong. Either way, I'd try something like:

    with open('filepath', 'r') as file:
        for line in file:
            do_stuff_with_line(line)

With the above you don't need to hold the whole file in memory at once either, which can be nice for big files.
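Here is one way that pattern could look as a small generator, shown on a throwaway file so the ordering can be checked. The name iter_proxies, the blank-line skipping, and the sample addresses are my own assumptions, not part of the original code.

```python
import os
import tempfile

def iter_proxies(path):
    """Yield non-empty, stripped lines from path, one at a time, in file order."""
    with open(path, "r") as file:
        for line in file:
            line = line.strip()
            if line:  # skip blank lines (an assumption about the file format)
                yield line

# Demo on a throwaway file with hypothetical proxies and one blank line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("192.0.2.1:8080\n192.0.2.2:8080\n\n192.0.2.3:8080\n")
    demo_path = tmp.name

order = list(iter_proxies(demo_path))
os.remove(demo_path)

print(order)
```

In the question's program, the loop body would then be a single pass such as `for proxy in iter_proxies(proxylisttext): Visiter(proxy)`, so each proxy is visited exactly once.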

Good luck!
