I'm trying to read a text file containing random wordlists and send a GET request with PycURL to an API while iterating through the file line by line in a loop at a fast rate. My understanding of the threading is that each thread that is created re-opens the text file, hence there are duplicates when I print the iterated words. So my question is: is it possible to iterate through the file line by line, in order, with multithreading? If so, how?
import threading

usernames = open('words.txt', 'r').read().splitlines()

def check():
    for user in usernames:
        print(user)

print('Starting')
t = []
for i in range(150):
    threads = threading.Thread(target=check)
    threads.start()
    t.append(threads)
for thread in t:
    thread.join()
I've tried different approaches, but each resulted in either the output spamming words out of order, the iteration not working properly, or the program being slowed down by nesting the iteration inside the thread-creating for loop.
Thank you.
Try to avoid creating one thread per task; it makes concurrency hard to control (and distributing arguments to the threads takes extra effort).
Futures can be used to manage concurrency.
from concurrent.futures import ThreadPoolExecutor
You still write a worker function:
def process(item):
    print(item)
Create a pool of a fixed size, and read the file once on the main thread:
with open('somefile.txt', 'r') as infile, ThreadPoolExecutor(max_workers=10) as pool:
    pool.map(process, infile)
pool.map uses infile as an iterator in this case (text file objects are iterators over their lines). You could also return a value from process and then iterate over map()'s return value.
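A minimal sketch of that last point. The in-memory list of lines and the per-item work (returning the stripped word and its length) are stand-ins for the real file and the real PycURL request, but the ordering behavior shown is how map() actually works:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    # Stand-in for the real pycurl GET request: compute something per line.
    user = item.strip()
    return user, len(user)

# Stands in for the open file object; any iterable of lines works.
lines = ['alice\n', 'bob\n', 'carol\n']

with ThreadPoolExecutor(max_workers=10) as pool:
    # map() yields results in the order of the inputs, even though the
    # worker threads may finish in a different order.
    for user, length in pool.map(process, lines):
        print(user, length)
```

This is why map() answers the "in order" part of the question: the workers run concurrently, but the results come back in input order.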