
Simple python script, multithreaded?

I have a simple Python script that reads from two .txt files and saves the output to a new text file.

This is what my code looks like:

import os

with open("users.txt", encoding="utf-8") as f:
    users = f.readlines()
users = [x.strip() for x in users] 

with open("users1.txt", encoding="utf-8") as f:
    passes = f.readlines()
passes = [x.strip() for x in passes]

results = []

for user in users:
    h = user[user.find(':') + 1:]
    for p in passes:
        if p[:p.find(':')] == h:
            results.append((user[:user.find(':')], p[p.find(':') + 1:]))
            passes.remove(p)
            break

with open('results.txt', 'w+', encoding="utf-8") as f:
    for item in results:
        f.write(f"{item[0]}:{item[1]}\n")

The users.txt file has about 3-10 million lines, which of course takes ages to process; I'm talking 1-3 hours. I figured the only reason for this is that the script runs in a single thread. I did some digging around and I'm out of ideas. Is there any way to make this simple script run on more than one thread? Would that speed up the process?

Thanks!

It would save you a lot of time if you first converted passes into a dictionary of some sort. I don't completely understand your code, but it seems that each user and each pass has some sort of id or key embedded in it, and you're looking to find matching ids.

Convert passes into a dictionary

# note: "pass" is a reserved word in Python, so use another name for the loop variable
my_dictionary = {p[:p.find(':')]: p for p in passes}

Your outer loop is then:

for user in users:
    key = user[user.find(':') + 1:]
    if key in my_dictionary:
        p = my_dictionary[key]
        # ... whatever you do with user and p ...
        del my_dictionary[key]

That's a single dictionary lookup per user instead of a nested loop over every pass, which turns the quadratic O(n·m) scan into roughly O(n + m).
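Putting it together, here is a minimal sketch of how the whole script could look with the dictionary approach. The file names and the colon-delimited format are taken from your original code; the variable names are just placeholders:

with open("users.txt", encoding="utf-8") as f:
    users = [line.strip() for line in f]

# Build a hash -> password lookup once, instead of rescanning the list for every user.
passes = {}
with open("users1.txt", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        sep = line.find(':')
        passes[line[:sep]] = line[sep + 1:]

results = []
for user in users:
    sep = user.find(':')
    name, h = user[:sep], user[sep + 1:]
    if h in passes:
        # pop() mirrors your passes.remove(p), so each pass is used only once
        results.append((name, passes.pop(h)))

with open("results.txt", "w", encoding="utf-8") as f:
    for name, value in results:
        f.write(f"{name}:{value}\n")

With the matching done via hashing, the script becomes mostly I/O-bound, so it should finish in a fraction of the time without needing extra threads.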
