
Why is my multiprocessing.Pool apply_async only executed once inside a for loop

I am trying to write a crawler for a web security project, and I'm seeing strange behaviour in a method that uses multiprocessing.

What should this method do? It iterates over the target web pages that were found, each with a list of discovered query parameters. For each web page, it should apply the method phase1 (my attack logic) to every query parameter associated with that page.

Meaning, if I have http://example.com/sub.php with page and secret as query parameters, and http://example.com/s2.php with topsecret as its parameter, it should do the following:

I know whether an attack is happening based on the timing and the output of phase1.

What actually happens

Only the first attack is executed. The subsequent calls to apply_async are ignored. However, the code still cycles through the loop, since it still prints the output of the for loops shown below.

What is going wrong here? Why is the attack routine not triggered? I have looked at the multiprocessing docs, but they do not explain this phenomenon.

Some answers to related questions suggested using terminate and join, but isn't this done implicitly here, since I'm using the with statement?
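On the with statement: according to the multiprocessing documentation, leaving a Pool context manager calls terminate(), not close() followed by join(), so it does not wait for outstanding tasks. Results therefore have to be collected inside the block, which the method below already does with i.get(). A minimal sketch of that pattern follows. It assumes a hypothetical worker named square, and it uses multiprocessing.pool.ThreadPool (which shares the apply_async API with Pool) only to keep the example self-contained.

```python
from multiprocessing.pool import ThreadPool  # same apply_async API as multiprocessing.Pool


def square(x):
    """Hypothetical worker used only for this illustration."""
    return x * x


with ThreadPool(processes=2) as pool:
    pending = [pool.apply_async(square, args=(n,)) for n in range(5)]
    # Collect inside the block: leaving it calls pool.terminate(),
    # which does not wait for tasks that are still outstanding.
    squares = [p.get() for p in pending]

print(squares)
```

Because get() blocks until each task has finished, every result is available before the pool is terminated, so the with statement on its own would not explain tasks that never run.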

Also, this question (Multiprocessing pool 'apply_async' only seems to call function once) sounds very similar, but is different from my problem. In contrast to that question, my problem is not that only 1 worker executes the code, but that my X workers are only spawned once (instead of Y times).

What I've tried: putting with..Pool outside of the loops, but nothing changed.

The method in question is the following:

def analyzeParam(siteparams, paysplit, victim2, verbose, depth, file, authcookie):
    result = {}
    subdir = parseUrl(viclist[0])
    for victim, paramlist in siteparams.items():
        sub = {}
        print("\n{0}[INFO]{1} param{4}|{2} Attacking {3}".format(color.RD, color.END + color.O, color.END, victim, color.END+color.RD))
        time.sleep(1.5)
        for param in paramlist:
            payloads = []
            nullbytes = []
            print("\n{0}[INFO]{1} param{4}|{2} Using {3}\n".format(color.RD, color.END + color.O, color.END, param, color.END+color.RD))
            time.sleep(1.5)
            with Pool(processes=processes) as pool:
                res = [pool.apply_async(phase1, args=(1,victim,victim2,param,None,"",verbose,depth,l,file,authcookie,"",)) for l in paysplit]
                for i in res:
                    #fetch results
                    tuples = i.get()
                    payloads += tuples[0]
                    nullbytes += tuples[1]
            sub[param] = (payloads, nullbytes)
            time.sleep(3)
        result[victim] = sub
    if not os.path.exists(cachedir+subdir):
        os.makedirs(cachedir+subdir)
    with open(cachedir+subdir+"spider-phase2.json", "w+") as f:
        json.dump(result, f, sort_keys=True, indent=4)
    return result

Some technical information:

  • Python version: 3.8.5
  • I doubt that the bug lies in phase1, since it acts as intended when called with Pool multiple times outside of a loop. If you want to look it up, the source code is here: https://github.com/VainlyStrain/Vailyn

How do I fix this? Thanks!

Big kudos to jasonharper for finding the issue! The problem was not the code structure above, but the variable paysplit, which was a generator and was exhausted after the first pass over it. Every later "for l in paysplit" therefore yielded nothing, so no further apply_async calls were made.
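The effect can be reproduced in a few lines. The sketch below is not the project's code: phase1_stub and split_payloads are hypothetical stand-ins for phase1 and for whatever produces paysplit, and multiprocessing.pool.ThreadPool is swapped in for Pool (same apply_async API) so the example runs without process start-up concerns. It compares passing a generator with passing a list built from it.

```python
from multiprocessing.pool import ThreadPool  # same apply_async API as multiprocessing.Pool


def phase1_stub(param, chunk):
    """Hypothetical stand-in for phase1: tag each payload with the parameter."""
    return [f"{param}:{item}" for item in chunk]


def split_payloads(payloads, size):
    """Hypothetical splitter that, like paysplit, is a generator."""
    for start in range(0, len(payloads), size):
        yield payloads[start:start + size]


def run(params, paysplit):
    """Submit one task per chunk for every parameter; count processed payloads."""
    processed = {}
    for param in params:
        with ThreadPool(processes=2) as pool:
            res = [pool.apply_async(phase1_stub, args=(param, chunk)) for chunk in paysplit]
            processed[param] = sum(len(r.get()) for r in res)
    return processed


payloads = ["a", "b", "c", "d"]

# Generator: consumed while handling the first parameter, empty afterwards.
broken = run(["page", "secret"], split_payloads(payloads, 2))

# List: can be iterated again for every parameter.
fixed = run(["page", "secret"], list(split_payloads(payloads, 2)))

print(broken)
print(fixed)
```

With the generator, only the first parameter gets any tasks. Materialising the chunks once with list(...) before the loops (or re-creating the generator for each parameter) lets every parameter see all of them, at the cost of holding the chunks in memory.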

Again, thank you for pointing this out!

Best
