Multiprocessing in a Python loop

Question

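I am generating negative pairs from my positive pairs, and I want to speed this up by using all CPU cores. On a single CPU core it takes almost five days, running day and night.

I would like to convert the following code to use multiprocessing. Note that I do not have the "positives_negatives.csv" file yet.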

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    for combo in tqdm(itertools.combinations(identities.values(), 2), desc="Negatives"):
        for cross_sample in itertools.product(combo[0], combo[1]):
            negatives = negatives.append(pd.Series({"file_x": cross_sample[0], "file_y": cross_sample[1]}).T,
                                         ignore_index=True)
    negatives["decision"] = "No"
    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)

Modified code

def multi_func(iden, negatives):
    for combo in tqdm(itertools.combinations(iden.values(), 2), desc="Negatives"):
        for cross_sample in itertools.product(combo[0], combo[1]):
            negatives = negatives.append(pd.Series({"file_x": cross_sample[0], "file_y": cross_sample[1]}).T,
                                         ignore_index=True)

Used as

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    with concurrent.futures.ProcessPoolExecutor() as executor:
        secs = [5, 4, 3, 2, 1]
        results = executor.map(multi_func(identities, negatives), secs)

    negatives["decision"] = "No"
    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)

Answer 1

The best approach is to use the ProcessPoolExecutor class and move the per-combination work into a separate function. You could implement it like this:

Libraries

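import itertools
from concurrent.futures import ProcessPoolExecutor
from os import cpu_count
from pathlib import Path

import more_itertools
import pandas as pd
from tqdm import tqdm

def compute_cross_samples(x):
    # x is one pair of identities; build every (file_x, file_y) combination across them
    return pd.DataFrame(itertools.product(*x), columns=["file_x", "file_y"])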
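
To see what the helper computes for a single combination, here is a minimal sketch with made-up file names. It returns plain tuples instead of a DataFrame so that it runs with the standard library alone; `cross_pairs` and the sample data are illustrative names, not part of the original code.

```python
import itertools

# Hypothetical sample data: one combination of two identities,
# each represented by its list of image files.
combo = (["alice_1.jpg", "alice_2.jpg"], ["bob_1.jpg"])

def cross_pairs(x):
    # Same idea as compute_cross_samples, but returning a list of tuples:
    # every file of the first identity paired with every file of the second.
    return list(itertools.product(*x))

pairs = cross_pairs(combo)
print(pairs)
# [('alice_1.jpg', 'bob_1.jpg'), ('alice_2.jpg', 'bob_1.jpg')]
```

Each tuple corresponds to one row (file_x, file_y) of the DataFrame built by the helper above.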

Modified code

if Path("positives_negatives.csv").exists():
    df = pd.read_csv("positives_negatives.csv")
else:
    with ProcessPoolExecutor() as pool:
        # take cpu_count combinations from identities.values
        for combos in tqdm(more_itertools.ichunked(itertools.combinations(identities.values(), 2), cpu_count())):
            # for each combination iterator that comes out, calculate the cross
            for cross_samples in pool.map(compute_cross_samples, combos):
                # append this batch of cross samples to negatives
                # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
                negatives = pd.concat([negatives, cross_samples])

    negatives["decision"] = "No"

    negatives = negatives.sample(positives.shape[0])
    df = pd.concat([positives, negatives]).reset_index(drop=True)
    df.to_csv("positives_negatives.csv", index=False)
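
A possible refinement, sketched under assumptions: growing a DataFrame inside the loop copies all existing rows on every step, so it may be faster to collect the per-combination results in a list and combine them once at the end (with pandas, that would be a single `pd.concat` over the list). The sketch below shows the pattern with the standard library only; `negative_pairs`, `cross_pairs`, the toy `identities` dict and the sample size are hypothetical stand-ins for the real data.

```python
import itertools
import random
from concurrent.futures import ProcessPoolExecutor

def cross_pairs(combo):
    """Return every (file_x, file_y) pair across the two identities in combo."""
    return list(itertools.product(*combo))

def negative_pairs(identities, map_fn=map):
    """Collect all cross-identity pairs; map_fn can be the built-in map or pool.map."""
    combos = itertools.combinations(identities.values(), 2)
    pairs = []
    for chunk in map_fn(cross_pairs, combos):
        pairs.extend(chunk)  # extend the list in place instead of rebuilding it
    return pairs

# Hypothetical toy data standing in for the real `identities` dict.
identities = {
    "alice": ["a1.jpg", "a2.jpg"],
    "bob": ["b1.jpg"],
    "carol": ["c1.jpg", "c2.jpg"],
}

if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start workers by re-importing the main module.
    with ProcessPoolExecutor() as pool:
        pairs = negative_pairs(identities, map_fn=pool.map)
    # Mirrors negatives.sample(positives.shape[0]); 3 is an arbitrary size here.
    sample = random.sample(pairs, k=3)
    print(len(pairs), sample)
```

Note that `pool.map` receives the function itself plus an iterable of arguments, not the result of calling the function. For many small combinations, passing a `chunksize` argument to `pool.map` may also reduce inter-process overhead.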

Solution 1
1 Accepted 2021-01-31 13:43:42
