
Python multithreading/multiprocessing & limiting CPU core affinity

In Python, you can create new threads and processes to run a given task with multiprocessing.Pool, multiprocessing.pool.ThreadPool, concurrent.futures.ProcessPoolExecutor, and concurrent.futures.ThreadPoolExecutor.

By default, those threads/processes run with the same CPU core affinity as their parent process, which is normally all available cores/threads.

On Linux/Unix systems, it is possible to change the CPU core affinity using os.sched_setaffinity(pid, mask). The drawback is that this function is only available on some Linux/Unix systems.

The psutil Python library also exposes the ability to set CPU core affinity, via psutil.Process().cpu_affinity(CPUS), where CPUS is a list of integers (starting at 0) identifying which CPU cores/threads the process may use.
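The two calls mentioned above can be wrapped in a pair of small helpers. This is only a sketch: the names get_affinity and set_affinity are made up for illustration, and it assumes that psutil, if used, is installed separately (it is not part of the standard library).

```python
import os


def get_affinity():
    """Return the set of CPU indices this process may run on, or None if unknown."""
    if hasattr(os, "sched_getaffinity"):
        # Only present on some Unix platforms (notably Linux).
        return set(os.sched_getaffinity(0))
    try:
        import psutil  # third-party, assumed to be installed
        return set(psutil.Process().cpu_affinity())
    except (ImportError, AttributeError):
        # psutil is missing, or this platform has no cpu_affinity support.
        return None


def set_affinity(cpus):
    """Restrict the current process to `cpus`; return True if it was applied."""
    cpus = sorted(set(cpus))
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
        return True
    try:
        import psutil  # third-party, assumed to be installed
        psutil.Process().cpu_affinity(cpus)
        return True
    except (ImportError, AttributeError):
        return False
```

Calling set_affinity([0, 1]) would then restrict the current process to the first two logical CPUs wherever one of the two mechanisms is available.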

The issue is that the OS CPU scheduler is generally better placed to pick which core/thread a given process should run on than an end user deciding which CPU cores/threads to use.

My question is whether it is possible to create the thread/process pools and limit each worker to X CPU cores/threads without fixing their exact core affinity.

For example, if I have a PC with 16 cores and want to create 4 processes, I can create a multiprocessing.Pool(processes=4) object. If I then wanted each of those 4 children to be limited to only 2 CPU cores, I would have to use psutil to preemptively choose 2 CPU cores and assign them to one process, remove those 2 cores from the list of available CPU cores, and repeat this for all 4 processes.
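To make the manual approach concrete, here is a rough sketch of it. The helper names partition_cores, pin_worker and run_pinned_pool are hypothetical, psutil is assumed to be installed, and the cores are simply handed out in index order, which is exactly the kind of arbitrary choice I would like to avoid.

```python
import multiprocessing as mp


def partition_cores(cores, workers, per_worker):
    """Split `cores` into `workers` disjoint groups of `per_worker` cores each."""
    cores = list(cores)
    if workers * per_worker > len(cores):
        raise ValueError("not enough cores for the requested layout")
    return [
        cores[i * per_worker:(i + 1) * per_worker] for i in range(workers)
    ]


def pin_worker(groups):
    """Pool initializer: take one core group from the queue and pin to it."""
    import psutil  # third-party, assumed to be installed
    psutil.Process().cpu_affinity(groups.get())


def run_pinned_pool(func, items, cores, workers=4, per_worker=2):
    """Run func over items in a pool whose workers are pinned to fixed cores."""
    groups = mp.Queue()
    for group in partition_cores(cores, workers, per_worker):
        groups.put(group)
    with mp.Pool(workers, initializer=pin_worker, initargs=(groups,)) as pool:
        return pool.map(func, items)
```

With 16 cores, partition_cores(range(16), 4, 2) yields the groups [0, 1], [2, 3], [4, 5] and [6, 7]. Note that this sketch puts exactly one group per worker on the queue, so a replacement worker started by the pool would block waiting for a group.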

But this would not be ideal: what if I gave one process the two weakest cores in the system? Or what if those 2 cores were physically far apart (as is the case with modern multi-chiplet AMD Ryzen CPUs or dual-socket systems)?

I would want to let the OS schedule 2 cores for each process automatically and juggle them as it sees fit, rather than have to manually set and unset the CPU cores for each process.

Is there a way this can be done in Python?

Some time ago I had a similar need, so I wrote a CPUResourceManager class to keep track of which cores I had assigned to a process. You call the get_processors method to get a list of cores you are going to use, then set the core affinity using psutil as you are already doing. When your process is done, return the cores using the free_processors method.

from enum import Enum
from multiprocessing import cpu_count
from typing import NamedTuple


class CPUResponse(NamedTuple):
    """
    This is the response when the CPUResourceManager is asked for some cores.
    """
    success: bool     # whether or not there are enough cores available
    processors: list  # the list of processors to be used by the process


class ProcState(Enum):
    """
    Represents the state of a processor core.  This only
    represents what we are having the cores do, not what other unrelated
    processes on the machine are doing.
    """
    idle = 0
    busy = 1


class CPUResourceManager:
    def __init__(self, cpu_count=max(cpu_count() - 2, 1)) -> None:

        self.cpu_count = cpu_count

        self.processors = {i: ProcState.idle for i in range(self.cpu_count)}

    def cpu_available_count(self):
        available = [
            p for p, state in self.processors.items() if state == ProcState.idle
        ]
        return len(available)

    def get_processors(self, count=1):
        """Get some available cores"""
        available = [
            p for p, state in self.processors.items() if state == ProcState.idle
        ]

        if len(available) >= count:
            cpus = available[:count]
            for p in cpus:
                self.processors[p] = ProcState.busy
            return CPUResponse(True, cpus)
        else:
            return CPUResponse(False, [])

    def free_processors(self, processors: list):
        """return the cores when you are done"""
        for p in processors:
            if p in self.processors:
                self.processors[p] = ProcState.idle
            else:
                # manager was likely resized and this processor
                # should no longer be considered available
                pass
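One way to make sure the cores are always handed back is to wrap the two calls in a context manager. This is only a sketch: reserved_cores is a hypothetical helper that is not part of the class above, and it assumes only that the manager object offers get_processors and free_processors as shown.

```python
from contextlib import contextmanager


@contextmanager
def reserved_cores(manager, count=1):
    """Reserve `count` cores from `manager` and free them again on exit."""
    response = manager.get_processors(count)
    if not response.success:
        raise RuntimeError(f"fewer than {count} idle cores available")
    try:
        # Hand the reserved core indices to the caller, e.g. for cpu_affinity().
        yield response.processors
    finally:
        # Runs even if the body raises, so cores are never leaked.
        manager.free_processors(response.processors)
```

Inside a `with reserved_cores(manager, 2) as cores:` block you would pass cores to psutil.Process(pid).cpu_affinity(cores) for the process you start, and the cores go back to the idle state when the block ends.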
