
Python: How many cores are used by my python program with five processes?

I have a Python program consisting of five processes in addition to the main process. I'm looking to get an AWS server or something similar to run the script on, but how can I find out how many vCPU cores the script uses, or how many it needs? I have looked at:

import multiprocessing

multiprocessing.cpu_count()

But it seems that it just returns the number of CPUs on the system. I just need to know how many vCPU cores the script actually uses.

Thanks for your time.

EDIT:

Just for some more information: the processes run indefinitely.

Your question uses some general terms and leaves much unspecified, so the answers must be general.

It is assumed you are managing the processes using either multiprocessing.Process directly or a ProcessPoolExecutor.
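For concreteness, here is a minimal sketch of the assumed setup (the worker function and its body are placeholders, not your actual code):

import multiprocessing
import time

def worker(worker_id):
    # Placeholder body: each worker runs indefinitely, as described in the question.
    while True:
        time.sleep(1)  # real work would go here

if __name__ == "__main__":
    # Five worker processes outside of the main process.
    procs = [multiprocessing.Process(target=worker, args=(i,)) for i in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()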

In some cases a vCPU is a logical processor, but per the following link there are services offering fractional-vCPU configurations, such as shared environments...

What is vCPU in AWS

You mention/ask...

... Now I'm looking to get an AWS server or something similar on which I can run the script. ...

... But how can I find out how many vCPU cores are used by the script/how many are needed? ...

You state AWS or something like it. The answer depends on what your subprocesses do, and how much of a vCPU or fractional vCPU each subprocess needs. Generally, a vCPU is analogous to a logical processor upon which a thread can execute. A fractional vCPU allows only some limited usage of a vCPU, rather than otherwise "full" or complete usage.

What one or more vCPUs (or fractions thereof) mean for your subprocesses really depends on what those subprocesses do. If one subprocess sits waiting on I/O most of the time, you hardly need a dedicated vCPU for it.
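To illustrate the difference, here is a hedged sketch with two hypothetical workers: io_bound blocks almost all the time (the sleep stands in for waiting on a socket or disk) and needs almost no vCPU, while cpu_bound never blocks and will pin one vCPU at ~100%:

import multiprocessing
import time

def io_bound():
    while True:
        time.sleep(5)          # stand-in for blocking I/O; uses ~0% CPU

def cpu_bound():
    x = 0
    while True:
        x = (x + 1) % 1000003  # pure computation; saturates one core

if __name__ == "__main__":
    multiprocessing.Process(target=io_bound, daemon=True).start()
    multiprocessing.Process(target=cpu_bound, daemon=True).start()
    time.sleep(30)             # watch the two processes in top/htop meanwhile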

I recommend starting with a minimal, least-expensive configuration and seeing how it works with your app's expected workload. If you are not happy, scale the configuration up as needed.

If it helps...

I usually use subprocesses when I need simultaneous execution that avoids Python's GIL limitations, by breaking the work into subprocesses. I generally use a single active thread per subprocess; any other threads in the same subprocess are usually waiting on I/O and do not otherwise compete with the primary active thread. Of course, a subprocess could be dedicated to I/O if you want to separate that from the threads you place in other subprocesses.
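A sketch of that pattern (all names are illustrative): each subprocess has one primary compute thread, plus a helper thread that is blocked on a queue nearly all of the time:

import multiprocessing
import queue
import threading

def subprocess_main(jobs):
    results = queue.Queue()

    def io_helper():
        # Secondary thread: blocked on results.get() most of the time,
        # so it barely competes with the primary thread for the GIL.
        while True:
            item = results.get()
            print("done:", item)   # stand-in for writing to disk/network
            results.task_done()

    threading.Thread(target=io_helper, daemon=True).start()

    # Primary (active) thread of this subprocess.
    for job in iter(jobs.get, None):
        results.put(job * job)     # stand-in for real work
    results.join()                 # let the helper drain its queue

if __name__ == "__main__":
    jobs = multiprocessing.Queue()
    p = multiprocessing.Process(target=subprocess_main, args=(jobs,))
    p.start()
    for n in range(10):
        jobs.put(n)
    jobs.put(None)                 # sentinel: tells the worker to exit
    p.join()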

Since we do not know your app's purpose, architecture and many other factors, it's hard to say more than the generalities above.

The answer to this post probably lies in the following question:

Multiprocessing: More processes than cpu.count

In short, you probably have hundreds of processes running, but that doesn't mean you use hundreds of cores. It all depends on the utilization and workload of the processes.

You can also get some additional info from the psutil module:

import psutil

print(psutil.cpu_percent())  # system-wide CPU utilization since the last call, in percent
print(psutil.cpu_stats())    # context switches, interrupts, soft interrupts, syscalls
print(psutil.cpu_freq())     # current, min and max CPU frequency
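One caveat: the first call to psutil.cpu_percent() without an interval returns a meaningless 0.0, because it compares against the previous call. Blocking for an interval, and asking for per-CPU figures, gives a more useful picture:

import psutil

# Blocks for one second, then returns one utilization figure per logical CPU.
for i, pct in enumerate(psutil.cpu_percent(interval=1, percpu=True)):
    print(f"cpu{i}: {pct}%")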

or use the os module to compute the current CPU usage in Python:

import os
import psutil

# 1-, 5- and 15-minute system load averages
l1, l2, l3 = psutil.getloadavg()
# Normalize the 15-minute load average by the CPU count to get a percentage
CPU_use = (l3 / os.cpu_count()) * 100

print(CPU_use)
(Credit: DelftStack)

Edit

There might be some useful information for you in the following Medium article; it may also mention some tools for monitoring CPU usage: https://medium.com/survata-engineering-blog/monitoring-memory-usage-of-a-running-python-program-49f027e3d1ba

On Linux you can use the "top" command at the command line to monitor the real-time activity of all threads of a process id:

top -H -p <process id>
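To know which process id to pass, you can have the script print its own PID at startup; a minimal sketch:

import os

# Print the PID so you can attach a monitor to it: top -H -p <pid>
print("monitor me with: top -H -p", os.getpid())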

I'll try to give my own summary of "I just need to know how many vCPU cores the script uses".

There is no way to answer that properly other than running your app and monitoring its resource usage. Assuming your Python processes do not spawn subprocesses of their own (which could even be multithreaded applications), all we can say is that your app won't utilize more than six cores (the total number of processes, including the main one). There are a ton of ways for a program to under-utilize CPU cores, like waiting for I/O (disk or network) or inter-process synchronization (shared resources). So to get any kind of understanding of CPU utilization, you really need to measure the actual performance (e.g., with the htop utility on Linux or macOS) and investigate the causes of any underperformance.
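If you want a number rather than a live view, the same measurement can be scripted with psutil: sample the main process and all of its children over a window, then divide the summed CPU percentage by 100 to estimate how many cores are actually in use. A sketch under that assumption (main_pid is a hypothetical placeholder for your script's PID):

import time
import psutil

main_pid = 12345                          # hypothetical: the PID of your script
main = psutil.Process(main_pid)
procs = [main] + main.children(recursive=True)

for p in procs:
    p.cpu_percent(interval=None)          # first call just primes the counters

time.sleep(5)                             # measurement window

total = sum(p.cpu_percent(interval=None) for p in procs)
print(f"~{total / 100:.2f} cores in use across {len(procs)} processes")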

Hope it helps.

Your computer has hundreds if not thousands of processes running at any given point. How does it handle all of those with, say, only 5 cores? The thing is, each core runs a process for a certain amount of time, or until that process has nothing left to do.

For example, if I create a script that calculates the square root of all numbers from 1 to, say, a billion, you will see a single core hit max usage; then a split second later another core hits max while the first drops back to normal, and so on until the calculation is done.
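That behaviour is easy to reproduce; a single-process version of the experiment might look like this (it runs for a while, so watch it in top or Task Manager, where you will see one core pinned at a time as the OS migrates the process between cores):

import math

# CPU-bound loop: keeps exactly one core at ~100% until it finishes.
for n in range(1, 1_000_000_000):
    math.sqrt(n)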

Or, if the process waits for an I/O operation, the core has nothing to do, so it drops the process and moves on to another one. When the I/O operation is done, a core can pick the process back up and get back to work.

You can run your multiprocessing Python code on a single core or on 100 cores; you can't really do much about it. However, on Windows you can set the affinity of a process, which gives the process access to certain cores only. So, when the processes start, you can go to each one and set its affinity to, say, core 1, or pin each one to a separate core. I'm not sure how you do that on Linux, though.
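For what it's worth, the psutil module wraps CPU affinity portably, and its cpu_affinity() call works on both Windows and Linux. A sketch (the busy-loop worker and core numbers are illustrative):

import multiprocessing
import time
import psutil

def worker():
    while True:
        pass                                         # busy loop, easy to spot in a CPU monitor

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker, daemon=True) for _ in range(2)]
    for core, p in enumerate(procs):
        p.start()
        psutil.Process(p.pid).cpu_affinity([core])   # pin each worker to one core
    time.sleep(30)                                   # observe core usage while the workers run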

In conclusion, if you want a short and direct answer: as many cores as it has access to.
