
Limit CPU time of process group

Is there a way to limit the absolute CPU time (in CPU seconds) spent in a process group?

ulimit -t 10; ./my-process looks like a good option, but if my-process forks, each process in the process group gets its own limit. The whole process group can use an arbitrary amount of time by forking every 9 seconds.

The accepted answer on a similar question is to use cgroups, but it doesn't explain how. However, other answers (Limit total CPU usage with cgroups) say that this is not possible with cgroups and that only relative CPU usage can be limited (for example, 0.2 seconds out of every 1 second).

Liran Funaro suggested using a long period for cpu.cfs_period_us ( https://stackoverflow.com/a/43660834/892961 ), but the parameter for the quota can be at most 1 second. So even with a long period I don't see how to set a CPU time limit of 10 seconds or an hour.

If ulimit and cgroups cannot do this, is there another way?

You can do it with cgroups. Do as root:

# Create cgroup
cgcreate -g cpu:/limited

# set shares (relative CPU weight under contention, not a hard limit)
cgset -r cpu.shares=256 limited

# run your program
cgexec -g cpu:limited /my/hungry/program
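
To make the effect of cpu.shares concrete: it is a weight, so the share a group receives depends on what its busy siblings hold. The sketch below assumes the cgroup v1 default of 1024 shares for a competing sibling group; share_percent is a hypothetical helper for illustration, not part of libcgroup.

```shell
#!/bin/bash
# Hypothetical helper: percentage of CPU a group gets under full contention.
# First argument: this group's shares. Remaining arguments: shares of the
# competing sibling groups.
share_percent() {
    local mine=$1 total=0 s
    # "$@" still includes this group's own shares as its first element.
    for s in "$@"; do
        total=$(( total + s ))
    done
    echo $(( mine * 100 / total ))
}

# 256 shares competing against one sibling with the assumed default of 1024.
share_percent 256 1024
```

When nothing else is competing for the CPU, the group can still use all of it, which is why shares alone cannot cap absolute CPU time.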

Alternatively, you can use the cpulimit program, which can freeze your code periodically. cgroups is the most advanced method, though.

To set a fixed CPU share:

cgcreate -g cpu:/fixedlimit
# allow a fixed 25% of one CPU
cgset -r cpu.cfs_quota_us=25000,cpu.cfs_period_us=100000 fixedlimit
cgexec -g cpu:fixedlimit /my/hungry/program
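
The relationship between the two parameters is a plain ratio: quota divided by period is the fraction of one CPU the group may use in each period. A minimal sketch of that arithmetic, with cpu_percent and quota_for_percent as hypothetical helper names:

```shell
#!/bin/bash
# Hypothetical helper: percentage of one CPU allowed by a quota/period pair
# (both in microseconds, as in cpu.cfs_quota_us and cpu.cfs_period_us).
cpu_percent() {
    local quota_us=$1 period_us=$2
    echo $(( quota_us * 100 / period_us ))
}

# Hypothetical helper: quota (in microseconds) for a desired percentage.
quota_for_percent() {
    local percent=$1 period_us=$2
    echo $(( percent * period_us / 100 ))
}

cpu_percent 25000 100000       # the values used in the cgset call above
quota_for_percent 50 100000    # quota for half of one CPU
```

Note that this is a rate: the budget is refilled every period, so it throttles the group rather than capping its total CPU seconds.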

It turned out that the goal is to limit the runtime to a certain number of seconds while measuring it. After setting the desired cgroup limits (to get a fair sandbox), you can achieve this by running:

((time -p timeout 20 cgexec -g cpu:fixedlimit /program/to/test ) 2>&1) | grep user

After 20 seconds the program is stopped no matter what, and we can parse the user time (or the system or real time) to evaluate its performance.
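
If the total CPU time is what matters, the user and sys lines can be summed instead of grepping for one of them. This is only a sketch: cpu_seconds is a hypothetical helper, and it assumes the tested program does not itself print lines starting with "user" or "sys" to the captured stream.

```shell
#!/bin/bash
# Hypothetical helper: read `time -p` output on stdin and print the sum of
# the "user" and "sys" lines as CPU seconds.
cpu_seconds() {
    # LC_ALL=C keeps the decimal point independent of the user's locale.
    LC_ALL=C awk '$1 == "user" || $1 == "sys" { total += $2 }
                  END { printf "%.2f\n", total }'
}

# Canned `time -p` output, standing in for the real pipeline.
printf 'real 20.01\nuser 4.37\nsys 0.52\n' | cpu_seconds
```

In the real command the helper would take the place of grep user at the end of the pipeline.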

This does not directly answer the question but refers to the discussion of the OP's actual need.

If your competition ignores everything except CPU time, it may be fundamentally flawed. One can simply, for example, cache results in the primary storage device. Since you do not count storage access time, such a program may use the fewest CPU cycles yet have the worst actual performance. A perfect crime would be to simply send the data via the Internet to another computer, which calculates the task and then returns the answer. This would finish the task with what appears to be zero cycles. You actually want to measure "real" time and give this process the highest priority in your system (or run it in isolation).

When checking students' homework, we simply used an unrealistically generous time limit (e.g., 5 minutes for what should be a 10-second program), killed the process if it had not finished in time, and failed the submission.
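
A rough sketch of that kind of check, assuming GNU timeout is available (it exits with status 124 when the limit is hit). run_with_limit is a hypothetical wrapper name, and sleep and true stand in for the submission:

```shell
#!/bin/bash
# Hypothetical wrapper: run a command under a wall-clock limit (in seconds)
# and report whether it finished in time.
run_with_limit() {
    local limit_s=$1 status=0
    shift
    timeout "$limit_s" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "FAIL: time limit of ${limit_s}s exceeded"
    else
        echo "finished with exit status $status"
    fi
    return "$status"
}

# A command that overruns its limit, then one that finishes in time.
run_with_limit 1 sleep 5 || true
run_with_limit 5 true
```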

If you want to pick a winner, then simply re-run the best competitors multiple times to ensure the validity of their results.

I found a solution that works for me. It is still far from perfect (read the caveats before using it). I'm somewhat new to bash scripting, so any comments are welcome.

#!/bin/bash
#
# This script tries to limit the CPU time of a process group similar to
# ulimit but counting the time spent in spawned processes against the
# limit. It works by creating a temporary cgroup to run the process in
# and checking on the used CPU time of that process group. Instead of
# polling in regular intervals, the monitoring process assumes that no
# time is lost to I/O (i.e., wall clock time = CPU time) and checks in
# after the time limit. It then updates its assumption by comparing the
# actual CPU usage to the time limit and waiting again. This is repeated
# until the CPU usage exceeds its limit or the monitored process
# terminates. Once the main process terminates, all remaining processes
# in the temporary cgroup are killed.
#
# NOTE: this script still has some major limitations.
# 1) The monitored process can exceed the limit by up to one second
#    since every iteration of the monitoring process takes at least that
#    long. It can exceed the limit by an additional second by ignoring
#    the SIGXCPU signal sent when hitting the (soft) limit but this is
#    configurable below.
# 2) It assumes there is only one CPU core. On a system with n cores
#    waiting for t seconds gives the process n*t seconds on the CPU.
#    This could be fixed by figuring out how many CPUs the process is
#    allowed to use (using the cpuset cgroup) and dividing the remaining
#    time by that. Since sleep has a resolution of 1 second, this would
#    still introduce an error of up to n seconds.


set -e

if [ "$#" -lt 2 ]; then
    echo "Usage: $(basename "$0") TIME_LIMIT_IN_S COMMAND [ ARG ... ]"
    exit 1
fi
TIME_LIMIT=$1
shift

# To simulate a hard time limit, set KILL_WAIT to 0. If KILL_WAIT is
# non-zero, TIME_LIMIT is the soft limit and TIME_LIMIT + KILL_WAIT is
# the hard limit.
KILL_WAIT=1

# Update as necessary. The script needs permissions to create cgroups
# in the cpuacct hierarchy in a subgroup "timelimit". To create it use:
#   sudo cgcreate -a $USER -t $USER -g cpuacct:timelimit
CGROUPS_ROOT=/sys/fs/cgroup
LOCAL_CPUACCT_GROUP=timelimit/timelimited_$$
LOCAL_CGROUP_TASKS=$CGROUPS_ROOT/cpuacct/$LOCAL_CPUACCT_GROUP/tasks

kill_monitored_cgroup() {
    SIGNAL=$1
    kill -$SIGNAL $(cat $LOCAL_CGROUP_TASKS) 2> /dev/null
}

get_cpu_usage() {
    cgget -nv -r cpuacct.usage $LOCAL_CPUACCT_GROUP
}

# Create a cgroup to measure the CPU time of the monitored process.
cgcreate -a $USER -t $USER -g cpuacct:$LOCAL_CPUACCT_GROUP


# Start the monitored process. In case it fails, we still have to clean
# up, so we disable exiting on errors.
set +e
(
    set -e
    # In case the process doesn't fork a ulimit is more exact. If the
    # process forks, the ulimit still applies to each child process.
    ulimit -t $(($TIME_LIMIT + $KILL_WAIT))
    ulimit -S -t $TIME_LIMIT
    # Quote "$@" so arguments containing spaces reach the command intact.
    cgexec -g cpuacct:$LOCAL_CPUACCT_GROUP --sticky "$@"
)&
MONITORED_PID=$!

# Start the monitoring process
(
    REMAINING_TIME=$TIME_LIMIT
    while [ "$REMAINING_TIME" -gt "0" ]; do
        # Wait $REMAINING_TIME seconds for the monitored process to
        # terminate. On a single CPU the CPU time cannot exceed the
        # wall clock time. It might be less, though. In that case, we
        # will go through the loop again.
        sleep $REMAINING_TIME
        CPU_USAGE=$(get_cpu_usage)
        REMAINING_TIME=$(($TIME_LIMIT - $CPU_USAGE / 1000000000))
    done

    # Time limit exceeded. Kill the monitored cgroup.
    if  [ "$KILL_WAIT" -gt "0" ]; then
        kill_monitored_cgroup XCPU
        sleep $KILL_WAIT
    fi
    kill_monitored_cgroup KILL
)&
MONITOR_PID=$!

# Wait for the monitored job to exit (either on its own or because it
# was killed by the monitor).
wait $MONITORED_PID
EXIT_CODE=$?

# Kill all remaining tasks in the monitored cgroup and the monitor.
kill_monitored_cgroup KILL
kill -KILL $MONITOR_PID 2> /dev/null
wait $MONITOR_PID 2>/dev/null

# Report actual CPU usage.
set -e
CPU_USAGE=$(get_cpu_usage)
echo "Total CPU usage: $(($CPU_USAGE / 1000000))ms"

# Clean up and exit with the return code of the monitored process.
cgdelete cpuacct:$LOCAL_CPUACCT_GROUP
exit $EXIT_CODE
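
The second caveat in the header (the single-core assumption) could be addressed roughly as follows. This is an untested sketch, not part of the script above: next_sleep is a hypothetical helper, and taking the core count from nproc is an assumption (the cpuset of the cgroup may allow fewer cores).

```shell
#!/bin/bash
# Hypothetical helper: whole seconds the monitor may sleep before checking
# again, given the limit (s), the CPU time used so far (ns, the unit of
# cpuacct.usage) and the number of cores the group can run on.
next_sleep() {
    local limit_s=$1 usage_ns=$2 ncpus=$3
    local remaining_ns=$(( limit_s * 1000000000 - usage_ns ))
    if [ "$remaining_ns" -le 0 ]; then
        echo 0    # budget used up: time to kill the group
        return
    fi
    # With n busy cores the budget drains n times faster than wall time.
    local per_core_ns=$(( (remaining_ns + ncpus - 1) / ncpus ))
    # Round up to whole seconds so the loop never spins on "sleep 0"
    # while budget remains.
    echo $(( (per_core_ns + 999999999) / 1000000000 ))
}

# Assumption: nproc reports the cores available to this process.
next_sleep 10 0 "$(nproc)"
```

In the monitoring loop this would replace the REMAINING_TIME computation. Because of the rounding to whole seconds, the group can still overshoot by up to n CPU seconds, as the caveat already says.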
