
Golang: why is runtime.GOMAXPROCS limited to 256?

I was playing with Go 1.7.3 on a MacBook and on Ubuntu and found that runtime.GOMAXPROCS is limited to 256. Does anyone know where this limit comes from? Is it documented anywhere, and why would there be a limit? Is this an implementation optimization?

The only reference to 256 I could find is on the page that describes Go's runtime package: https://golang.org/pkg/runtime/ . The runtime.MemStats struct has a couple of stat arrays of size 256:

type MemStats struct {
    ...
    PauseNs       [256]uint64 // circular buffer of recent GC pause durations, most recent at [(NumGC+255)%256]
    PauseEnd      [256]uint64 // circular buffer of recent GC pause end times
    ...
}

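Here's the example Go code I used:

package main

import "log"
import "runtime"

func main() {
    runtime.GOMAXPROCS(1000) // ask for far more than the cap
    log.Printf("GOMAXPROCS %d\n", runtime.GOMAXPROCS(-1)) // an argument below 1 only queries the current value
}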

Prints GOMAXPROCS 256

PS: Also, can someone point me to documentation on how GOMAXPROCS relates to the number of OS threads used by the Go scheduler (if at all)? Should we expect Go-compiled code to run exactly GOMAXPROCS OS threads?

EDIT: Thanks @twotwotwo for pointing out how GOMAXPROCS relates to OS threads. Still, it's interesting that the documentation does not mention this 256 limit (other than in the MemStats struct, which may or may not be related).

I wonder if anyone is aware of the true reason for this 256 number.

The package runtime docs clarify how GOMAXPROCS relates to OS threads:

The GOMAXPROCS variable limits the number of operating system threads that can execute user-level Go code simultaneously. There is no limit to the number of threads that can be blocked in system calls on behalf of Go code; those do not count against the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes the limit.

So you could see more than GOMAXPROCS OS threads (because some are blocked in system calls, and there's no limit to how many), or fewer (because GOMAXPROCS is only documented to limit the number of threads, not prescribe it exactly).

I think capping GOMAXPROCS is consistent with the spirit of that documentation--you specified you were OK with 1000 OS threads running Go code, but the runtime decided to 'only' run 256. That doesn't limit the number of goroutines active because they're multiplexed onto OS threads--when one goroutine blocks (waiting for a network read to complete, say) Go's internal scheduler starts other work on the same OS thread.
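To illustrate the multiplexing point, here is a small sketch (runAll is a made-up helper, not anything from the runtime) that allows only two threads to run Go code at once and still runs ten thousand goroutines to completion:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runAll lets at most procs threads execute Go code simultaneously, starts n
// goroutines, waits for them, and returns how many finished. (Hypothetical helper.)
func runAll(procs, n int) int {
	prev := runtime.GOMAXPROCS(procs)
	defer runtime.GOMAXPROCS(prev) // restore the previous setting

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			done++ // guarded by mu, so the count is exact
			mu.Unlock()
		}()
	}
	wg.Wait()
	return done
}

func main() {
	fmt.Println(runAll(2, 10000)) // prints 10000
}
```

The number of goroutines is bounded by memory, not by GOMAXPROCS; the setting only bounds how many of them can be executing Go code at the same instant.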

The Go team might have made this choice to minimize the chance that Go programs end up running many times more OS threads than most machines today have cores; that would cause more OS context switches, which can be slower than the user-mode goroutine switches that would occur if GOMAXPROCS were kept down to the number of CPU cores present. Or it might just have been convenient for the design of Go's internal scheduler to have an upper bound on GOMAXPROCS.

"Goroutines vs Threads" is not perfect (e.g. goroutines no longer use segmented stacks), but it may help you understand what's going on under the hood.

Note that, starting with Go 1.10 (Q1 2018), GOMAXPROCS is limited by ... nothing.

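The runtime no longer artificially limits GOMAXPROCS (previously it was limited to 1024).

(The 256 reported in the question is what Go 1.7.3 enforced; the 1024 quoted here is the cap in the release just before Go 1.10.)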

See commit ee55000 by Austin Clements (aclements), which fixes issue 15131.

Now that allp is dynamically allocated, there's no need for a hard cap on GOMAXPROCS.
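A quick way to check which behaviour your toolchain has is to repeat the question's experiment and print the Go version next to the result. This is only a sketch; setAndQuery is a helper name I made up.

```go
package main

import (
	"fmt"
	"runtime"
)

// setAndQuery requests n Ps, reads back the value the runtime actually
// accepted, and then restores the previous setting. (Hypothetical helper.)
func setAndQuery(n int) int {
	prev := runtime.GOMAXPROCS(n)
	got := runtime.GOMAXPROCS(0) // an argument below 1 only queries
	runtime.GOMAXPROCS(prev)
	return got
}

func main() {
	fmt.Println(runtime.Version(), setAndQuery(1000))
}
```

On Go 1.10 or later I would expect the second value to be 1000; on the older releases discussed above it would be clamped to the cap of that release.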


allp is defined here.

See also commit e900e27:

runtime: clean up loops over allp

allp now has length gomaxprocs, which means none of allp[i] are nil or in state _Pdead.
This lets replace several different styles of loops over allp with normal range loops.

for i := 0; i < gomaxprocs; i++ { ... } loops can simply range over allp.
Likewise, range loops over allp[:gomaxprocs] can just range over allp.

Loops that check for p == nil || p.state == _Pdead don't need to check this any more.

Loops that check for p == nil don't have to check this if dead Ps don't affect them. I checked that all such loops are, in fact, unaffected by dead Ps. One loop was potentially affected, which this fixes by zeroing p.gcAssistTime in procresize .
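To make the commit message a bit more concrete, here is a toy model of the idea. It is NOT the runtime's code: the p type, resize and sumIDs are all hypothetical stand-ins. The only point is that once the slice has exactly as many entries as there are processors, a plain range loop needs no nil checks.

```go
package main

import "fmt"

// p is a hypothetical stand-in for the runtime's per-processor structure.
type p struct{ id int }

// resize returns a slice with exactly n non-nil entries, loosely mimicking a
// dynamically sized allp. (Toy code, not the runtime's procresize.)
func resize(allp []*p, n int) []*p {
	if n <= len(allp) {
		return allp[:n] // shrink: drop the trailing entries
	}
	for i := len(allp); i < n; i++ {
		allp = append(allp, &p{id: i}) // grow: every new slot is populated
	}
	return allp
}

// sumIDs ranges over the whole slice without checking for nil entries,
// which is safe because resize never leaves a nil slot.
func sumIDs(allp []*p) int {
	sum := 0
	for _, pp := range allp {
		sum += pp.id
	}
	return sum
}

func main() {
	allp := resize(nil, 4)
	fmt.Println(len(allp), sumIDs(allp)) // prints 4 6
	allp = resize(allp, 2)
	fmt.Println(len(allp), sumIDs(allp)) // prints 2 1
}
```

With a fixed-size array instead, the unused tail would hold nil entries, and every loop would need either an explicit bound or a nil check, which is what the commit removes.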
