
Go program slowing down when increasing number of goroutines

I'm doing a small project for my parallelism course and I have tried it with buffered channels, unbuffered channels, without channels using pointers to slices, etc. I have also tried to optimize it as much as possible (not the current state), but I still get the same result: increasing the number of goroutines (even by 1) slows down the whole program. Can someone please tell me what I'm doing wrong, and whether a parallel speedup is even possible in this situation?

Here is part of the code:

func main() {

    rand.Seed(time.Now().UnixMicro())

    numAgents := 2

    fmt.Println("Please pick a number of goroutines: ")
    fmt.Scanf("%d", &numAgents)

    numFiles := 4
    fmt.Println("How many files do you want?")
    fmt.Scanf("%d", &numFiles)
    start := time.Now()

    numAssist := numFiles
    channel := make(chan []File, numAgents)
    files := make([]File, 0)

    for i := 0; i < numAgents; i++ {
        if i == numAgents-1 {
            go generateFiles(numAssist, channel)
        } else {
            go generateFiles(numFiles/numAgents, channel)
            numAssist -= numFiles / numAgents
        }
    }

    for i := 0; i < numAgents; i++ {
        files = append(files, <-channel...)
    }

    elapsed := time.Since(start)
    fmt.Printf("Function took %s\n", elapsed)
}

func generateFiles(numFiles int, channel chan []File) {
    magicNumbersMap := getMap()
    files := make([]File, 0)

    for i := 0; i < numFiles; i++ {
        content := randElementFromMap(&magicNumbersMap)

        length := rand.Intn(400) + 100
        hexSlice := getHex()

        for j := 0; j < length; j++ {
            content = content + hexSlice[rand.Intn(len(hexSlice))]
        }

        hash := getSHA1Hash([]byte(content))

        file := File{
            content: content,
            hash:    hash,
        }

        files = append(files, file)
    }

    channel <- files

}

My expectation was that increasing the number of goroutines would make the program run faster, up to a certain number of goroutines, after which adding more would give the same execution time or slightly slower.

EDIT: All the functions that are used:

import (
    "crypto/sha1"
    "encoding/base64"
    "fmt"
    "math/rand"
    "time"
)

type File struct {
    content string
    hash    string
}

func getMap() map[string]string {
    return map[string]string{
        "D4C3B2A1": "Libcap file format",
        "EDABEEDB": "RedHat Package Manager (RPM) package",
        "4C5A4950": "lzip compressed file",
    }
}

func getHex() []string {
    return []string{
        "0", "1", "2", "3", "4", "5",
        "6", "7", "8", "9", "A", "B",
        "C", "D", "E", "F",
    }
}

func randElementFromMap(m *map[string]string) string {
    x := rand.Intn(len(*m))
    for k := range *m {
        if x == 0 {
            return k
        }
        x--
    }
    return "Error"
}

func getSHA1Hash(content []byte) string {
    h := sha1.New()
    h.Write(content)
    return base64.URLEncoding.EncodeToString(h.Sum(nil))
}

Simply speaking, the file-generation code is not complex enough to justify parallel execution. All the context switching and moving of data through the channel eats up the benefit of parallel processing.

If you add something like time.Sleep(time.Millisecond * 10) inside the loop in your generateFiles function, as if it were doing something more complex, you'll see what you expected to see: more goroutines work faster. But again, only up to a certain level, where the extra work of doing parallel processing outweighs the benefit.

Note also the execution time of the last bit of your program:

for i := 0; i < numAgents; i++ {
    files = append(files, <-channel...)
}

directly depends on the number of goroutines. Since all goroutines finish at approximately the same time, this loop almost never executes in parallel with your workers, and the time it takes to run is simply added to the total time.

Next, when you append to the files slice multiple times, it has to grow several times and copy the data over to a new location. You can avoid this by creating the slice with enough capacity to hold all the resulting elements up front (luckily, you know exactly how many you'll need).
