
golang syscall, locked to thread

I am attempting to create a program to scrape XML files. I'm experimenting with Go because of its goroutines. I have several thousand files, so some type of multiprocessing is almost a necessity...

I got a program to run successfully and convert XML to CSV (as a test, not quite the end result) on a test set of files, but when run with the full set of files, it gives this:

runtime: program exceeds 10000-thread limit

I've been looking for similar problems, and there are a couple, but I haven't found one that was similar enough to solve this.

And finally, here's some of the code I'm running:

// main func (start threads)

for i := range files {
  channels = append(channels, make(chan Test))
  go Parse(files[i], channels[len(channels)-1])
}

// Parse func (individual threads)

func Parse(fileName string, c chan Test) {
    defer close(c)

    doc := etree.NewDocument()
    if err := doc.ReadFromFile(fileName); err != nil {
        return
    }

    root := doc.SelectElement("trc:TestResultsCollection")

    for _, test := range root.FindElements("//trc:TestResults/tr:ResultSet/tr:TestGroup/tr:Test") {
        var outcome Test
        outcome.StepType = test.FindElement("./tr:Extension/ts:TSStepProperties/ts:StepType").Text()
        outcome.Result = test.FindElement("./tr:Outcome").Attr[0].Value
        for _, attr := range test.Attr {
            if attr.Key == "name" {
                outcome.Name = attr.Value
            }
        }

        for _, attr := range test.FindElement("./tr:TestResult/tr:TestData/c:Datum").Attr {
            if attr.Key == "value" {
                outcome.Value = attr.Value
            }
        }

        c <- outcome
    }
}

// main (process results when threads return)

for c := 0; c < len(channels); c++ {
    for i := range channels[c] {
        // csv processing with i
    }
}

I'm sure there's some ugly code in there. I've just picked up Go recently from other languages, so I apologize in advance.

Anyhow, any ideas?

I apologize for not including the correct error. As the comments pointed out, I was doing something dumb and creating a goroutine for every file. Thanks to JimB for correcting me, and to torek for providing a solution and this link: https://gobyexample.com/worker-pools

jobs := make(chan string, numJobs)
results := make(chan []Test, numJobs)

for w := 0; w < numWorkers; w++ {
    wg.Add(1) // register the worker before starting it, so wg.Wait cannot return early
    go Worker(w, jobs, results)
}

// give workers jobs

for _, i := range files {
    if filepath.Ext(i) == ".xml" {
        jobs <- ("Path to files" + i)
    }
}

close(jobs)
wg.Wait()

//result processing <- results
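
For anyone landing here later, the snippet above leaves out Worker and the result handling, so here is a rough, self-contained sketch of how the whole worker-pool pattern might fit together. This is not the exact code I ended up with: parseFile is a hypothetical stand-in for the etree-based Parse from the question, processAll is a name made up for this example, and the Test fields are just the ones mentioned above. Unlike the snippet, it uses unbuffered channels and collects results while the workers are still running, so it does not depend on the results buffer being large enough.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"sync"
)

// Test holds one parsed record; only the fields mentioned in the question.
type Test struct {
	Name, StepType, Result, Value string
}

// parseFile is a hypothetical placeholder for the real XML parsing.
// Here it just returns one record named after the file.
func parseFile(path string) []Test {
	return []Test{{Name: filepath.Base(path)}}
}

// worker pulls file paths from jobs until the channel is closed.
func worker(jobs <-chan string, results chan<- []Test, wg *sync.WaitGroup) {
	defer wg.Done()
	for path := range jobs {
		results <- parseFile(path)
	}
}

// processAll parses every .xml file using a fixed number of goroutines.
func processAll(files []string, numWorkers int) []Test {
	jobs := make(chan string)
	results := make(chan []Test)
	var wg sync.WaitGroup

	for w := 0; w < numWorkers; w++ {
		wg.Add(1) // must happen before the goroutine starts
		go worker(jobs, results, &wg)
	}

	// Feed jobs from a separate goroutine so the caller can drain results.
	go func() {
		for _, f := range files {
			if filepath.Ext(f) == ".xml" {
				jobs <- f
			}
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var all []Test
	for batch := range results {
		all = append(all, batch...)
	}
	// Workers finish in arbitrary order; sort for a stable result.
	sort.Slice(all, func(i, j int) bool { return all[i].Name < all[j].Name })
	return all
}

func main() {
	out := processAll([]string{"a.xml", "notes.txt", "b.xml"}, 2)
	fmt.Println(len(out), "results")
}
```

The number of goroutines stays at numWorkers plus two helpers no matter how many files there are, which is what avoids spawning one goroutine per file.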
