
sync.Map or channels when using goroutines

I'm writing a program that parses a lot of files looking for "interesting" lines. It then checks whether each of these lines has been seen before. Each file is parsed in a separate goroutine. I'm wondering which approach is better:

  1. Use sync.Map or something similar.
  2. Use channels and a separate goroutine responsible only for the uniqueness check (probably using a standard map). It would receive a request and respond with something simple like "Not unique" or "Unique (and added)".

Is either of these solutions more popular, or are both wrong?
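For illustration, the second option (a dedicated goroutine that owns a plain map and answers requests over channels) might look roughly like the sketch below. The names checkRequest, uniquenessChecker, isNew and countUnique are hypothetical and do not come from the question, and the reply is reduced to a bool instead of the "Not unique" / "Unique (and added)" strings.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// checkRequest (hypothetical name) asks the owner goroutine about one line.
// reply receives true when the line was new and has now been recorded.
type checkRequest struct {
	line  string
	reply chan bool
}

// uniquenessChecker is the only goroutine that touches the map,
// so the map itself needs no mutex.
func uniquenessChecker(requests <-chan checkRequest) {
	seen := make(map[string]struct{})
	for req := range requests {
		_, found := seen[req.line]
		if !found {
			seen[req.line] = struct{}{}
		}
		req.reply <- !found
	}
}

// isNew sends one request and waits for the answer.
func isNew(requests chan<- checkRequest, line string) bool {
	reply := make(chan bool, 1) // buffered so the checker never blocks on the reply
	requests <- checkRequest{line: line, reply: reply}
	return <-reply
}

// countUnique starts one goroutine per "file" (here just a slice of lines)
// and returns how many lines were reported as new.
func countUnique(files [][]string) int64 {
	requests := make(chan checkRequest)
	go uniquenessChecker(requests)

	var wg sync.WaitGroup
	var unique int64
	for _, lines := range files {
		wg.Add(1)
		go func(lines []string) {
			defer wg.Done()
			for _, line := range lines {
				if isNew(requests, line) {
					atomic.AddInt64(&unique, 1)
				}
			}
		}(lines)
	}
	wg.Wait()
	close(requests) // lets uniquenessChecker return
	return atomic.LoadInt64(&unique)
}

func main() {
	files := [][]string{
		{"alpha", "beta", "alpha"},
		{"beta", "gamma"},
	}
	fmt.Println(countUnique(files)) // prints 3
}
```

In this sketch every check is a round trip through the checker goroutine, so all lookups are serialized. That keeps the map access simple, but it may become a bottleneck if the workers produce lines faster than one goroutine can answer.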

If you would like workers that access a global map for the uniqueness check, you can protect the map with a sync.RWMutex, like:

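import "sync"

var (
  mutex       sync.RWMutex
  alreadySeen = make(map[string]struct{})
)

// Work checks each line against the shared map.
// The lines parameter is an assumption; the original left the source of each line unspecified.
func Work(lines []string) {
  for _, line := range lines {
    // Process the line here...

    // Fast path: look the line up under a read lock.
    mutex.RLock()
    _, found := alreadySeen[line]
    mutex.RUnlock()
    if found {
      continue
    }

    // Slow path: take the write lock and check again, because another
    // goroutine may have added the line after the read lock was released.
    mutex.Lock()
    if _, found := alreadySeen[line]; !found {
      alreadySeen[line] = struct{}{}
      // The line is new: handle it here.
    }
    mutex.Unlock()
  }
}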

Another approach is to use a concurrency-safe map and skip the mutex handling entirely, for example this package: https://github.com/cornelk/hashmap
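As a minimal sketch of the concurrency-safe map idea, the example below uses the standard library's sync.Map (the first option from the question) rather than the linked package, whose API is not shown here. The type seenSet and the method markSeen are hypothetical names. LoadOrStore performs the check and the insert as one atomic step, so no separate lock is needed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// seenSet (hypothetical name) wraps a sync.Map used as a set of lines.
type seenSet struct {
	m sync.Map
}

// markSeen records line and reports whether this call was the first to see it.
func (s *seenSet) markSeen(line string) bool {
	_, loaded := s.m.LoadOrStore(line, struct{}{})
	return !loaded
}

func main() {
	var s seenSet
	var wg sync.WaitGroup
	var firsts int64

	// Four goroutines race to register the same line.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.markSeen("same line") {
				atomic.AddInt64(&firsts, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(firsts) // prints 1: exactly one goroutine sees the line as new
}
```

The sync package documentation describes sync.Map as optimized for keys that are written once and read many times, or for goroutines working on disjoint key sets. A "seen before" set is close to the first case, but measuring against the RWMutex version is the safer way to decide.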
