
go channels with kafka consumer

I'm new to Go and starting to learn about channels. I'm using the confluent kafka consumer to create a functional consumer. What I want to accomplish is to send the messages into a buffered channel (2,000)... and then write the messages in the channel to redis using a pipeline. I've gotten the consumer part to work by just doing a println of the messages one by one until it reaches the end of the offsets, but when I try to add a channel, it seems to hit the default: case in the switch and then just freeze.

It also doesn't look like I'm filling the channel correctly? This fmt.Println("count is: ", len(redisChnl)) always prints 0.

Here is what I have so far:

// Example function-based high-level Apache Kafka consumer
package main

import (
    "fmt"
    "github.com/confluentinc/confluent-kafka-go/kafka"
    "os"
    "os/signal"
    "syscall"
    "time"
    "encoding/json"
    "regexp"
    "github.com/go-redis/redis"
    "encoding/binary"
)

var client *redis.Client

func init() {
    client = redis.NewClient(&redis.Options{
        Addr:         ":6379",
        DialTimeout:  10 * time.Second,
        ReadTimeout:  30 * time.Second,
        WriteTimeout: 30 * time.Second,
        PoolSize:     10,
        PoolTimeout:  30 * time.Second,
    })
    client.FlushDB()
}

type MessageFormat struct {
    MetricValueNumber float64     `json:"metric_value_number"`
    Path              string      `json:"path"`
    Cluster           string      `json:"cluster"`
    Timestamp         time.Time   `json:"@timestamp"`
    Version           string      `json:"@version"`
    Host              string      `json:"host"`
    MetricPath        string      `json:"metric_path"`
    Type              string      `json:"string"`
    Region            string      `json:"region"`
}

//func redis_pipeline(ky string, vl string) {
//  pipe := client.Pipeline()
//
//  exec := pipe.Set(ky, vl, time.Hour)
//
//  incr := pipe.Incr("pipeline_counter")
//  pipe.Expire("pipeline_counter", time.Hour)
//
//  // Execute
//  //
//  //     INCR pipeline_counter
//  //     EXPIRE pipeline_counts 3600
//  //
//  // using one client-server roundtrip.
//  _, err := pipe.Exec()
//  fmt.Println(incr.Val(), err)
//  // Output: 1 <nil>
//}

func main() {


    sigchan := make(chan os.Signal, 1)
    signal.Notify(sigchan, syscall.SIGINT, syscall.SIGTERM)

    c, err := kafka.NewConsumer(&kafka.ConfigMap{
        "bootstrap.servers":               "kafka.com:9093",
        "group.id":                        "testehb",
        "security.protocol":               "ssl",
        "ssl.key.location":                "/Users/key.key",
        "ssl.certificate.location":        "/Users/cert.cert",
        "ssl.ca.location":                 "/Users/ca.pem",
    })

    if err != nil {
        fmt.Fprintf(os.Stderr, "Failed to create consumer: %s\n", err)
        os.Exit(1)
    }

    fmt.Printf("Created Consumer %v\n", c)

    err = c.SubscribeTopics([]string{"jmx"}, nil)

    redisMap := make(map[string]string)

    redisChnl := make(chan []byte, 2000)

    run := true

    for run == true {
        select {
        case sig := <-sigchan:
            fmt.Printf("Caught signal %v: terminating\n", sig)
            run = false
        default:
            ev := c.Poll(100)
            if ev == nil {
                continue
            }

            switch e := ev.(type) {
            case *kafka.Message:

                //fmt.Printf("%% Message on %s:\n%s\n",
                //  e.TopicPartition, string(e.Value))
                if e.Headers != nil {
                    fmt.Printf("%% Headers: %v\n", e.Headers)
                }

                str := e.Value
                res := MessageFormat{}
                json.Unmarshal([]byte(str), &res)


                fmt.Println("size", binary.Size([]byte(str)))

                host:= regexp.MustCompile(`^([^.]+)`).FindString(res.MetricPath)

                redisMap[host] = string(str)
                fmt.Println("count is: ", len(redisChnl)) //this always prints "count is:  0"

                redisChnl <- e.Value //I think this is the right way to put the messages in the channel?

            case kafka.PartitionEOF:
                fmt.Printf("%% Reached %v\n", e)
            case kafka.Error:
                fmt.Fprintf(os.Stderr, "%% Error: %v\n", e)
                run = false
            default:
                fmt.Printf("Ignored %v\n", e)
            }

            <- redisChnl // I thought I could just empty the channel like this once the buffer is full?


        }
    }

    fmt.Printf("Closing consumer\n")
    c.Close()
}

-------EDIT-------

Ok, I think I got it to work by moving the <- redisChnl inside default, but now I see that count before read and count after read inside the default always print 2,000... I would have thought the first count before read = 2,000 and then count after read = 0, since the channel would be empty by then??

    select {
    case sig := <-sigchan:
        fmt.Printf("Caught signal %v: terminating\n", sig)
        run = false
    default:
        ev := c.Poll(100)
        if ev == nil {
            continue
        }

        switch e := ev.(type) {
        case *kafka.Message:

            //fmt.Printf("%% Message on %s:\n%s\n",
            //  e.TopicPartition, string(e.Value))
            if e.Headers != nil {
                fmt.Printf("%% Headers: %v\n", e.Headers)
            }

            str := e.Value
            res := MessageFormat{}
            json.Unmarshal([]byte(str), &res)


            //fmt.Println("size", binary.Size([]byte(str)))

            host:= regexp.MustCompile(`^([^.]+)`).FindString(res.MetricPath)

            redisMap[host] = string(str)

            go func() {
                redisChnl <- e.Value
            }()


        case kafka.PartitionEOF:
            fmt.Printf("%% Reached %v\n", e)
        case kafka.Error:
            fmt.Fprintf(os.Stderr, "%% Error: %v\n", e)
            run = false
        default:
            fmt.Println("count before read: ", len(redisChnl))

            fmt.Printf("Ignored %v\n", e)

            <-redisChnl

            fmt.Println("count after read: ", len(redisChnl)) //would've expected this to be 0

        }


    }

I think the biggest way to simplify this code is to separate the pipeline into multiple goroutines.

The advantage of channels is that multiple goroutines can be writing and reading on them at the same time. In this example, that might mean having a single goroutine enqueueing onto the channel and another pulling off of the channel and putting things into redis.

Something like this:

c := make(chan Message, bufferLen)
go pollKafka(c)
go pushToRedis(c)
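
A minimal sketch of those two goroutines, reusing the package-level redis client, the MessageFormat struct, and the host regexp from your question (pollKafka and pushToRedis are illustrative names, not library functions):

var hostRE = regexp.MustCompile(`^([^.]+)`) // hoisted so it compiles once, not per message

// pollKafka is the only writer: it drains the consumer and enqueues raw
// message values. The send blocks only when the 2,000-slot buffer is full,
// which gives you backpressure for free.
func pollKafka(c *kafka.Consumer, out chan<- []byte) {
    for {
        ev := c.Poll(100)
        if msg, ok := ev.(*kafka.Message); ok { // skips nil events, PartitionEOF, errors
            out <- msg.Value
        }
    }
}

// pushToRedis is the only reader: it parses each value and writes it to redis.
func pushToRedis(in <-chan []byte) {
    for raw := range in {
        res := MessageFormat{}
        json.Unmarshal(raw, &res)
        host := hostRE.FindString(res.MetricPath)
        client.Set(host, string(raw), time.Hour)
    }
}

Wired into main, go pollKafka(c, redisChnl) and go pushToRedis(redisChnl) replace the single select loop, so a send and a receive never have to happen in the same loop iteration.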

If you want to add batching, you could add a middle stage that polls from the kafka channel, appends to a slice until the slice is full, and then enqueues that slice onto the channel for redis.
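
A hedged sketch of that middle stage, with an assumed batchSize and made-up function names; each full batch then reaches redis in a single pipeline round trip:

// batchMessages drains the kafka channel into a slice and hands off full
// batches; a real version would also flush on a timer so a quiet topic
// doesn't strand a partial batch.
func batchMessages(in <-chan []byte, out chan<- [][]byte, batchSize int) {
    batch := make([][]byte, 0, batchSize)
    for msg := range in {
        batch = append(batch, msg)
        if len(batch) == batchSize {
            out <- batch
            batch = make([][]byte, 0, batchSize) // fresh slice; the old one is owned downstream now
        }
    }
    if len(batch) > 0 {
        out <- batch // flush the remainder if the input channel ever closes
    }
    close(out)
}

// pushBatchesToRedis writes each batch with one Pipeline, so 2,000 SETs cost
// one client-server round trip instead of 2,000. hostRE is from the sketch above.
func pushBatchesToRedis(in <-chan [][]byte) {
    for batch := range in {
        pipe := client.Pipeline()
        for _, raw := range batch {
            res := MessageFormat{}
            json.Unmarshal(raw, &res)
            pipe.Set(hostRE.FindString(res.MetricPath), string(raw), time.Hour)
        }
        if _, err := pipe.Exec(); err != nil {
            fmt.Fprintf(os.Stderr, "pipeline exec failed: %v\n", err)
        }
    }
}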

If concurrency like this isn't a goal, it might just be easier to replace the channel in your code with a slice. If there is only ever 1 goroutine acting on an object, it's not a good idea to try and use a channel.
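
For completeness, a sketch of that single-goroutine slice version inside your existing poll loop; flushBatch is a hypothetical helper running the same pipeline logic as pushBatchesToRedis above:

batch := make([][]byte, 0, 2000)
for run {
    ev := c.Poll(100)
    msg, ok := ev.(*kafka.Message)
    if !ok {
        continue // nil events, PartitionEOF, and errors would be handled as before
    }
    batch = append(batch, msg.Value)
    if len(batch) == cap(batch) {
        flushBatch(batch) // one pipeline round trip for the whole batch
        batch = batch[:0] // safe to reuse: flushBatch finished before returning
    }
}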
