
Kafka with Python: How to send a topic to PostgreSQL?

I have been asked to use Kafka with Python. Specifically, I need to develop a very simple producer-consumer application that reads metrics from a device in real time and publishes them to a 'metrics' topic in Kafka. A consumer must then subscribe to the 'metrics' topic and store that data in a PostgreSQL database.

I tried to draw the architecture here:

           +-----------+        Fetch metrics every 1 second          +--------------+                                           
           |Biometric  |     {heartrate, oxygen level, temperature}    |              |
           |generation ------------------------------------------------  producer.py |                                           
           |device     |                                              |              |                                           
           +-----------+                                              +-------|------+                                           
                                                                              |                                                  
                                                                              |                                                  
                                                                              |                                                  
                                                                              |Publish metrics in "metrics" topic, every 1 second
                                                                              |{heartrate, oxygen level, temperature}
                                                                              |         JSON format                              
                                                                              |                                                  
                                                                              |                                                  
                                                                      +-------|------+                                           
                                                                      |              |                                           
                                                                      |    KAFKA     |                                           
                                                                      |              |                                           
                                                                      +-------|------+                                           
                                                                              |                                                  
                                                                              |                                                  
                                                                              |                                                  
                                                                              |                                                  
                                                                              | Subscribe to "metrics" topic and fetch           
                                                                              | the JSON every 1 second
                                                                              |                                                  
          +-------------+                                              +------|------+                                           
          |             |            Send data to postgreSQL           |             |                                           
          | postgreSQL  ------------------------------------------------ consumer.py |                                           
          |             |                                              |             |                                           
          +-------------+                                              +-------------+                                           

This is how I (with zero Kafka experience) have imagined the app. I have managed to get everything working up to the consumer.

At this point it would be very easy for me to connect to a PostgreSQL database and send the data to it. But I am confused. Everywhere I read that the connection to such a database must go through a Kafka Connector (?). Is it wrong to just send the data I receive in the consumer to Postgres manually? Why would I use a 'Kafka connector' here? Finally, I am not aware of any Python Kafka connectors, which complicates this even more for me.

Could someone help me clear things up?

If you want to push data to Kafka in JSON format, I recently wrote a simple example over here.
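As a rough illustration of the producer side of the diagram, a minimal sketch using the kafka-python package might look like the following. The broker address `localhost:9092`, the field names, and the `read_metrics` helper (a random-number stand-in for the real device) are assumptions made for the example, not part of the original setup.

```python
import json
import random
import time


def read_metrics():
    """Hypothetical stand-in for the device read; returns one sample."""
    return {
        "heartrate": random.randint(55, 120),
        "oxygen_level": random.randint(90, 100),
        "temperature": round(random.uniform(36.0, 38.5), 1),
    }


def serialize(sample):
    """Encode one sample as UTF-8 JSON bytes, the form sent to Kafka."""
    return json.dumps(sample).encode("utf-8")


def publish_metrics(bootstrap_servers="localhost:9092", topic="metrics", interval=1.0):
    """Publish one sample per interval until interrupted."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,  # assumed broker address
        value_serializer=serialize,
    )
    try:
        while True:
            producer.send(topic, read_metrics())
            time.sleep(interval)
    finally:
        # Make sure buffered messages are delivered before exiting.
        producer.flush()
        producer.close()
```

Calling `publish_metrics()` from `producer.py` would then emit one JSON message per second to the 'metrics' topic.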

You can also find the kafka python docs
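For comparison, the manual approach described in the question (a plain consumer that writes each message to PostgreSQL itself) could be sketched roughly as below. It assumes kafka-python and psycopg2 are installed, and that a table `metrics(heartrate, oxygen_level, temperature)` already exists; the table, column, and consumer group names are hypothetical.

```python
import json

# Hypothetical target table; adjust to the real schema.
INSERT_SQL = (
    "INSERT INTO metrics (heartrate, oxygen_level, temperature) "
    "VALUES (%s, %s, %s)"
)


def to_row(raw):
    """Decode one Kafka message value (JSON bytes) into INSERT parameters."""
    sample = json.loads(raw.decode("utf-8"))
    return (sample["heartrate"], sample["oxygen_level"], sample["temperature"])


def consume_to_postgres(dsn, bootstrap_servers="localhost:9092", topic="metrics"):
    """Read messages from the topic and insert each one into PostgreSQL."""
    # Imported lazily so to_row() can be used without these packages.
    from kafka import KafkaConsumer
    import psycopg2

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,  # assumed broker address
        group_id="metrics-to-postgres",       # hypothetical group name
        auto_offset_reset="earliest",
        enable_auto_commit=False,
    )
    conn = psycopg2.connect(dsn)
    try:
        for message in consumer:
            with conn.cursor() as cur:
                cur.execute(INSERT_SQL, to_row(message.value))
            conn.commit()
            # Commit the Kafka offset only after the row is stored, so a crash
            # re-delivers the message instead of losing it (at-least-once).
            consumer.commit()
    finally:
        conn.close()
        consumer.close()
```

This works, but retries, batching, and offset handling all become the application's responsibility, which is the part a pre-built connector is meant to take over.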

For the Kafka -> PostgreSQL connection, you might want to use the Kafka Connect JDBC sink. Kafka Connect is a set of pre-built connectors that let you push data into Kafka or pull data out of it (source or sink, in Kafka Connect terms) just by writing a config file, without having to write code or reinvent the wheel over and over again. Kafka Connect is NOT language dependent: all you need to do is deploy it in your Kafka environment and set up the config file correctly.
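To give an idea of what such a config looks like, here is a sketch that builds a JDBC sink configuration and submits it to the Kafka Connect REST API. It assumes the Confluent JDBC sink connector is installed on the Connect worker and that the worker listens on `http://localhost:8083`; the connector name, database URL, and credentials are placeholders. Python is used only to send the config, the connector itself runs inside Kafka Connect.

```python
import json
import urllib.request


def jdbc_sink_config(name="metrics-pg-sink"):
    """Build a connector definition; all values here are placeholders."""
    return {
        "name": name,
        "config": {
            # Assumes the Confluent JDBC connector; other JDBC sinks
            # use a different class name.
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "topics": "metrics",
            "connection.url": "jdbc:postgresql://localhost:5432/metricsdb",
            "connection.user": "postgres",
            "connection.password": "change-me",
            "insert.mode": "insert",
            "auto.create": "true",  # let the connector create the table
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "true",
        },
    }


def register_connector(config, connect_url="http://localhost:8083"):
    """POST the connector definition to the Kafka Connect REST API."""
    request = urllib.request.Request(
        connect_url + "/connectors",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Connect worker running, `register_connector(jdbc_sink_config())` would create the sink; the same JSON could equally be sent with any HTTP client.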

One thing to watch out for: if you're planning to use Kafka Connect to push data to PostgreSQL, you might need either

  • to create the source stream in AVRO format
  • to add the schema specification to your JSON message (more info here)
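For the second option, Kafka Connect's `JsonConverter` (with `schemas.enable` set to true) expects each message to be an envelope with a `schema` part and a `payload` part. A sketch of wrapping a metrics sample that way is shown below; the field names and types are assumptions based on the metrics listed in the question.

```python
import json

# Schema in the format used by Kafka Connect's JsonConverter.
# Field names and types are assumed for this example.
METRICS_SCHEMA = {
    "type": "struct",
    "name": "metrics",
    "optional": False,
    "fields": [
        {"field": "heartrate", "type": "int32", "optional": False},
        {"field": "oxygen_level", "type": "int32", "optional": False},
        {"field": "temperature", "type": "double", "optional": False},
    ],
}


def with_schema(sample):
    """Wrap a plain metrics dict in the schema/payload envelope."""
    return {"schema": METRICS_SCHEMA, "payload": sample}


def serialize_with_schema(sample):
    """Encode the envelope as UTF-8 JSON bytes, ready to send to Kafka."""
    return json.dumps(with_schema(sample)).encode("utf-8")
```

The producer would then send `serialize_with_schema(sample)` instead of the bare JSON, at the cost of repeating the schema in every message; AVRO with a schema registry avoids that overhead.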
