
KafkaProducer - Error connecting to Kafka (Failed to update metadata after 60.0 secs)

I am trying to read data from Oracle and send it to a Kafka topic. I was able to read from Oracle and put the result into a dataframe, and I set all the Kafka parameters as shown in my code below, but I am getting the error: kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

This link looks similar, but did not help me: KafkaTimeoutError: Failed to update metadata after 60.0 secs

I use Amazon Managed Streaming for Apache Kafka (MSK). I have two brokers. Do I need to put both as my bootstrap servers, or just the main bootstrap server?

It connects to Kafka and then disconnects, but does not send any messages.

Here is my code ...

    try:
        conn = OracleHook(oracle_conn_id=oracle_conn_id).get_conn()
        query = "SELECT * FROM sales"
        df = pd.read_sql(query, conn)

        topic = 'my-topic'
        # value_serializer JSON-encodes each value, so send() should be
        # given plain Python objects, not pre-encoded bytes
        producer = KafkaProducer(
            bootstrap_servers=['localhost:9092'],
            value_serializer=lambda x: dumps(x).encode('utf-8'),
            api_version=(0, 10, 1),
        )
        # iterate over the rows of the dataframe (iterating pd.read_sql
        # directly yields column names, not rows)
        for row in df.to_dict(orient='records'):
            producer.send(topic, row)

        producer.flush()  # block until all buffered records are sent
        print(f'Number of records: {len(df)}')

        conn.close()

    except Exception as error:
        raise error
    return

... and here is the log:

    {{conn.py:381}} INFO - <BrokerConnection node_id=bootstrap-0 host='my-bootstrap_servers'>: connecting to 'my-server'
    {{conn.py:410}} INFO - <BrokerConnection node_id=bootstrap-0 host='my-bootstrap_servers'>: Connection complete.
    {{conn.py:1096}} ERROR - <BrokerConnection node_id=bootstrap-0 host='my-bootstrap_servers'>: socket disconnected
    {{conn.py:919}} INFO - <BrokerConnection node_id=bootstrap-0 host='my-bootstrap_servers'>: Closing connection. KafkaConnectionError: socket disconnected

{{taskinstance.py:1703}} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
    result = execute_callable(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 151, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 162, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 576, in send
    self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 703, in _wait_on_metadata
    "Failed to update metadata after %.1f secs." % (max_wait,))
kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.
{{taskinstance.py:1280}} INFO - Marking task as FAILED. dag_id=bkbne_ora_to_kafka, task_id=task_id, execution_date=20220624T204102, start_date=20220628T171225, end_date=20220628T171327
{{standard_task_runner.py:91}} ERROR - Failed to execute job 95 for task task_id
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/task/task_runner/standard_task_runner.py", line 85, in _start_by_fork
    args.func(args, dag=self.dag)
  File "/usr/local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 292, in task_run
    _run_task_by_selected_method(args, dag, ti)
  File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 107, in _run_task_by_selected_method
    _run_raw_task(args, ti)
  File "/usr/local/lib/python3.7/site-packages/airflow/cli/commands/task_command.py", line 184, in _run_raw_task
    error_file=args.error_file,
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/session.py", line 70, in wrapper
    return func(*args, session=session, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
    result = execute_callable(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 151, in execute
    return_value = self.execute_callable()
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/python.py", line 162, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/send_to_kafka/src/send_to_kafka.py", line 63, in f_se
    raise e
  File "/usr/local/airflow/dags/send_to_kafka/src/send_to_kafka.py", line 55, in send_to_kafka
    producer.send(topic, row.encode('utf-8'))
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 576, in send
    self._wait_on_metadata(topic, self.config['max_block_ms'] / 1000.0)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/kafka/producer/kafka.py", line 703, in _wait_on_metadata
    "Failed to update metadata after %.1f secs." % (max_wait,))
kafka.errors.KafkaTimeoutError: KafkaTimeoutError: Failed to update metadata after 60.0 secs.

Could someone help me with this? I don't know what is happening here.

Ensure that you actually have connectivity to the upstream Kafka brokers (preferably every one of them) using something like ping, ncat, or the Kafka console tools. The fact that you can't get metadata (and see socket disconnects) points to network problems (bad config or a firewall?).
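As a quick, client-side version of that connectivity check, you can try opening a plain TCP connection to each broker address before involving Kafka at all. This is just a sketch; the broker address below is a placeholder, so substitute the bootstrap servers from your MSK console.

```python
import socket

def can_connect(hostport, timeout=5):
    """Return True if a plain TCP connection to 'host:port' succeeds."""
    host, port = hostport.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False

# Check every broker in the bootstrap list, not just the first one.
# This address is a placeholder - use the ones MSK gives you.
for broker in ["localhost:9092"]:
    print(broker, "reachable" if can_connect(broker, timeout=3) else "NOT reachable")
```

If any broker is unreachable here, no Kafka client setting will help until the network path (security groups, VPC routing, firewall) is fixed.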

Do I need put both as my Bootstrap servers or just the main Bootstrap servers?

Need? No.

However, the more servers you put into the bootstrap list, the more tolerant to failures your application is (at least in the Java client, which picks a random one to connect to first; the C (Python) client should behave the same, AFAICT).

Your code isn't running on the actual brokers, so bootstrap_servers=['localhost:9092'] should be changed to the address(es) that MSK provides you. You may also need to add authentication settings, depending on which port you use and how you have configured your cluster.
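A minimal sketch of what that configuration could look like, assuming the TLS listener (the broker hostnames are placeholders; copy the real bootstrap string from the MSK console under "View client information"):

```python
from json import dumps

producer_config = {
    # Placeholders - list all brokers from the MSK bootstrap string
    "bootstrap_servers": [
        "b-1.mycluster.kafka.us-east-1.amazonaws.com:9094",
        "b-2.mycluster.kafka.us-east-1.amazonaws.com:9094",
    ],
    # Match the protocol to the port: MSK's TLS listener is typically
    # on 9094, plaintext on 9092; SASL setups need extra settings.
    "security_protocol": "SSL",
    "value_serializer": lambda x: dumps(x).encode("utf-8"),
}
# producer = KafkaProducer(**producer_config)  # requires network access to MSK
```

The KafkaProducer construction is left commented out because it attempts to bootstrap immediately and will fail without connectivity to the cluster.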

Regarding the logic of your code, I'd suggest using MSK Connect with JDBC Source or Debezium to read a database table into Kafka.
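For illustration, a JDBC source connector for that table might be configured roughly like this (shown as a Python dict; the property names follow the Confluent JDBC source connector, and every connection value is a placeholder - verify against the docs for the connector version you actually deploy on MSK Connect):

```python
jdbc_source_config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1",  # placeholder
    "connection.user": "user",          # placeholder
    "connection.password": "password",  # placeholder
    "table.whitelist": "SALES",
    "mode": "incrementing",             # stream only newly inserted rows
    "incrementing.column.name": "ID",   # assumes a monotonically increasing key column
    "topic.prefix": "oracle-",          # rows land in the topic "oracle-SALES"
}
```

This moves the extract-and-produce loop out of your Airflow task entirely, so the DAG only has to manage the connector rather than hold a producer open.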
