How to convert JSON Data to Avro format using Python

I would like to convert the JSON data below to Avro format. I used the code snippet below to write the JSON data in Avro format, but received an error. If anyone can help with this, it would be really great.

from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    data = response.text
    return data
 
def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)



turntoavro()

    Error details:
    
      File "fastavro/_write.pyx", line 269, in fastavro._write.write_record
    TypeError: Expected dict, got str
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "datalake.py", line 30, in <module>
        turntoavro()
      File "datalake.py", line 26, in turntoavro
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)
      File "fastavro/_write.pyx", line 652, in fastavro._write.writer
      File "fastavro/_write.pyx", line 605, in fastavro._write.Writer.write
      File "fastavro/_write.pyx", line 341, in fastavro._write.write_data
      File "fastavro/_write.pyx", line 278, in fastavro._write.write_record
    AttributeError: 'str' object has no attribute 'get'

Sample Data:

    {
      "lat": 33.44,
      "lon": -94.04,
      "timezone": "America/Chicago",
      "timezone_offset": -18000
    }

To retrieve the response to your request, you used response.text, which returns the body as a plain string rather than parsed JSON. Use response.json() instead, which parses the body into Python objects (here, a dict):

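import requests
from fastavro import writer, schema
from rec_avro import to_rec_avro_destructive, rec_avro_schema

# apikey must already be defined (your OpenWeatherMap API key)
def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # .json() parses the body; .text would return it as a str
    data = response.json()
    return data

def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()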
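To see the difference between the two return types without calling the API, here is a small offline sketch. It uses the standard-library json.loads as a stand-in for response.json(), and `body` is a hypothetical variable holding the sample data as text; both are assumptions made only so the example runs without a network request.

```python
import json

# Stand-in for response.text: the raw body is just a string.
body = '{"lat": 33.44, "lon": -94.04, "timezone": "America/Chicago", "timezone_offset": -18000}'

# Iterating over a string yields single characters,
# each one a str rather than a record dict.
first_items = list(body)[:3]

# Stand-in for response.json(): parsing the text gives a dict.
parsed = json.loads(body)

print(type(body).__name__, first_items)
print(type(parsed).__name__, parsed["timezone"])
```

This matches the "Expected dict, got str" message in the traceback: looping over the text produces strings, not records.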

As mentioned in one of the answers, you probably want to use response.json() rather than response.text so that you get back an actual Python dictionary.

However, the other problem is that getweatherdata() returns a single dictionary, so when you do avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata()) you are iterating over the keys of that dictionary. Instead, wrap the single record in a list: avro_objects = [to_rec_avro_destructive(getweatherdata())]
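The key-iteration behaviour is easy to check in isolation. In this minimal sketch, `record` is a hypothetical stand-in for what getweatherdata() returns, built from the sample data in the question:

```python
record = {
    "lat": 33.44,
    "lon": -94.04,
    "timezone": "America/Chicago",
    "timezone_offset": -18000,
}

# Iterating over a dict visits its keys, not whole records.
items_from_loop = [rec for rec in record]

# Wrapping the dict in a list gives an iterable with exactly one record.
items_from_list = [rec for rec in [record]]

print(items_from_loop)
print(len(items_from_list), type(items_from_list[0]).__name__)
```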

I believe this code should work for you:

import requests
from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

# apikey must already be defined (your OpenWeatherMap API key)
def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # Parse the body into a dict instead of returning raw text
    data = response.json()
    return data

def turntoavro():
    # The API returns one dict, so wrap it in a list to write a single record
    avro_objects = [to_rec_avro_destructive(getweatherdata())]
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()


 