I would like to convert the JSON data below to Avro format. I used the following code snippet to write the JSON data in Avro format, but it raised an error. Any help would be greatly appreciated.
from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    data = response.text
    return data

def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()
Error details:

  File "fastavro/_write.pyx", line 269, in fastavro._write.write_record
TypeError: Expected dict, got str

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "datalake.py", line 30, in <module>
    turntoavro()
  File "datalake.py", line 26, in turntoavro
    writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)
  File "fastavro/_write.pyx", line 652, in fastavro._write.writer
  File "fastavro/_write.pyx", line 605, in fastavro._write.Writer.write
  File "fastavro/_write.pyx", line 341, in fastavro._write.write_data
  File "fastavro/_write.pyx", line 278, in fastavro._write.write_record
AttributeError: 'str' object has no attribute 'get'
Sample Data:

{
    "lat": 33.44,
    "lon": -94.04,
    "timezone": "America/Chicago",
    "timezone_offset": -18000
}
You read the response with response.text, which returns the body as a plain string rather than as parsed JSON. Use response.json() instead, which parses the body into Python objects (here, a dictionary):
import requests

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # response.json() parses the body; response.text would return a raw string
    data = response.json()
    return data

def turntoavro():
    avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata())
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()
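The difference between the two attributes can be illustrated without a network call. The snippet below is only a sketch: raw_text is a hypothetical stand-in for response.text, built from the sample data in the question, and json.loads is used on the assumption that it is roughly what response.json() does with the body.

```python
import json

# Hypothetical stand-in for response.text, copied from the question's sample data.
raw_text = (
    '{"lat": 33.44, "lon": -94.04, '
    '"timezone": "America/Chicago", "timezone_offset": -18000}'
)

# The raw body is a plain string.
print(type(raw_text).__name__)  # str

# Parsing the string yields a dictionary, as response.json() would.
parsed = json.loads(raw_text)
print(type(parsed).__name__)  # dict
print(parsed["timezone"])  # America/Chicago

# Iterating over the raw string yields single characters, each of them a str,
# which is consistent with the "Expected dict, got str" error in the question.
first_items = [rec for rec in raw_text][:3]
print(first_items)  # ['{', '"', 'l']
```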
As mentioned in another answer, you probably want to use response.json() rather than response.text, so that you get back a Python dictionary instead of a string.

However, there is a second problem: getweatherdata() returns a single dictionary, so when you do avro_objects = (to_rec_avro_destructive(rec) for rec in getweatherdata()) you are iterating over the keys of that dictionary, not over records. Instead, wrap the single record in a list: avro_objects = [to_rec_avro_destructive(getweatherdata())]

I believe this code should work for you:
import requests
from fastavro import writer, reader, schema
from rec_avro import to_rec_avro_destructive, from_rec_avro_destructive, rec_avro_schema

def getweatherdata():
    url = 'https://api.openweathermap.org/data/2.5/onecall?lat=33.441792&lon=-94.037689&exclude=hourly,daily&appid=' + apikey
    response = requests.get(url)
    # Parse the JSON body into a Python dictionary
    data = response.json()
    return data

def turntoavro():
    # The API returns one dictionary, so wrap it in a list to write a single record
    avro_objects = [to_rec_avro_destructive(getweatherdata())]
    with open('json_in_avro.avro', 'wb') as f_out:
        writer(f_out, schema.parse_schema(rec_avro_schema()), avro_objects)

turntoavro()
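To see why the generator expression in the question does not produce records, here is a small sketch that uses only plain Python. The weather dictionary is a hypothetical stand-in for the value returned by getweatherdata(), taken from the sample data; no fastavro or rec_avro calls are involved.

```python
# Hypothetical stand-in for the dictionary returned by getweatherdata().
weather = {
    "lat": 33.44,
    "lon": -94.04,
    "timezone": "America/Chicago",
    "timezone_offset": -18000,
}

# Iterating over a dict yields its keys, so each "record" is just a string.
as_iterated = [rec for rec in weather]
print(as_iterated)  # ['lat', 'lon', 'timezone', 'timezone_offset']

# Wrapping the dict in a list gives an iterable with exactly one record.
as_records = [weather]
print(len(as_records))  # 1
print(type(as_records[0]).__name__)  # dict
```

If the API returned a list of dictionaries instead, iterating over it directly would be the right approach, since each element would then be one record.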