[英]Read Avro file with python to create a SQL table
我正在尝试从包含我的表结构的AVRO文件创建一个SQL表:
{
"type" : "record",
"name" : "warranty",
"doc" : "Schema generated by Kite",
"fields" : [ {
"name" : "id",
"type" : "long",
"doc" : "Type inferred from '1'"
}, {
"name" : "train_id",
"type" : "long",
"doc" : "Type inferred from '21691'"
}, {
"name" : "siemens_nr",
"type" : "string",
"doc" : "Type inferred from 'Loco-001'"
}, {
"name" : "uic_nr",
"type" : "long",
"doc" : "Type inferred from '193901'"
}, {
"name" : "Configuration",
"type" : "string",
"doc" : "Type inferred from 'ZP28'"
}, {
"name" : "Warranty_Status",
"type" : "string",
"doc" : "Type inferred from 'Out_of_Warranty'"
}, {
"name" : "Warranty_Data_Type",
"type" : "string",
"doc" : "Type inferred from 'Real_based_on_preliminary_acceptance_date'"
}, {
"name" : "of_progression",
"type" : "long",
"doc" : "Type inferred from '100'"
}, {
"name" : "Delivery_Date",
"type" : "string",
"doc" : "Type inferred from '18/12/2009'"
}, {
"name" : "Warranty_on_Delivery_Date",
"type" : "string",
"doc" : "Type inferred from '18/12/2013'"
}, {
"name" : "Customer_Status",
"type" : "string",
"doc" : "Type inferred from 'homologation'"
}, {
"name" : "Commissioning_Date",
"type" : "string",
"doc" : "Type inferred from '6/10/2010'"
}, {
"name" : "Preliminary_acceptance_date",
"type" : "string",
"doc" : "Type inferred from '6/01/2011'"
}, {
"name" : "Warranty_Start_Date",
"type" : "string",
"doc" : "Type inferred from '6/01/2011'"
}, {
"name" : "Warranty_End_Date",
"type" : "string",
"doc" : "Type inferred from '6/01/2013'"
}, {
"name" : "Effective_End_Warranty_Date",
"type" : [ "null", "string" ],
"doc" : "Type inferred from 'null'",
"default" : null
}, {
"name" : "Level_2_in_function",
"type" : "string",
"doc" : "Type inferred from '17/07/2015'"
}, {
"name" : "Baseline",
"type" : "string",
"doc" : "Type inferred from '2.10.23.4'"
}, {
"name" : "TC_report",
"type" : "string",
"doc" : "Type inferred from 'A480140'"
}, {
"name" : "Last_version_Date",
"type" : "string",
"doc" : "Type inferred from 'A-23/09/2015'"
} ]
}
做这项工作,我正在使用(如果您有其他更简单的主张,那将很棒)
所以使用python我会得到这样的结果:
{'name':'id',type':'long','doc':'blablabla'}
我的问题是如何根据此结果在python中创建SQL表?
谢谢您的帮助
使用json模块,您可以从字符串中获取字典,然后就有一系列字段定义。 您遍历该数组以生成SQL语句。
注意:您将需要一些机制来将avro字段类型映射到SQL字段类型,尤其是当您具有"type" : [ "null", "string" ]
。
这是一个基于您的字符串构建SQL CREATE TABLE语句的代码的工作示例:
import json
schema_str = """{
"type" : "record",
"name" : "warranty",
"doc" : "Schema generated by Kite",
"fields" : [ {
"name" : "id",
"type" : "long",
"doc" : "Type inferred from '1'"
}, {
"name" : "train_id",
"type" : "long",
"doc" : "Type inferred from '21691'"
}, {
"name" : "siemens_nr",
"type" : "string",
"doc" : "Type inferred from 'Loco-001'"
} ]
}"""
schema = json.loads(schema_str)
fields = schema['fields']
sql_string = 'CREATE TABLE ' + schema['name'] + ' ( \n'
for field in fields :
sql_string = sql_string + field['name'] + ' ' + field['type'] + ', \n'
sql_string = sql_string[:-3] + '\n)' # get rid of last comma and close the field list
print sql_string
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.