How to get table schema from json file: parse_table_schema_from_json?

I am trying to load a table schema from a JSON file using parse_table_schema_from_json (from apache_beam.io.gcp.bigquery).

Here is my code:

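import json
from apache_beam.io.gcp.bigquery import parse_table_schema_from_json

def getSchema(pathToJSON):
    # Read the schema file and re-serialize it to a JSON string
    with open(pathToJSON) as data_file:
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema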

Here is the error I get:

Traceback (most recent call last):
  File "test_get_schema.py", line 16, in <module>
    getSchema("vauto_table_schema.json")
  File "test_get_schema.py", line 13, in getSchema
    table_schema = parse_table_schema_from_json(schema_data)
  File "/home/usr/.local/lib/python2.7/site-packages/apache_beam/io/gcp/bigquery.py", line 269, in parse_table_schema_from_json
    fields = [_parse_schema_field(f) for f in json_schema['fields']]
TypeError: list indices must be integers, not str

My JSON file looks like this:

[
  {
    "name": "StockNumber",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "name": "Product",
    "type": "STRING",
    "mode": "NULLABLE"
  }
]

What am I missing?

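You need a dict instead of a list. As the traceback shows, parse_table_schema_from_json looks up json_schema['fields'], so the top level of the JSON must be an object with a "fields" key. Your file has a bare list at the top level, and indexing a list with a string raises the TypeError. Change your function and schema file as follows and try again: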

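import json
from apache_beam.io.gcp.bigquery import parse_table_schema_from_json

def getSchema(pathToJSON):
    # Use the path argument and close the file when done
    with open(pathToJSON) as data_file:
        schema_data = json.dumps(json.load(data_file))
    table_schema = parse_table_schema_from_json(schema_data)
    # print(table_schema)
    return table_schema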

Your schema file should be:

{
  "fields": [
    {
      "type": "INTEGER",
      "name": "StockNumber",
      "mode": "NULLABLE"
    },
    {
      "type": "STRING",
      "name": "Product",
      "mode": "NULLABLE"
    }
  ]
}
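
If you would rather keep the schema file as a bare list, one option is to wrap the list in a "fields" object before parsing. The following is only a sketch of that idea: normalize_schema_json is a hypothetical helper name, not part of Apache Beam, and it uses nothing beyond the standard json module.

```python
import json


def normalize_schema_json(raw_json):
    """Return a JSON string shaped as {"fields": [...]}.

    Hypothetical helper: accepts either a bare list of field definitions
    or an object that already has a top-level "fields" key.
    """
    data = json.loads(raw_json)
    if isinstance(data, list):
        # Bare list of fields: wrap it in the expected object.
        data = {"fields": data}
    elif not (isinstance(data, dict) and "fields" in data):
        raise ValueError('expected a list of fields or an object with a "fields" key')
    return json.dumps(data)


bare_list = '[{"name": "StockNumber", "type": "INTEGER", "mode": "NULLABLE"}]'
wrapped = normalize_schema_json(bare_list)
# wrapped now has a top-level "fields" key and could be passed to
# parse_table_schema_from_json(wrapped).
```

Inside getSchema you would read the file contents as text, pass them through this helper, and hand the result to parse_table_schema_from_json instead of the json.dumps(json.load(...)) round trip.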
