BigQuery load job failing on boolean data type fields from JSON

I'm receiving an "Invalid data file" error when loading an NDJSON file into Google BigQuery whenever the schema includes a boolean column. The job runs successfully if I remove those columns from the schema and the source file. I'm calling load_table_from_uri from a Python script, but I've also tried the load in the GUI and hit the same issue. The JSON for the boolean fields looks correct (attached). I've also tried both the legacy and current boolean type names (BOOL vs. BOOLEAN). What am I missing?

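from google.cloud import bigquery

# Assumed setup (not shown in the original post): a client using default credentials.
bqClient = bigquery.Client()

dataset_id = 'dev'
table_id = 'DIM_EMP'
table_ref = bqClient.dataset(dataset_id).table(table_id)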

job_config = bigquery.LoadJobConfig()
job_config.schema = [
bigquery.SchemaField('personId', 'INT64'),
bigquery.SchemaField('personNumber', 'STRING'),
bigquery.SchemaField('firstName', 'STRING'),
bigquery.SchemaField('middleName', 'STRING'),
bigquery.SchemaField('lastName', 'STRING'),
bigquery.SchemaField('userName', 'STRING'),
bigquery.SchemaField('accessProfile', 'STRING'),
bigquery.SchemaField('notificationProfile', 'STRING'),
bigquery.SchemaField('preferenceProfile', 'STRING'),
bigquery.SchemaField('supervisorPersonId', 'INT64'),
bigquery.SchemaField('hireDate', 'DATE'),
bigquery.SchemaField('processEmployeeProfile', 'STRING'),
bigquery.SchemaField('logonProfile', 'STRING'),
bigquery.SchemaField('birthDate', 'DATE'),
bigquery.SchemaField('delegateProfile', 'STRING'),
bigquery.SchemaField('isManager', 'BOOLEAN'),  # boolean column (job fails when included)
bigquery.SchemaField('isEmployee', 'BOOL'),  # boolean column (job fails when included)
bigquery.SchemaField('localeProfile', 'STRING')
]
job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
uri = 'gs://'+project+'-stage/getPeopleDetails/DIMEMP*.ndjson'
load_job = bqClient.load_table_from_uri(
    uri,
    table_ref,
    job_config=job_config)  # API request

load_job.result()

Source JSON file

When loading an NDJSON file into BigQuery, one fix for this error is to write the boolean values as the quoted strings "true" and "false". If you rely on schema auto-detection instead of an explicit schema, a column that contains only "true" or "false" is detected as BOOLEAN automatically.
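
As a minimal sketch of that suggestion, the snippet below rewrites NDJSON text so that JSON booleans in the chosen fields become the quoted strings "true" and "false". The helper name quote_booleans and the sample record are hypothetical; only the field names isManager and isEmployee come from the schema in the question.

```python
import json


# Hypothetical helper for illustration; it is not part of any BigQuery library.
def quote_booleans(ndjson_text, bool_fields):
    """Return NDJSON text with booleans in bool_fields turned into "true"/"false" strings."""
    out_lines = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for field in bool_fields:
            value = record.get(field)
            # Only touch real JSON booleans; leave nulls, strings and numbers alone.
            if isinstance(value, bool):
                record[field] = "true" if value else "false"
        out_lines.append(json.dumps(record))
    # One record per line, each terminated by a newline.
    return "".join(out_line + "\n" for out_line in out_lines)


# Made-up sample record, for illustration only.
sample = '{"personId": 1, "isManager": true, "isEmployee": false}\n'
converted = quote_booleans(sample, ["isManager", "isEmployee"])
print(converted, end="")
```

The converted text would then need to be written back to the Cloud Storage location that the load job reads from. If the source file already contains other non-standard spellings of the values, this sketch leaves them unchanged, so it is worth inspecting a few raw lines first.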
