
Hive table error when loading JSON data

json file:

{
    "DocId":"ABC",
    "User":{
        "Id":1234,
        "Username":"sam1234",
        "Name":"Sam",
        "ShippingAddress":{
            "Address1":"123 Main St.",
            "Address2":null,
            "City":"Durham",
            "State":"NC"
        },
        "Orders":[{
                "ItemId":6789,
                "OrderDate":"11/11/2012"
            },
            {
                "ItemId":4352,
                "OrderDate":"12/12/2012"
            }
        ]
    }
}}

schema:

create external table sample_json(DocId string,User struct<Id:int,Username:string,Name:string,ShippingAddress:struct<Address1:string,Address2:string,City:string,State:string>,Orders:array<struct<ItemId:int,OrderDate:string>>>)ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe' location '/user/babu/sample_json';

--loading data to the hive table

load data inpath '/user/samplejson/samplejson.json' into table sample_json;

Error:

When I run a select query such as:

select * from sample_json;

Exception:

Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT (from [Source: java.io.StringReader@8c3770; line: 1, column: 0]) at [Source: java.io.StringReader@8c3770; line: 1, column: 3]

First, make sure the JSON file is valid (for example with http://jsonlint.com), then remove any newline characters and unnecessary whitespace from the file before loading it into the Hive table. If you have already loaded JSON files containing newline characters into the table, drop the table and create it again.
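The validate-and-flatten step above can be sketched in Python. This is only an illustration, not part of the original answer: the function name compact_json_record is hypothetical, and it assumes the file holds a single JSON document. With Hive's default text input format, each line reaches the SerDe as a separate record, so a pretty-printed file is seen as fragments such as a lone "{", which fits the "Unexpected end-of-input" message in the question.

```python
import json


def compact_json_record(text: str) -> str:
    """Validate one JSON document and return it on a single line.

    Hypothetical helper for illustration; assumes `text` holds exactly
    one JSON document.
    """
    # json.loads raises json.JSONDecodeError (a ValueError) on invalid
    # input, including a stray trailing brace after the document.
    record = json.loads(text)
    # Compact separators drop the spaces after ',' and ':'; the result
    # contains no newline characters.
    return json.dumps(record, separators=(",", ":"))


if __name__ == "__main__":
    pretty = """{
        "DocId": "ABC",
        "User": {
            "Id": 1234,
            "Orders": [{"ItemId": 6789, "OrderDate": "11/11/2012"}]
        }
    }"""
    print(compact_json_record(pretty))
```

Write the returned string to a new file (one record per line) and load that file instead of the pretty-printed one.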

You can try the following input:

{"DocId":"ABC",
 "User":{"Id":1234,
         "Username":"sam1234",
          "Name":"Sam",
          "ShippingAddress":{"Address1":"123 Main St.","Address2":null,"City":"Durham","State":"NC"},
 "Orders":[{"ItemId":6789,"OrderDate":"11/11/2012"}, 
           {"ItemId":4352,"OrderDate":"12/12/2012"}
          ]
         }
  }
  1. Remove the newlines from the JSON file, so that the whole record is on one line:

{"DocId": "ABC", "Userdetails": {"Id": 1234, "Username": "sam1234", "Name": "Sam", "ShippingAddress": {"Address1": "123 Main St.", "Address2": null, "City": "Durham", "State": "NC" }, "Orders":[{"ItemId": 6789, "OrderDate": "11/11/2012"}, {"ItemId": 4352, "OrderDate": "12/12/2012"}]}}

  2. Change User to userdetails, because user is a reserved identifier in Hive; using it as a column name gave me an error.
  3. Use either location or load data inpath, not both, because both do the same work. location does not create a folder in HDFS, while load data inpath does.
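The key rename in step 2 can also be scripted rather than edited by hand. The following is a minimal sketch under stated assumptions: rename_top_level_key is a hypothetical helper, the input is one single-line JSON record, and only top-level keys are renamed. The new key has to match the column name used in the create table statement.

```python
import json


def rename_top_level_key(line: str, old: str, new: str) -> str:
    """Rename one top-level key in a single-line JSON record.

    Hypothetical helper for illustration; nested keys are left alone.
    """
    record = json.loads(line)
    # Rebuild the dict so that key order is preserved (Python 3.7+).
    renamed = {(new if key == old else key): value
               for key, value in record.items()}
    return json.dumps(renamed, separators=(",", ":"))


if __name__ == "__main__":
    line = '{"DocId":"ABC","User":{"Id":1234,"Username":"sam1234"}}'
    print(rename_top_level_key(line, "User", "Userdetails"))
```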


The commands are as follows:

hive> create external table sample_json(DocId string, userdetails struct<Id:int,Username:string,Name:string,ShippingAddress:struct<Address1:string,Address2:string,City:string,State:string>,Orders:array<struct<ItemId:int,OrderDate:string>>>) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' location '/user/admin';
OK
Time taken: 0.13 seconds

hive> select * from sample_json;
OK
sample_json.docid	sample_json.userdetails
ABC	{"id":1234,"username":"sam1234","name":"Sam","shippingaddress":{"address1":"123 Main St.","address2":null,"city":"Durham","state":"NC"},"orders":[{"itemid":6789,"orderdate":"11/11/2012"},{"itemid":4352,"orderdate":"12/12/2012"}]}
Time taken: 0.106 seconds, Fetched: 1 row(s)

