Hive JSON SerDe — ClassCastException: java.lang.Integer cannot be cast to java.lang.Double

I'm trying to use Hive JSON SerDe to load Twitter JSON into Hive tables. I first load the JSON into a table defined with ROW FORMAT SERDE, then copy it into a second table stored as an RCFile. This works up to a point, but then I get the following ClassCastException:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row [Error getting row data with exception java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Double
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaDoubleObjectInspector.get(JavaDoubleObjectInspector.java:40)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:259)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:307)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:354)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:354)
    at org.apache.hadoop.hive.serde2.SerDeUtils.buildJSONString(SerDeUtils.java:354)
    at org.apache.hadoop.hive.serde2.SerDeUtils.getJSONString(SerDeUtils.java:220)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:667)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141)
    at org.apache.hadoop

Here's the schema I'm using to define the SerDe table:

CREATE EXTERNAL TABLE gh_raw (
   coordinates struct <
      coordinates: array <double>,
      type: string>,
   created_at string,
   entities struct <
      hashtags: array <struct <text: string>>,
      media: array <struct <
            display_url: string,
            expanded_url: string,
            media_url: string,
            media_url_https: string,
            sizes: struct <
               large: struct <
                  h: int,
                  resize: string,
                  w: int>,
               medium: struct <
                  h: int,
                  resize: string,
                  w: int>,
               small: struct <
                  h: int,
                  resize: string,
                  w: int>,
               thumb: struct <
                  h: int,
                  resize: string,
                  w: int>>,
            type: string,
            url: string>>,
      urls: array <struct <
            display_url: string,
            expanded_url: string,
            url: string>>,
      user_mentions: array <struct <
            id: int,
            name: string,
            screen_name: string>>>,
   geo struct <
      coordinates: array <double>,
      type: string>,
   id_str string,
   in_reply_to_screen_name string,
   in_reply_to_status_id_str string,
   in_reply_to_user_id_str string,
   place struct <
      attributes: struct <
         locality: string,
         region: string,
         street_address: string>,
      bounding_box: struct <
         coordinates: array <array <array <double>>>,
         type: string>,
      country: string,
      country_code: string,
      full_name: string,
      name: string,
      place_type: string,
      url: string>,
   possibly_sensitive boolean,
   retweeted_status struct <
      coordinates: struct <
         coordinates: array <double>,
         type: string>,
      created_at: string,
      entities: struct <
         hashtags: array <struct <
               text: string>>,
         media: array <struct <
               display_url: string,
               expanded_url: string,
               media_url: string,
               media_url_https: string,
               sizes: struct <
                  large: struct <
                     h: int,
                     resize: string,
                     w: int>,
                  medium: struct <
                     h: int,
                     resize: string,
                     w: int>,
                  small: struct <
                     h: int,
                     resize: string,
                     w: int>,
                  thumb: struct <
                     h: int,
                     resize: string,
                     w: int>>,
               type: string,
               url: string>>,
         urls: array <struct <
               display_url: string,
               expanded_url: string,
               url: string>>,
         user_mentions: array <struct <
               id: int,
               name: string,
               screen_name: string>>>,
      favorited: boolean,
      geo: struct <
         coordinates: array <double>,
         type: string>,
      id_str: string,
      in_reply_to_screen_name: string,
      in_reply_to_status_id_str: string,
      in_reply_to_user_id_str: string,
      place: struct <
         attributes: struct <
            locality: string,
            region: string,
            street_address: string>,
         bounding_box: struct <
            coordinates: array <array <array <double>>>,
            type: string>,
         country: string,
         country_code: string,
         full_name: string,
         name: string,
         place_type: string,
         url: string>,
      possibly_sensitive: boolean,
      scopes: struct <
         followers: boolean>,
      source: string,
      text: string,
      truncated: boolean,
      user: struct <
         contributors_enabled: boolean,
         created_at: string,
         default_profile: boolean,
         default_profile_image: boolean,
         description: string,
         favourites_count: int,
         followers_count: int,
         friends_count: int,
         geo_enabled: boolean,
         id: int,
         id_str: string,
         is_translator: boolean,
         lang: string,
         listed_count: int,
         `location`: string,
         name: string,
         profile_background_color: string,
         profile_background_image_url: string,
         profile_background_image_url_https: string,
         profile_background_tile: boolean,
         profile_banner_url: string,
         profile_image_url: string,
         profile_image_url_https: string,
         profile_link_color: string,
         profile_sidebar_border_color: string,
         profile_sidebar_fill_color: string,
         profile_text_color: string,
         profile_use_background_image: boolean,
         protected: boolean,
         screen_name: string,
         statuses_count: int,
         time_zone: string,
         url: string,
         utc_offset: int,
         verified: boolean>>,
   source string,
   text string,
   truncated boolean,
   user struct <
      contributors_enabled: boolean,
      created_at: string,
      default_profile: boolean,
      default_profile_image: boolean,
      description: string,
      favourites_count: int,
      followers_count: int,
      friends_count: int,
      geo_enabled: boolean,
      id: int,
      id_str: string,
      is_translator: boolean,
      lang: string,
      listed_count: int,
      `location`: string,
      name: string,
      profile_background_color: string,
      profile_background_image_url: string,
      profile_background_image_url_https: string,
      profile_background_tile: boolean,
      profile_banner_url: string,
      profile_image_url: string,
      profile_image_url_https: string,
      profile_link_color: string,
      profile_sidebar_border_color: string,
      profile_sidebar_fill_color: string,
      profile_text_color: string,
      profile_use_background_image: boolean,
      protected: boolean,
      screen_name: string,
      statuses_count: int,
      time_zone: string,
      url: string,
      utc_offset: int,
      verified: boolean>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/user/ahanna/gh_raw';

I suspect it crashes on a row that contains a set of coordinates or a bounding box, since those are the only fields the schema declares as double.
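One way to check this guess outside Hive is to look at how a JSON parser types the numbers in a coordinate pair. The sketch below uses Python's standard json module on a made-up tweet fragment; the sample values and the helper names integer_values and suspect_coordinates are hypothetical. A longitude written without a decimal point parses as an integer, which is presumably the kind of value that a column declared as array <double> then fails to cast.

```python
import json

# Hypothetical tweet fragment; the longitude -93 has no decimal point.
sample = '{"coordinates": {"coordinates": [-93, 44.98], "type": "Point"}}'

def integer_values(node):
    """Collect numbers that the JSON parser typed as int, at any nesting depth."""
    if isinstance(node, bool):      # bool is a subclass of int in Python; skip it
        return []
    if isinstance(node, int):
        return [node]
    if isinstance(node, list):      # also covers nested bounding_box arrays
        return [v for item in node for v in integer_values(item)]
    return []

def suspect_coordinates(line):
    """Return the int-typed values under coordinates.coordinates for one JSON line."""
    tweet = json.loads(line)
    point = tweet.get("coordinates") or {}   # the field is often null in tweets
    return integer_values(point.get("coordinates"))

print(suspect_coordinates(sample))
```

Running suspect_coordinates over each line of the raw file would show whether any rows carry whole-number coordinates.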

I think this is a bug in the JSON SerDe I'm using, but I'm not sure. I built it from source, from a repository whose author says this issue is fixed, but it still fails: https://github.com/brndnmtthws/Hive-JSON-Serde

Try this SerDe: https://github.com/rcongiu/Hive-JSON-Serde I was getting the same exception while trying to read coordinates from tweets, and switching to it fixed the problem for me.

A prebuilt binary is available here, so you don't need to build it yourself: http://www.congiu.net/hive-json-serde/

Try bigint instead of int. It works for me.
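A sketch of why the id fields are candidates for bigint, assuming the values are Twitter's 64-bit numeric IDs: Hive's int is a 32-bit signed integer, so anything above 2,147,483,647 does not fit. The sample ID and the helper name smallest_hive_type are made up for illustration, and this check does not by itself explain the Integer-to-Double cast in the question.

```python
INT_MAX = 2**31 - 1      # upper bound of Hive's 32-bit signed int
BIGINT_MAX = 2**63 - 1   # upper bound of Hive's 64-bit signed bigint

def smallest_hive_type(value):
    """Return the narrowest of int/bigint that can hold value, or None."""
    if -INT_MAX - 1 <= value <= INT_MAX:
        return "int"
    if -BIGINT_MAX - 1 <= value <= BIGINT_MAX:
        return "bigint"
    return None

sample_id = 2244994945123  # hypothetical ID, larger than INT_MAX
print(smallest_hive_type(sample_id))
```

Applying the same check to the id values in the raw JSON would indicate which columns declared as int need widening.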
