
ROW FORMAT SERDE in Hive

I am using Hadoop 2.0.4 and working on Twitter sentiment analysis. I have used Flume to ingest the data, and now the Twitter data must be stored in a Hive table.

I have created a table, but the ROW FORMAT SERDE clause gives this error:

'Unable to validate'

Please tell me how to proceed.

Are you using a custom SerDe?

Please refer to the following information from the Hive Language Manual:

You can create tables with a custom SerDe or using a native SerDe. A native SerDe is used if ROW FORMAT is not specified or ROW FORMAT DELIMITED is specified.
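To make the distinction in the quoted passage concrete, here is a minimal sketch. The table names (tweets_delimited, tweets_json) and the tab delimiter are hypothetical; the SerDe class name is the one used elsewhere in this thread. The first statement relies on the native SerDe, while the second names a custom SerDe class that Hive has to load from a JAR.

```sql
-- Native SerDe: ROW FORMAT DELIMITED needs no extra JAR.
-- (Hypothetical table name and delimiter, for illustration only.)
CREATE TABLE tweets_delimited (id BIGINT, text STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Custom SerDe: the named class must be loadable by Hive,
-- which usually means its JAR has been added beforehand.
CREATE TABLE tweets_json (id BIGINT, text STRING)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
STORED AS TEXTFILE;
```

If the second statement fails while the first succeeds, the likely cause is that Hive cannot find the SerDe class, rather than a problem with the table definition itself.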

Hope the information is useful.

You can try adding this JAR:

hive-serdes-1.0-SNAPSHOT.jar
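As a rough sketch, the JAR can be registered from the Hive CLI for the current session as shown here. The path is a placeholder (an assumption, not from the original post); replace it with wherever the JAR actually lives on your machine.

```sql
-- Placeholder path: point this at the real location of the JAR.
ADD JAR /path/to/hive-serdes-1.0-SNAPSHOT.jar;

-- List the JARs registered in this session to confirm it was added.
LIST JARS;
```

ADD JAR only affects the current session, so it has to be repeated in each new session unless the JAR is placed somewhere Hive loads automatically.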

After adding the JAR, you can create an external Hive table that contains the tweet_id and the tweet_text and points to the tweets directory, so that you can perform sentiment analysis on it, like this:

CREATE EXTERNAL TABLE load_tweets (id BIGINT, text STRING)
ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
LOCATION '/user/flume/tweets';
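Once the table has been created, a quick check along these lines may help confirm that the SerDe is picked up and that rows can be read. This is only a sketch; it assumes the load_tweets table above exists and that Flume has already written JSON files under its location.

```sql
-- Show table metadata, including the SerDe library and the location.
DESCRIBE FORMATTED load_tweets;

-- Read a few rows; this is where the SerDe actually parses the JSON.
SELECT id, text FROM load_tweets LIMIT 5;
```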

You can refer to the link below for performing sentiment analysis with Hive:

https://acadgild.com/blog/sentiment-analysis-on-tweets-with-apache-hive-using-afinn-dictionary/

Check whether you have added hive-serdes-1.0-SNAPSHOT.jar to the lib folder under your Hive directory. Your Hive directory path is the one you specified in your .bashrc file.
