How to load an XML file from HDFS into an HBase table

I have XML files in HDFS, and I want to load them into an HBase table.

The links I found all use MapReduce to load the XML data into HBase. Is there an alternative way to load the data directly into an HBase table?

Here is an example that uses Pig to read input3.xml and extract the fields to be loaded into HBase.

=== input3.xml =====
<document>   
<url>htp://www.abc.com/</url>
<category>Sports</category>
<usercount>120</usercount>
<reviews>    
<review>good site</review>
<review>This is Avg site</review>
<review>Bad site</review>
</reviews>
</document>



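-- XMLLoader ships in piggybank; adjust the jar path for your installation.
REGISTER piggybank.jar;

-- Each <document>...</document> element becomes one chararray record.
A = LOAD 'input3.xml' USING
    org.apache.pig.piggybank.storage.XMLLoader('document') AS
    (data:chararray);

-- Pull out the fields; note that only the first <review> is captured.
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(data,
    '(?s)<document>.*?<url>([^>]*?)</url>.*?<category>([^>]*?)</category>.*?<usercount>([^>]*?)</usercount>.*?<reviews>.*?<review>\\s*([^>]*?)\\s*</review>.*?</reviews>.*?</document>'))
    AS (url:chararray, category:chararray, usercount:int, review:chararray);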
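
The script above stops after extracting the fields into B. A minimal sketch of the remaining step, writing B to HBase, might look like the following. It assumes an HBase table named documents with a column family cf already exists; both names are hypothetical and not from the original post. Pig still runs this as a MapReduce job under the hood, but no mapper or reducer code has to be written by hand.

```pig
-- Assumption: table 'documents' and column family 'cf' were created beforehand
-- in the HBase shell; both names are placeholders.
-- HBaseStorage uses the first field of the relation (url) as the row key,
-- and maps the remaining fields, in order, to the listed columns.
STORE B INTO 'hbase://documents' USING
    org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'cf:category cf:usercount cf:review');
```

Because the first field becomes the row key, it is left out of the column list; only the three remaining fields of B are named there.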
