
Set HBase properties for a Spark job using spark-submit

During an HBase data migration I encountered a java.lang.IllegalArgumentException: KeyValue size too large

In the long term:

I need to increase the property hbase.client.keyvalue.maxsize (from 1048576 to 10485760) in /etc/hbase/conf/hbase-site.xml, but I can't change this file right now (I need validation first).

In the short term:

I have successfully imported the data using this command:

hbase org.apache.hadoop.hbase.mapreduce.Import \
  -Dhbase.client.keyvalue.maxsize=10485760 \
  myTable \
  myBackupFile

Now I need to run a Spark job using spark-submit.

Which is the better way:

  • Prefix the HBase properties with 'spark.' (I'm not sure whether this is possible or whether it works)
spark-submit \
  --conf spark.hbase.client.keyvalue.maxsize=10485760
  • Use 'spark.executor.extraJavaOptions' and 'spark.driver.extraJavaOptions' to pass the HBase properties explicitly
spark-submit \
  --conf spark.executor.extraJavaOptions=-Dhbase.client.keyvalue.maxsize=10485760 \
  --conf spark.driver.extraJavaOptions=-Dhbase.client.keyvalue.maxsize=10485760

If you can change your code, you should be able to set these properties programmatically. I think something like this worked for me in the past in Java:

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.client.scanner.timeout.period", SCAN_TIMEOUT); // set BEFORE you create the connection object below:
Connection conn = ConnectionFactory.createConnection(conf);
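
For the second option in the question, here is a minimal sketch. It assumes the -D flags passed through spark.driver.extraJavaOptions and spark.executor.extraJavaOptions arrive as ordinary JVM system properties, and that (as far as I know) HBaseConfiguration.create() does not apply system properties as overrides by itself, so the job has to copy them. The class name HBaseSystemProps and the collect helper are hypothetical. The HBase calls are left as comments so the snippet compiles without the HBase jars:

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: gathers "hbase.*" JVM system properties so the job
// can copy them into its HBase Configuration before connecting.
class HBaseSystemProps {

    // Returns every property whose name starts with "hbase.", sorted by name.
    public static Map<String, String> collect(Properties props) {
        Map<String, String> overrides = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith("hbase.")) {
                overrides.put(name, props.getProperty(name));
            }
        }
        return overrides;
    }

    public static void main(String[] args) {
        Map<String, String> overrides = collect(System.getProperties());

        // In the real job (assumption: HBase client on the classpath),
        // apply the overrides BEFORE creating the connection:
        //   Configuration conf = HBaseConfiguration.create();
        //   overrides.forEach(conf::set);
        //   Connection conn = ConnectionFactory.createConnection(conf);

        // Print what would be applied, one "name=value" pair per line.
        overrides.forEach((name, value) -> System.out.println(name + "=" + value));
    }
}
```

If the job is launched with -Dhbase.client.keyvalue.maxsize=10485760 in both extraJavaOptions settings, the map should contain that single entry on the driver and on each executor. Whether the first option (the 'spark.' prefix) reaches HBase at all is something I have not verified.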
