[英]Accumulo scan/write not running in standalone Java main program in AWS EC2 master using Cloudera CDH 5.8.2
我們正在嘗試使用Putty在獨立的Java主程序(Maven陰影可執行文件)中從Accumulo(客戶端jar 1.5.0)運行簡單的write / sacn,如下所示,在AWS EC2 master中(如下所述)
public class AccumuloQueryApp {
private static final Logger logger = LoggerFactory.getLogger(AccumuloQueryApp.class);
public static final String INSTANCE = "accumulo"; // miniInstance
public static final String ZOOKEEPERS = "ip-x-x-x-100:2181"; //localhost:28076
private static Connector conn;
static {
// Accumulo
Instance instance = new ZooKeeperInstance(INSTANCE, ZOOKEEPERS);
try {
conn = instance.getConnector("root", new PasswordToken("xxx"));
} catch (Exception e) {
logger.error("Connection", e);
}
}
public static void main(String[] args) throws TableNotFoundException, AccumuloException, AccumuloSecurityException, TableExistsException {
System.out.println("connection with : " + conn.whoami());
BatchWriter writer = conn.createBatchWriter("test", ofBatchWriter());
for (int i = 0; i < 10; i++) {
Mutation m1 = new Mutation(String.valueOf(i));
m1.put("personal_info", "first_name", String.valueOf(i));
m1.put("personal_info", "last_name", String.valueOf(i));
m1.put("personal_info", "phone", "983065281" + i % 2);
m1.put("personal_info", "email", String.valueOf(i));
m1.put("personal_info", "date_of_birth", String.valueOf(i));
m1.put("department_info", "id", String.valueOf(i));
m1.put("department_info", "short_name", String.valueOf(i));
m1.put("department_info", "full_name", String.valueOf(i));
m1.put("organization_info", "id", String.valueOf(i));
m1.put("organization_info", "short_name", String.valueOf(i));
m1.put("organization_info", "full_name", String.valueOf(i));
writer.addMutation(m1);
}
writer.close();
System.out.println("Writing complete ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
Scanner scanner = conn.createScanner("test", new Authorizations());
System.out.println("Step 1 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
scanner.setRange(new Range("3", "7"));
System.out.println("Step 2 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
scanner.forEach(e -> System.out.println("Key: " + e.getKey() + ", Value: " + e.getValue()));
System.out.println("Step 3 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`");
scanner.close();
}
public static BatchWriterConfig ofBatchWriter() {
//Batch Writer Properties
final int MAX_LATENCY = 1;
final int MAX_MEMORY = 10000000;
final int MAX_WRITE_THREADS = 10;
final int TIMEOUT = 10;
BatchWriterConfig config = new BatchWriterConfig();
config.setMaxLatency(MAX_LATENCY, TimeUnit.MINUTES);
config.setMaxMemory(MAX_MEMORY);
config.setMaxWriteThreads(MAX_WRITE_THREADS);
config.setTimeout(TIMEOUT, TimeUnit.MINUTES);
return config;
}
}
連接已正確建立,但是創建BatchWriter時出現錯誤,並且嘗試以相同錯誤循環
[impl.ThriftScanner] DEBUG: Error getting transport to ip-x-x-x-100:10011 : NotServingTabletException(extent:TKeyExtent(table:21 30, endRow:21 30 3C, prevEndRow:null))
當我們在Spark作業中運行相同的代碼(寫入Accumulo並從Accumulo讀取)並將其提交給YANK集群時,它運行良好。 我們正在努力弄清這一點,但沒有任何線索。 請按照以下說明查看環境
AWS環境上的Cloudera CDH 5.8.2(4個EC2實例作為一個主實例和3個子實例)。
考慮私有IP就像
我們在CDH中有以下安裝
群集(CDH 5.8.2)
嗯,這很好奇,它可以與Spark-on-YARN一起運行,但可以作為常規Java應用程序運行。 通常,這是另一種方式:)
我將驗證獨立Java應用程序的類路徑上的JAR是否與Spark-on-YARN作業以及Accumulo服務器類路徑使用的JAR相匹配。
如果那沒有幫助,請嘗試將log4j級別提高到DEBUG或TRACE,然后查看是否有任何意外出現。 如果您很難理解日志記錄的含義,請隨時發送電子郵件至user@accumulo.apache.org,您一定會更加關注此問題。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.