
Loading data into HBase via HFile not working

I wrote a mapper to load data from disk into HBase via HFiles. The program runs successfully, but no data shows up in my HBase table. Any ideas?

Here's my Java program:

protected void writeToHBaseViaHFile() throws Exception {
        try {
            System.out.println("In try...");
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", "XXXX");
            Connection connection = ConnectionFactory.createConnection(conf);
            System.out.println("got connection");

            String inputPath = "/tmp/nuggets_from_Hive/part-00000";
            String outputPath = "/tmp/mytemp" + new Random().nextInt(1000);
            final TableName tableName = TableName.valueOf("steve1");
            System.out.println("got table steve1, outputPath = " + outputPath);

            // tag::SETUP[]
            Table table = connection.getTable(tableName);

            Job job = Job.getInstance(conf, "ConvertToHFiles");
            System.out.println("job is setup...");

            HFileOutputFormat2.configureIncrementalLoad(job, table,
                connection.getRegionLocator(tableName)); // <1>
            System.out.println("done configuring incremental load...");

            job.setInputFormatClass(TextInputFormat.class); // <2>

            job.setJarByClass(Importer.class); // <3>

            job.setMapperClass(LoadDataMapper.class); // <4>
            job.setMapOutputKeyClass(ImmutableBytesWritable.class); // <5>
            job.setMapOutputValueClass(KeyValue.class); // <6>

            FileInputFormat.setInputPaths(job, inputPath);
            HFileOutputFormat2.setOutputPath(job, new org.apache.hadoop.fs.Path(outputPath));
            System.out.println("Setup complete...");
            // end::SETUP[]

            if (!job.waitForCompletion(true)) {
                System.out.println("Failure");
            } else {
                System.out.println("Success");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Here's my mapper class:

import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LoadDataMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Cell> {

    public static final byte[] FAMILY = Bytes.toBytes("pd");
    public static final byte[] COL = Bytes.toBytes("bf");
    public static final ImmutableBytesWritable rowKey = new ImmutableBytesWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\t"); // <1>
        byte[] rowKeyBytes = Bytes.toBytes(line[0]);
        rowKey.set(rowKeyBytes);
        KeyValue kv = new KeyValue(rowKeyBytes, FAMILY, COL, Bytes.toBytes(line[1])); // <6>
        context.write(rowKey, kv); // <7>
        System.out.println("line[0] = " + line[0] + "\tline[1] = " + line[1]);
    }

}

I've created the table steve1 on my cluster, but I get 0 rows after the program runs successfully:

hbase(main):007:0> count 'steve1'
0 row(s) in 0.0100 seconds

=> 0

What I've tried:

I added print statements in the mapper class (shown above) to check whether it actually reads the data, but the printouts never appeared in my console. I'm at a loss as to how to debug this.

Any ideas are greatly appreciated!

This only creates the HFiles; you still need to load them into your table. For example, you need to do something like:

LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(outputPath), admin, hTable, regionLocator);
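A fuller sketch of that load step, reusing the conf, connection, tableName, and outputPath from the code above (this assumes the HBase 1.x LoadIncrementalHFiles API; the admin, table, and regionLocator handles are opened from the existing connection):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

// Run this after job.waitForCompletion(true) returns true: at that point the
// HFiles exist under outputPath on HDFS but are not yet part of the table.
// doBulkLoad moves them into the regions of 'steve1'.
try (Admin admin = connection.getAdmin();
     Table table = connection.getTable(tableName);
     RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    loader.doBulkLoad(new Path(outputPath), admin, table, regionLocator);
}

Once the load succeeds, count 'steve1' in the HBase shell should report the loaded rows. As for the missing printouts: System.out in a mapper executes on the cluster and goes to the task logs (e.g., yarn logs -applicationId <appId>), not the driver's console.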
