
HBase filter data based on column values

I want to filter an HBase table scan based on a list of values from a particular column.

Ex: For the Employee table given below, I want to fetch the records for employees whose ID is in (123, 789).

 ROW                   COLUMN+CELL

 row1                 column=emp:name, timestamp=1321296699190, value=TestName1
 row1                 column=emp:id, timestamp=1321296715892, value=123

 row2                 column=emp:name, timestamp=1321296699190, value=TestName2
 row2                 column=emp:id, timestamp=1321296715892, value=456

 row3                 column=emp:name, timestamp=1321296699190, value=TestName3
 row3                 column=emp:id, timestamp=1321296715892, value=789

 row4                 column=emp:name, timestamp=1321296699190, value=TestName4
 row4                 column=emp:id, timestamp=1321296715892, value=101

 row5                 column=emp:name, timestamp=1321296699190, value=TestName5
 row5                 column=emp:id, timestamp=1321296715892, value=102

I tried using SingleColumnValueFilter, but it fetches only one record from the table. My code is given below. Please let me know where I am going wrong:

HTableInterface empTableObj = service.openTable("employee");
Scan scan = new Scan(startRow, endRow);

FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ONE);

Integer[] idArray = {123, 789};
for(int i=0;i<idArray.length;i++){
    SingleColumnValueFilter filter = new SingleColumnValueFilter(Bytes.toBytes("emp"), Bytes.toBytes("id"), CompareOp.EQUAL, Bytes.toBytes(idArray[i].toString()));
    filterList.addFilter(filter);
}
scan.setFilter(filterList);
ResultScanner rs = empTableObj.getScanner(scan); 

Thanks

Try the other constructor:

SingleColumnValueFilter filter = new SingleColumnValueFilter(family, qualifier, compareOp, empBytes);

where

compareOp = CompareFilter.CompareOp.EQUAL;

and family and qualifier are byte arrays, and empBytes is the employee ID to match, converted to bytes (e.g. Bytes.toBytes("123")).

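Or you can create two filters that bound the value from below and above (combined in a FilterList with MUST_PASS_ALL, this selects the whole range between the two bounds rather than an exact list):

SingleColumnValueFilter filterLower = setFilterByCol(CompareOp.GREATER_OR_EQUAL, 123);
SingleColumnValueFilter filterUpper = setFilterByCol(CompareOp.LESS_OR_EQUAL, 789);

and a helper function:

private static SingleColumnValueFilter setFilterByCol(CompareOp compareOp, int emp) {
    // Placeholder names; substitute your own column family and qualifier.
    byte[] family = Bytes.toBytes("col_fam_name");
    byte[] qualifier = Bytes.toBytes("col_qualifier");
    // Assumption: the ID is stored in its string form, as the shell output in
    // the question suggests. Use whatever encoding was used to write the cell.
    byte[] empByte = Bytes.toBytes(String.valueOf(emp));

    SingleColumnValueFilter filter = new SingleColumnValueFilter(family, qualifier, compareOp, empByte);
    // Skip rows that do not contain the column at all.
    filter.setFilterIfMissing(true);
    return filter;
}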

Note that there is also SingleColumnValueExcludeFilter, which lets you exclude from the scan results the column that is used as the filter.
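To see what a GREATER_OR_EQUAL / LESS_OR_EQUAL pair would select on the sample data, here is a small sketch that uses only the Java standard library, not the HBase API. It assumes the IDs are stored as strings and that values are compared byte by byte in lexicographic order, which is how the default binary comparator behaves. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LexicographicRangeDemo {
    // Returns the IDs whose UTF-8 bytes fall in [lower, upper] under
    // unsigned lexicographic byte comparison.
    static List<String> idsInRange(List<String> ids, String lower, String upper) {
        byte[] lo = lower.getBytes(StandardCharsets.UTF_8);
        byte[] hi = upper.getBytes(StandardCharsets.UTF_8);
        List<String> matches = new ArrayList<>();
        for (String id : ids) {
            byte[] value = id.getBytes(StandardCharsets.UTF_8);
            boolean atLeastLower = Arrays.compareUnsigned(value, lo) >= 0;
            boolean atMostUpper = Arrays.compareUnsigned(value, hi) <= 0;
            if (atLeastLower && atMostUpper) {
                matches.add(id);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // The five emp:id values from the question's table.
        List<String> sampleIds = List.of("123", "456", "789", "101", "102");
        System.out.println(idsInRange(sampleIds, "123", "789")); // prints [123, 456, 789]
    }
}
```

The range picks up 456 as well, so for an exact list such as (123, 789) the EQUAL filters combined with MUST_PASS_ONE from the question are the closer fit.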

Since the filters are lazily evaluated, I'm guessing you'll have to keep calling next() to go through all the matching rows.

If you know you have 2 values, try

rs.next(); // for the first matching row (row1)
rs.next(); // again for the second matching row (row3)

If you're not sure how many you'll get, run it in a loop until next() returns null.
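As a rough illustration of which rows an "ID in (123, 789)" filter should yield when every row is consumed, the sketch below simulates the idea with plain Java collections. It is a stand-in, not HBase code: the map plays the role of the table, the iterator plays the role of the scanner, and the OR-ed predicates mimic MUST_PASS_ONE over equality checks. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class MustPassOneDemo {
    // Sample data from the question: row key -> value of emp:id.
    static Map<String, String> sampleTable() {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("row1", "123");
        rows.put("row2", "456");
        rows.put("row3", "789");
        rows.put("row4", "101");
        rows.put("row5", "102");
        return rows;
    }

    // Returns the row keys whose ID equals at least one of the wanted IDs.
    static List<String> matchingRows(Map<String, String> table, List<String> wantedIds) {
        // OR together one equality check per wanted ID, like MUST_PASS_ONE.
        Predicate<String> passesAny = id -> false;
        for (String wanted : wantedIds) {
            passesAny = passesAny.or(wanted::equals);
        }
        List<String> matches = new ArrayList<>();
        // Keep pulling rows until the source is exhausted, not just once.
        Iterator<Map.Entry<String, String>> scanner = table.entrySet().iterator();
        while (scanner.hasNext()) {
            Map.Entry<String, String> row = scanner.next();
            if (passesAny.test(row.getValue())) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(matchingRows(sampleTable(), List.of("123", "789"))); // prints [row1, row3]
    }
}
```

With the real client, the equivalent loop calls next() on the ResultScanner until it returns null.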

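// Imports needed by this method (HBase 1.x client API):
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FilterList.Operator;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Assumes the enclosing class provides LOG, conn (a Connection) and tableName (a TableName).
public void testFilterList() {
    LOG.info("Entering testFilterList.");

    Table table = null;
    ResultScanner rScanner = null;
    try {
        table = conn.getTable(tableName);
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"));
        // The column tested by the filters must also be part of the scan;
        // otherwise the filters treat it as missing.
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("EmpId"));

        // Instantiate a FilterList object in which filters have an "and"
        // relationship with each other.
        FilterList list = new FilterList(Operator.MUST_PASS_ALL);
        // Obtain data with EmpId greater than or equal to 200.
        list.addFilter(new SingleColumnValueFilter(Bytes.toBytes("info"),
            Bytes.toBytes("EmpId"), CompareOp.GREATER_OR_EQUAL, Bytes.toBytes(200L)));
        // Obtain data with EmpId less than or equal to 1000.
        list.addFilter(new SingleColumnValueFilter(Bytes.toBytes("info"),
            Bytes.toBytes("EmpId"), CompareOp.LESS_OR_EQUAL, Bytes.toBytes(1000L)));

        scan.setFilter(list);

        // Submit a scan request.
        rScanner = table.getScanner(scan);
        // Print query results, one line per cell, until the scanner is exhausted.
        for (Result r = rScanner.next(); r != null; r = rScanner.next()) {
            for (Cell cell : r.rawCells()) {
                LOG.info(Bytes.toString(CellUtil.cloneRow(cell)) + ":"
                    + Bytes.toString(CellUtil.cloneFamily(cell)) + ","
                    + Bytes.toString(CellUtil.cloneQualifier(cell)) + ","
                    + Bytes.toString(CellUtil.cloneValue(cell)));
            }
        }
        LOG.info("Filter list successfully.");
    } catch (IOException e) {
        LOG.error("Filter list failed ", e);
    } finally {
        if (rScanner != null) {
            // Close the scanner object.
            rScanner.close();
        }
        if (table != null) {
            try {
                // Close the Table object.
                table.close();
            } catch (IOException e) {
                LOG.error("Close table failed ", e);
            }
        }
    }
    LOG.info("Exiting testFilterList.");
}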
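One thing to watch with the example above: it compares against 8-byte long values, while the table in the question appears to store IDs as text. The filter value has to use the same encoding as the stored cell. The sketch below uses only the Java standard library to show the difference. It assumes Bytes.toBytes(long) produces an 8-byte big-endian array, which ByteBuffer reproduces here; the class and method names are illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ValueEncodingDemo {
    // 8-byte big-endian encoding of a long (ByteBuffer is big-endian by default).
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // The decimal text of the number as UTF-8 bytes.
    static byte[] encodeText(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same number, different bytes: an EQUAL check built with one encoding
        // cannot match a cell written with the other.
        System.out.println(Arrays.equals(encodeLong(200), encodeText(200))); // prints false

        // Big-endian longs: non-negative values sort in numeric order.
        System.out.println(Arrays.compareUnsigned(encodeLong(200), encodeLong(1000)) < 0); // prints true

        // Text: "200" sorts after "1000" because '2' > '1'.
        System.out.println(Arrays.compareUnsigned(encodeText(200), encodeText(1000)) < 0); // prints false
    }
}
```

So a 200 to 1000 range behaves as expected only if the column was written as longs; with text values the bounds would have to be chosen with string ordering in mind.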
