Tuesday, January 10, 2012

MockHTable: HTable Simulator with filter problem


While working with HBase in a mostly TDD workflow, we found it difficult to keep an HBase backend running just so that unit tests could start. In such cases, the MockHTable class comes in pretty handy. It implements the HTableInterface interface and thus provides the services an HBase table would provide, but in memory. For unit tests this is a very efficient and completely transparent way of testing: in production we simply replace the mock with a regular HBase table. I found it to be a silver bullet for my daily problems with the HBase instance, which used to go down for some reason in the middle of the day and halt development.

But every convenience comes with a price. I use filters, especially PrefixFilter, for range scans. PrefixFilter was not being applied during the scan; instead, the scan returned every row of the table. The mysterious thing was that scans worked perfectly with filters that are not based on the row key, such as SingleColumnValueFilter or QualifierFilter. After having a look at the getScanner method of MockHTable, I realized that it was applying filterKeyValue to each KeyValue of each row. After some searching, I realized this was itself a change we had made to the original code. The original code used filterRow, which seemed pretty reasonable, but the problem seemed to lie with the Filter class itself. As I understand its definition, filterRow simply removes the columns of a row that do not satisfy the filter condition, so conditions involving the row id went unnoticed by that call. Since filterRow had other problems too, we had earlier replaced it with filterKeyValue, which worked better than the original filterRow.

In that getScanner loop, we check for every KeyValue of the row whether the filter allows it. This works fine for filters that do not look at the row key, but where the filter has to be applied to the row id, it fails and returns all rows. My problem showed up with the PrefixFilter class, so I needed to check for this class and apply a row id check. I used the filterRowKey method, passing the row id so that it is compared with the prefix given when the filter was created. According to its definition, it returns true when the row has to be excluded and false when the row should be kept. Applying this check fixed my problem with PrefixFilter in MockHTable.
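To make the idea concrete, here is a minimal, self-contained sketch of the row id check. It is not the MockHTable code and it does not use the HBase classes: SketchFilter, prefixFilter, scan and sampleTable are hypothetical names I made up for illustration, and the table is just a sorted map. The only thing borrowed from HBase is the convention that filterRowKey returns true when the row should be excluded.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PrefixScanSketch {

    // Hypothetical stand-in for the two Filter callbacks discussed above.
    public interface SketchFilter {
        // Same convention as HBase's filterRowKey: true means EXCLUDE the row.
        boolean filterRowKey(byte[] buffer, int offset, int length);

        // Simplified stand-in for filterKeyValue: true means keep the cell.
        boolean includeCell(String qualifier, String value);
    }

    // Hypothetical prefix filter: it decides on the row key only and
    // accepts every individual cell.
    public static SketchFilter prefixFilter(String prefix) {
        final byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        return new SketchFilter() {
            @Override
            public boolean filterRowKey(byte[] buffer, int offset, int length) {
                if (length < p.length) {
                    return true; // key shorter than the prefix: exclude
                }
                for (int i = 0; i < p.length; i++) {
                    if (buffer[offset + i] != p[i]) {
                        return true; // mismatch: exclude
                    }
                }
                return false; // prefix matches: keep the row
            }

            @Override
            public boolean includeCell(String qualifier, String value) {
                return true;
            }
        };
    }

    // Scans an in-memory table (row key -> qualifier -> value). With
    // checkRowKey == false only the per-cell check runs, which reproduces
    // the "all rows returned" behaviour described above.
    public static List<String> scan(TreeMap<String, Map<String, String>> table,
                                    SketchFilter filter, boolean checkRowKey) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            byte[] key = row.getKey().getBytes(StandardCharsets.UTF_8);
            if (checkRowKey && filter.filterRowKey(key, 0, key.length)) {
                continue; // the filter rejected this row by its key
            }
            boolean anyCell = false;
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                if (filter.includeCell(cell.getKey(), cell.getValue())) {
                    anyCell = true;
                }
            }
            if (anyCell) {
                matched.add(row.getKey());
            }
        }
        return matched;
    }

    // Made-up sample data, one cell per row.
    public static TreeMap<String, Map<String, String>> sampleTable() {
        TreeMap<String, Map<String, String>> table = new TreeMap<>();
        table.put("order-9", Collections.singletonMap("total", "42"));
        table.put("user-1", Collections.singletonMap("name", "alice"));
        table.put("user-2", Collections.singletonMap("name", "bob"));
        return table;
    }

    public static void main(String[] args) {
        SketchFilter filter = prefixFilter("user-");
        // Per-cell check only: every row comes back.
        System.out.println(scan(sampleTable(), filter, false));
        // With the row key check: only the rows with the prefix remain.
        System.out.println(scan(sampleTable(), filter, true));
    }
}
```

In this sketch the row key check runs before any cell is looked at, so a rejected row is skipped without touching its cells. In a real mock, the same check would be applied to the actual filter object rather than to a stand-in interface.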
