Done last week:
1. Investigated the impact of very large rows in our project and found that they break several programs that sometimes fetch whole rows: if a row's size exceeds a task's memory limit, the program fails. I solved this by avoiding whole-row HTable.get() calls and by setting the scan "batch" property to something like 10,000 when scanning the tables.
2. Tried to run the project with a larger data set (11 GB compressed, roughly 40 GB after loading into HBase tables). Got some errors from the HBase region servers and am still investigating the issue; possible causes include incompatible Java versions and a limited heap size.
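The batching idea in item 1 can be sketched in plain Java without an HBase dependency (the class and method names below are hypothetical, chosen just for illustration): instead of materializing an entire wide row at once, cells are handed back in fixed-size batches, which is what setting the scan batch does on a real HBase scan.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: splitting one wide row's cells into bounded batches, so no
// single step ever holds the whole row in memory. This mirrors the
// effect of Scan.setBatch(10000) on a real HBase scan, where it caps
// how many cells a single Result carries.
public class BatchedScanSketch {

    // Returns the row's cells in batches of at most batchSize.
    public static List<List<String>> scanInBatches(List<String> rowCells, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < rowCells.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rowCells.size());
            batches.add(new ArrayList<>(rowCells.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // A hypothetical wide row with 25,000 cells.
        List<String> wideRow = new ArrayList<>();
        for (int i = 0; i < 25000; i++) wideRow.add("cell-" + i);

        List<List<String>> batches = scanInBatches(wideRow, 10000);
        System.out.println(batches.size() + " batches");        // 3 batches
        System.out.println(batches.get(2).size() + " in last"); // 5000 in last
    }
}
```

On the cluster itself this corresponds roughly to `Scan scan = new Scan(); scan.setBatch(10000); ResultScanner rs = table.getScanner(scan);` in the HBase client API.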
To do next:
Solve the issues with the large data set and test at a larger scale.
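If the limited heap turns out to be the cause of the region server errors, the usual place to raise it is conf/hbase-env.sh. A hedged sketch (the 4000 MB value and the JAVA_HOME path are only illustrative, not tested settings for this cluster):

```shell
# conf/hbase-env.sh -- raise the region server heap (value in MB).
# HBASE_HEAPSIZE is the standard knob in HBase 0.9x-era deployments;
# 4000 here is an example, not a recommendation.
export HBASE_HEAPSIZE=4000

# Make sure every node runs the same, supported JDK; mixed Java
# versions across the cluster are the other suspected cause above.
# (Path below is hypothetical.)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```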
Friday, September 28, 2012