Wednesday, September 19, 2012

Weekly report 20120919

Done last week:

1. Got all signatures for the nomination of candidacy form.

2. Read articles about data locality and compression in HBase.

3. In the HBase inverted index project, added a Bloom filter to the term count table used in the synonym scoring step, and changed the implementation so that terms with a count of 1 are no longer stored in the table. This improved performance by 6% on a small 3.5 GB data set. A larger improvement is expected on larger data sets.

4. Investigated the impact of very large rows in the inverted index table on the performance and reliability of the whole system. So far this looks fine for our data set.
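
As a rough illustration of the idea in item 3, here is a minimal Python sketch, not the project's actual code. It assumes that a term missing from the table can be treated as having a count of 1, and it puts a small in-memory Bloom filter in front of a plain dict that stands in for the HBase table. The names BloomFilter, TermCountTable, get_count and store_reads are all hypothetical. In HBase itself the Bloom filter is a column family setting rather than something written by hand.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class TermCountTable:
    """Hypothetical stand-in for the term count table (a dict, not HBase)."""

    def __init__(self, counts):
        self.store = {}
        self.bloom = BloomFilter()
        self.store_reads = 0  # how many lookups actually reached the store
        for term, count in counts.items():
            if count > 1:  # terms with a count of 1 are not stored
                self.store[term] = count
                self.bloom.add(term)

    def get_count(self, term):
        # If the filter says "definitely absent", skip the store read.
        # Assumption: an absent term has a count of 1.
        if not self.bloom.might_contain(term):
            return 1
        self.store_reads += 1
        # A false positive still falls back to the default of 1.
        return self.store.get(term, 1)


table = TermCountTable({"hbase": 5, "index": 2, "rare": 1})
print(table.get_count("hbase"), table.get_count("rare"))
```

Dropping the count-1 terms shrinks the table, and the filter lets most lookups for those terms return without touching the store at all, which is the kind of saving that should grow with the data set.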

To do next:

1. Run the HBase inverted index programs on a larger data set.

2. Investigate further the abstract data description and sharing model for cloud storage services.
