Add hbase-stats project to contrib/ #131
jyates wants to merge 1 commit into forcedotcom:master from jyates:hbase-stat
This is the basis for supporting equal-depth histograms per region, guideposts, #49. It's basically a 0.94 backport of the code attached to HBASE-7958. Logically, there is no real difference, though the implementation has some slight changes because on trunk there is no need for a coprocessor: the stats gathering is all built in.

Currently, HBASE-7958 is still open for review and depends on the system tables (HBASE-7999) or namespaces (HBASE-8105) patches. We resolve this by just giving the stats table a special name that should be fairly close to the pending name for the stats table in HBASE-7958. In the event that we cannot maintain the same table name for the stats table, we do have an opportunity to copy the data over to the new table, as there is currently a required downtime to upgrade from 0.94 to 0.96.

This is actually a bit more advanced than the posted patch - a lot more work has gone into usability and verifying correctness. Further, there are some obvious changes as we need to support coprocessors rather than built-in options (but that is actually a relatively minor change).
So in this example, would the row look like this?
Row Key | Value
primary\0some-var-len-region-name\0min_region_key | 3
primary\0some-var-len-region-name\0max_region_key | 10
If a column in the PK is variable length, Phoenix expects it to be null terminated. Are region names variable length too?
One thing we'd be after is to be able to query the stats table through Phoenix. It'll definitely make debugging and troubleshooting easier.
It would look like this:
primary\0some-var-len-region-name\0some-var-length-column-name | STAT | max_region_key 10
primary\0some-var-len-region-name\0some-var-length-column-name | STAT | min_region_key 3
Right now the stats reader/writer code handles reading it in (although it is still a bit overly complicated, IMO). I'd think we could move to a Phoenix-based reader and writer in the future, once we have a configurable writer. I would want to do the configurable writer work in another patch though; that starts to get even more complicated than it already is.
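To make the row key layout discussed above concrete, here is a minimal sketch of composing and splitting a key of the form table\0region\0column. It is not the code from this patch: the class name `StatRowKey` and its `encode`/`decode` methods are hypothetical, and it assumes that no component contains a zero byte and that the last component carries no trailing terminator. Real region names embed start keys, which may contain arbitrary bytes, so an actual implementation would likely need escaping or length prefixes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of the patch.
final class StatRowKey {
    // Variable-length components are separated by a single zero byte.
    private static final byte SEPARATOR = 0;

    // Joins table, region, and column names into one null-separated key.
    static byte[] encode(String table, String region, String column) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String[] parts = {table, region, column};
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                out.write(SEPARATOR);
            }
            byte[] bytes = parts[i].getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                // Assumption of this sketch: components never contain the separator.
                if (b == SEPARATOR) {
                    throw new IllegalArgumentException("component contains a zero byte");
                }
            }
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Splits a key back into its components at each zero byte.
    static String[] decode(byte[] key) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= key.length; i++) {
            if (i == key.length || key[i] == SEPARATOR) {
                parts.add(new String(key, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        return parts.toArray(new String[0]);
    }
}
```

With this layout, `StatRowKey.decode(StatRowKey.encode("primary", "some-var-len-region-name", "some-var-length-column-name"))` yields the three original strings, and the stat name (for example `max_region_key`) would live in the column qualifier rather than in the row key.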