HBASE-30180: Can still add data to read-only region after flipping read-only flag multiple times#8280
Conversation
…ad-only flag multiple times Change-Id: I8c5edf6ae4775d36dbbe2b260f77e2168a2a92da
3f2cc32 to
024d94c
Compare
There was a problem hiding this comment.
Pull request overview
Fixes HBASE-30180 where toggling the read-only flag multiple times could leave a region writable. The root cause is that HRegion.this.conf is a CompoundConfiguration distinct from the newConf passed to onConfigurationChange(). The previous code synced the read-only coprocessor list onto newConf and built a new RegionCoprocessorHost from newConf rather than from this.conf, missing the table-descriptor layer and leaving stale data. For HMaster/HRegionServer, this.conf and newConf are the same object, so the issue is region-specific.
Changes:
- Overload
maybeUpdateCoprocessorsto accept a separateconfToUpdate, used for syncing read-only coprocessors and passed into the reload lambda. - Pass
this.conf(theCompoundConfiguration) asconfToUpdateinHRegion.onConfigurationChangeso the newRegionCoprocessorHostis built from the correct config layer; remove the now-unneededupdateCoprocessorListInConfhelper. - Simplify the HMaster/HRegionServer call sites since their
this.conf == newConf.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| hbase-server/src/main/java/org/apache/hadoop/hbase/util/CoprocessorConfigurationUtil.java | Adds confToUpdate parameter to maybeUpdateCoprocessors, removes obsolete updateCoprocessorListInConf helper, gates syncReadOnlyConfigurations on read-only-mode change only. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java | Passes this.conf (CompoundConfiguration) as confToUpdate and uses it for rebuilding RegionCoprocessorHost. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java | Simplifies lambda since this.conf == newConf. |
| hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java | Same simplification as HRegionServer. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
taklwu
left a comment
There was a problem hiding this comment.
nit: Would you be able to come up with a unit test to prevent this error from happening?
| CoprocessorConfigurationUtil.syncReadOnlyConfigurations(newConf, coprocessorConfKey); | ||
| reloadTask.reload(newConf); | ||
| componentName, instance); | ||
| if (hasReadOnlyModeChanged) { |
There was a problem hiding this comment.
this reminded me about #7514 were facing the same issue, IMO you're doing the right change.
Change-Id: I11accf45b4e92184f289652bba9a32e7651b5af7
https://issues.apache.org/jira/browse/HBASE-30180
Summary
This pull request fixes an issue with the new Read-Replica feature where if you create a table on the active cluster and then flip the read-only flag multiple times, then sometimes it is still possible to add data to a table despite the cluster being in read-only mode. This issue was happening due to how the configuration for HRegion was being updated after running
update_all_configin the HBase shell.Read-only mode is determined by checking whether the ReadOnly coprocessors are loaded on the cluster. If they are present, then read-only mode is active, and vice-vera if they are not present. The
HRegion.onConfigurationChange()method was not properly updating HRegion's configuration when trying to add/remove the ReadOnly coprocessors. It was updating both HRegion'sthis.confand thenewConfprovided in theonConfigurationChangeMethod(). Furthermore, the incorrect configuration was being provided when updating HRegion'sRegionCoprocessorHost.HRegion's
this.confobject is a special CompoundConfiguration object, so this needs to have the ReadOnly coprocessors added to or removed from directly. After, this CompoundConfiguration should be used to update HRegion'sthis.coprocessorHostobject since this was originally initialized using a CompoundConfiguration.More Details
HMaster and HRegionServer Configuration
When a cluster's configuration is dynamically updated, the configuration is reloaded and all observers are notified via the
HBaseServerBase.updateConfiguration()method. For HMaster and HRegionServer, theirthis.confreferences the same object as thenewConfarg in their respectiveonConfigurationChange()methods. The following custom log messages showing hash codes support this:This means it does not matter whether you update
this.confor thenewConfreference in HMaster's or HRegionServer'sonConfigurationChange()method since they reference the same Configuration object.HRegion Configuration
HRegion's configuration is different from HMaster and HRegionServer. HRegion's
this.confis actually a CompoundConfiguration object, which is a shallow merge of multiple Configurations. It is initialized like so in an HRegion constructor:Based on this initialization and the log messages and hash codes shared above, we see HRegionServer's configuration is the base conf for HRegion's
this.conf. We can also see thenewConfarg inHRegion.onConfigurationChange()is not the same object reference as HRegion'sthis.conf. Since the CompoundConfiguration uses a shallow merge, updating HRegionServer's configuration also updates a layer of the CompoundConfiguration. The original code was incorrectly updatingnewConffor HRegion, so this leads to a race condition where updating HRegionServer also updates all regions automatically, and eventually a region would not be properly updated (such as the ReadOnly coprocessors not being removed when they should have been).In addition, HRegion's
this.coprocessorHostobject was being reinitialized usingnewConf's Configuration object which it should have been using an updated version ofthis.conf's CompoundConfiguration object.