@InterfaceAudience.Private public class SplitLogManager extends ZooKeeperListener
For every log file that has to be split, a task znode is created under /hbase/splitlog. SplitLogWorkers race to grab a task.
SplitLogManager monitors the task znodes that it creates using the timeoutMonitor thread. If a task's progress is slow then resubmit(String, Task, ResubmitDirective) will take away the task from the owner SplitLogWorker and the task will be up for grabs again. When the task is done then the task's znode is deleted by SplitLogManager.
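The monitoring behaviour described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `TimeoutMonitorSketch`, its `Task` fields, the `timeoutMs` and `maxResubmits` parameters, and the explicit clock argument are all hypothetical and are not part of the HBase API. A plain map stands in for the task znodes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of task monitoring; this is not HBase code.
class TimeoutMonitorSketch {
    enum State { UNASSIGNED, OWNED }

    static final class Task {
        State state = State.UNASSIGNED;
        String owner;        // worker currently holding the task, or null
        long lastUpdateMs;   // last time progress was recorded
        int resubmits;       // times the task was taken back from an owner
    }

    private final Map<String, Task> tasks = new HashMap<>();
    private final long timeoutMs;    // assumed progress timeout
    private final int maxResubmits;  // assumed cap on resubmissions

    TimeoutMonitorSketch(long timeoutMs, int maxResubmits) {
        this.timeoutMs = timeoutMs;
        this.maxResubmits = maxResubmits;
    }

    // The manager creates one task entry per log file.
    synchronized void submit(String path, long nowMs) {
        Task t = new Task();
        t.lastUpdateMs = nowMs;
        tasks.put(path, t);
    }

    // Workers race; only the first caller on an unassigned task wins.
    synchronized boolean grab(String path, String worker, long nowMs) {
        Task t = tasks.get(path);
        if (t == null || t.state != State.UNASSIGNED) {
            return false;
        }
        t.state = State.OWNED;
        t.owner = worker;
        t.lastUpdateMs = nowMs;
        return true;
    }

    // One monitor pass: take slow tasks away from their owners.
    synchronized int checkTimeouts(long nowMs) {
        int resubmitted = 0;
        for (Task t : tasks.values()) {
            boolean slow = nowMs - t.lastUpdateMs > timeoutMs;
            if (t.state == State.OWNED && slow && t.resubmits < maxResubmits) {
                t.state = State.UNASSIGNED;  // up for grabs again
                t.owner = null;
                t.resubmits++;
                t.lastUpdateMs = nowMs;
                resubmitted++;
            }
        }
        return resubmitted;
    }

    // When the task is done, the manager deletes its entry.
    synchronized boolean complete(String path) {
        return tasks.remove(path) != null;
    }

    synchronized boolean exists(String path) {
        return tasks.containsKey(path);
    }

    public static void main(String[] args) {
        TimeoutMonitorSketch monitor = new TimeoutMonitorSketch(100, 3);
        monitor.submit("/hbase/splitlog/wal-1", 0);
        System.out.println("worker-a grabs: " + monitor.grab("/hbase/splitlog/wal-1", "worker-a", 0));
        System.out.println("resubmitted: " + monitor.checkTimeouts(201));
        System.out.println("worker-b grabs: " + monitor.grab("/hbase/splitlog/wal-1", "worker-b", 210));
    }
}
```

Passing the clock in as an argument keeps the model deterministic; a real monitor would run periodically on its own thread.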
Clients call splitLogDistributed(Path) to split a region server's log files. The caller thread waits in this method until all the log files have been split.
All the zookeeper calls made by this class are asynchronous. This is mainly to help reduce response time seen by the callers.
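A rough way to picture a caller that blocks while the underlying work completes asynchronously is a shared batch counter guarded by a monitor. The sketch below is illustrative only: `BatchWaitSketch`, `Batch`, and `splitAll` are hypothetical names, an `ExecutorService` stands in for the asynchronous callbacks, and an empty file name is used as an artificial failure condition. Unchecked exceptions are used to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of "submit asynchronously, block until the batch is done".
class BatchWaitSketch {
    static final class Batch {
        int done;   // tasks that finished successfully
        int error;  // tasks that finished with an error
    }

    static int splitAll(List<String> logFiles, ExecutorService workers) {
        final Batch batch = new Batch();
        final int installed = logFiles.size();
        for (String file : logFiles) {
            // Submission returns immediately; the result arrives on another thread.
            workers.execute(() -> {
                boolean ok = !file.isEmpty();  // stand-in for real splitting work
                synchronized (batch) {
                    if (ok) {
                        batch.done++;
                    } else {
                        batch.error++;
                    }
                    batch.notifyAll();
                }
            });
        }
        synchronized (batch) {
            // The caller thread waits here until every task has reported back.
            while (batch.done + batch.error < installed) {
                try {
                    batch.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            if (batch.error > 0) {
                throw new UncheckedIOException(
                    new IOException(batch.error + " log file(s) failed to split"));
            }
            return batch.done;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            int split = splitAll(Arrays.asList("wal-1", "wal-2", "wal-3"), pool);
            System.out.println("split " + split + " log files");
        } finally {
            pool.shutdown();
        }
    }
}
```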
There is a race in this design between the SplitLogManager and the SplitLogWorker. SplitLogManager might re-queue a task that has in reality already been completed by a SplitLogWorker. We rely on the idempotency of the log splitting task for correctness.
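To see why idempotency makes a redundant re-run harmless, consider a toy split whose output is keyed by region and sequence id, so that running the same task twice overwrites identical entries instead of appending duplicates. This is a hypothetical illustration, not how HBase lays out its recovered edits; `IdempotentSplitSketch`, `Edit`, and `split` are invented names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical toy model of an idempotent split; this is not HBase code.
class IdempotentSplitSketch {
    static final class Edit {
        final String region;
        final long seqId;
        final String payload;

        Edit(String region, long seqId, String payload) {
            this.region = region;
            this.seqId = seqId;
            this.payload = payload;
        }
    }

    // Groups edits per region, keyed by sequence id. Re-running the same
    // log over the same output rewrites the same keys with the same values.
    static void split(List<Edit> log, Map<String, SortedMap<Long, String>> out) {
        for (Edit e : log) {
            out.computeIfAbsent(e.region, r -> new TreeMap<>()).put(e.seqId, e.payload);
        }
    }

    public static void main(String[] args) {
        List<Edit> log = Arrays.asList(
            new Edit("region-1", 1L, "put a"),
            new Edit("region-2", 2L, "put b"),
            new Edit("region-1", 3L, "put c"));
        Map<String, SortedMap<Long, String>> out = new HashMap<>();
        split(log, out);
        split(log, out);  // a re-queued task that runs a second time
        System.out.println(out);
    }
}
```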
It is also assumed that every log splitting task is unique and that once completed (either with success or with error) it will not be submitted again. If a task is resubmitted then there is a risk that an old "delete task" can delete the re-submission.
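The risk with reusing a task name can be reproduced in a few lines if deletes are modelled as queued operations that run some time after they are issued. The sketch below is a hypothetical model: `StaleDeleteSketch` and its methods are invented for illustration, and a set of strings stands in for the task znodes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a delayed delete racing with a re-submission.
class StaleDeleteSketch {
    private final Set<String> znodes = new HashSet<>();
    private final Deque<String> pendingDeletes = new ArrayDeque<>();

    void submit(String path) {
        znodes.add(path);
    }

    // Completion only queues the delete; it is applied later.
    void completeAsync(String path) {
        pendingDeletes.add(path);
    }

    // The queued deletes finally run, keyed by path alone.
    void drainDeletes() {
        while (!pendingDeletes.isEmpty()) {
            znodes.remove(pendingDeletes.poll());
        }
    }

    boolean exists(String path) {
        return znodes.contains(path);
    }

    public static void main(String[] args) {
        StaleDeleteSketch zk = new StaleDeleteSketch();
        zk.submit("wal-1");
        zk.completeAsync("wal-1");  // delete is queued but not yet applied
        zk.submit("wal-1");         // same task name submitted again
        zk.drainDeletes();          // the old delete removes the new task
        System.out.println("re-submission survives: " + zk.exists("wal-1"));
    }
}
```

With unique task names the queued delete can only ever match the task it was issued for, which is why the uniqueness assumption matters.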
Modifier and Type | Class and Description |
---|---|
static interface | SplitLogManager.TaskFinisher: SplitLogManager can use objects implementing this interface to finish off a partially done task by SplitLogWorker. |
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_RESUBMIT |
static int | DEFAULT_TIMEOUT |
static int | DEFAULT_UNASSIGNED_TIMEOUT |
static int | DEFAULT_ZK_RETRIES |
boolean | ignoreZKDeleteForTesting |
protected ReentrantLock | recoveringRegionLock: In distributedLogReplay mode, we need to touch both splitlog and recovering-regions znodes in one operation. |
Fields inherited from class ZooKeeperListener: watcher
Constructor and Description |
---|
SplitLogManager(ZooKeeperWatcher zkw, org.apache.hadoop.conf.Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery): Wrapper around SplitLogManager(ZooKeeperWatcher zkw, Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery, TaskFinisher tf) that provides a task finisher for copying recovered edits to their final destination. |
SplitLogManager(ZooKeeperWatcher zkw, org.apache.hadoop.conf.Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery, SplitLogManager.TaskFinisher tf): It's OK to construct this object even when region-servers are not online. |
Modifier and Type | Method and Description |
---|---|
static void | deleteRecoveringRegionZNodes(ZooKeeperWatcher watcher, List<String> regions) |
ZooKeeperProtos.SplitLogTask.RecoveryMode | getRecoveryMode() |
static ZooKeeperProtos.RegionStoreSequenceIds | getRegionFlushedSequenceId(ZooKeeperWatcher zkw, String serverName, String encodedRegionName): This function is used in distributedLogReplay to fetch the last flushed sequence id from ZK. |
static boolean | isRegionMarkedRecoveringInZK(ZooKeeperWatcher zkw, String regionEncodedName): check if /hbase/recovering-regions/ |
void | nodeDataChanged(String path): Called when an existing node has changed data. |
static long | parseLastFlushedSequenceIdFrom(byte[] bytes) |
void | setRecoveryMode(boolean isForInitialization): Sets the recovery mode from outstanding split log tasks left over from before, or from the current configuration setting. |
long | splitLogDistributed(List<org.apache.hadoop.fs.Path> logDirs): The caller will block until all the log files of the given region server have been processed (successfully split or an error is encountered) by an available worker region server. |
long | splitLogDistributed(org.apache.hadoop.fs.Path logDir) |
long | splitLogDistributed(Set<ServerName> serverNames, List<org.apache.hadoop.fs.Path> logDirs, org.apache.hadoop.fs.PathFilter filter): The caller will block until all the hbase:meta log files of the given region server have been processed (successfully split or an error is encountered) by an available worker region server. |
void | stop() |
Methods inherited from class ZooKeeperListener: getWatcher, nodeChildrenChanged, nodeCreated, nodeDeleted
public static final int DEFAULT_TIMEOUT
public static final int DEFAULT_ZK_RETRIES
public static final int DEFAULT_MAX_RESUBMIT
public static final int DEFAULT_UNASSIGNED_TIMEOUT
public boolean ignoreZKDeleteForTesting
protected final ReentrantLock recoveringRegionLock
public SplitLogManager(ZooKeeperWatcher zkw, org.apache.hadoop.conf.Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery) throws InterruptedIOException, org.apache.zookeeper.KeeperException
Wrapper around SplitLogManager(ZooKeeperWatcher zkw, Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery, TaskFinisher tf) that provides a task finisher for copying recovered edits to their final destination. The task finisher has to be robust because it can be arbitrarily restarted or called multiple times.
Parameters:
zkw - the ZK watcher
conf - the HBase configuration
stopper - the stoppable in case anything is wrong
master - the master services
serverName - the master server name
masterRecovery - an indication if the master is in recovery
Throws:
org.apache.zookeeper.KeeperException
InterruptedIOException
public SplitLogManager(ZooKeeperWatcher zkw, org.apache.hadoop.conf.Configuration conf, Stoppable stopper, MasterServices master, ServerName serverName, boolean masterRecovery, SplitLogManager.TaskFinisher tf) throws InterruptedIOException, org.apache.zookeeper.KeeperException
Parameters:
zkw - the ZK watcher
conf - the HBase configuration
stopper - the stoppable in case anything is wrong
master - the master services
serverName - the master server name
masterRecovery - an indication if the master is in recovery
tf - task finisher
Throws:
org.apache.zookeeper.KeeperException
InterruptedIOException
public long splitLogDistributed(org.apache.hadoop.fs.Path logDir) throws IOException
Parameters:
logDir - one region server hlog dir path in .logs
Throws:
IOException - if there was an error while splitting any log file
public long splitLogDistributed(List<org.apache.hadoop.fs.Path> logDirs) throws IOException
Parameters:
logDirs - List of log dirs to split
Throws:
IOException - If there was an error while splitting any log file
public long splitLogDistributed(Set<ServerName> serverNames, List<org.apache.hadoop.fs.Path> logDirs, org.apache.hadoop.fs.PathFilter filter) throws IOException
Parameters:
logDirs - List of log dirs to split
filter - the Path filter used to select specific files to consider
Throws:
IOException - If there was an error while splitting any log file
public static void deleteRecoveringRegionZNodes(ZooKeeperWatcher watcher, List<String> regions)
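public void nodeDataChanged(String path)
Description copied from class: ZooKeeperListener
Called when an existing node has changed data.
Overrides:
nodeDataChanged in class ZooKeeperListener
Parameters:
path - full path of the updated node
public void stop()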
public static long parseLastFlushedSequenceIdFrom(byte[] bytes)
Parameters:
bytes - Content of a failed region server or recovering region znode.
public static boolean isRegionMarkedRecoveringInZK(ZooKeeperWatcher zkw, String regionEncodedName) throws org.apache.zookeeper.KeeperException
Parameters:
zkw -
regionEncodedName - region encoded name
Throws:
org.apache.zookeeper.KeeperException
public static ZooKeeperProtos.RegionStoreSequenceIds getRegionFlushedSequenceId(ZooKeeperWatcher zkw, String serverName, String encodedRegionName) throws IOException
Parameters:
zkw -
serverName -
encodedRegionName -
Throws:
IOException
public void setRecoveryMode(boolean isForInitialization) throws org.apache.zookeeper.KeeperException
Parameters:
isForInitialization -
Throws:
org.apache.zookeeper.KeeperException
InterruptedIOException
public ZooKeeperProtos.SplitLogTask.RecoveryMode getRecoveryMode()
Copyright © 2014 The Apache Software Foundation. All rights reserved.