Pegasus’s Meta Server uses Zookeeper to store metadata and leader election, so the instability of the Zookeeper service can cause instability in Pegasus. If necessary, Pegasus metadata needs to be migrated to other more stable or idle Zookeeper.
There are two ways to migrate Zookeeper metadata: through metadata recovery, or through the zkcopy
tool.
Migration through metadata recovery
Pegasus provides Metadata Recovery function, it can also be used for Zookeeper migration. The basic idea is to configure a new Zookeeper list and perform metadata recovery through the recover
command, then the metadata is migrated to the new Zookeeper.
-
Backup table list
Use the
ls
command of the shell tools:>>> ls -o apps.list
-
Backup node list
Use the
nodes
command of the shell tools:>>> nodes -d -o nodes.list
Generate the
recover_node_list
file required for metadata recovery:grep ALIVE nodes.list | awk '{print $1}' > recover_node_list
-
Stop all Meta Servers
Stop all Meta Servers, and wait for a period of time (default to 30 seconds, depending on configuration
[replication]config_sync_interval_ms
) to ensure that all Replica Servers enter theINACTIVE
state due to the beacon timeout. -
Modifying Meta Server configuration file
The modified content is as follows:
[meta_server] recover_from_replica_server = true [zookeeper] hosts_list = {new Zookeeper host list}
They mean:
- Set
recover_from_replica_server
totrue
and enable to recover metadata from Replica Servers - Update Zookeeper configuration to the new service addresses
- Set
-
Start a Meta Server
Start a Meta Server in the cluster, it will become the leader Meta Server of the cluster.
-
Use the
recover
command of the shell tools>>> recover -f recover_node_list
-
Modify the configuration file and restart the Meta Server
After successful recovery, it is necessary to modify the configuration file of the Meta Server and reset to non-recovery state:
[meta_server] recover_from_replica_server = false
-
Restart all Meta Servers, then the cluster enters the normal state.
Sample script
Refer to the main process in the sample script pegasus_migrate_zookeeper.sh for Zookeeper metadata migration.
Migration through the zkcopy
tool
The basic idea is to use zkcopy tool to copy the Pegasus metadata from the original Zookeeper to the target Zookeeper, modify the Meta Server configuration file, and restart the cluster.
-
Stop all follower Meta Servers
In order to prevent other follower Meta Servers from requiring the lock and becoming the new leader when restarting the leader Meta Server, causing metadata inconsistency, it is necessary to keep only the leader Meta Server in live state and stop all other follower Meta Servers throughout the entire migration process.
-
Modify the leader Meta Server status to
blind
Set the leader Meta Server’s meta_level to
blind
, to prohibit any update operations on Zookeeper data and prevent metadata inconsistency during the migration process:>>> set_meta_level blind
For an introduction to Meta Server’s meta_level, please refer to Rebalance.
-
Use the
zkcopy
tool to copy Zookeeper metadataObtain the path
zookeeper_root
where Pegasus metadata is stored on the Zookeeper through thecluster_info
command of the shell tools, and then use thezkcopy
tool to copy all the data from this path to the new Zookeeper, being careful to recursively copy. -
Modify configuration file
Modify the configuration file of Meta Servers and change the
hosts_lists
value to the new service addresses:[meta_server] hosts_list = {new Zookeeper host list}
-
Restart the leader Meta Server
Restart the leader Meta Server and use shell tools to check that the cluster has entered the normal state.
-
Restart all follower Meta Servers
Start all follower Meta Servers and check the cluster enters the normal state.
-
Clean up data on old Zookeepers
Use the
rmr
command of the zookeepercli tool to clean up data on old Zookeepers.