Adding a new database to a running MySQL replication setup

Originally posted 5th April, 2011

This is something I did last year, but at the time when I looked, I couldn’t find anybody who had done it before. Hopefully it’s helpful to others out there facing a similar scenario.

The problem I had was this: We have a master server, with maybe 10 databases (some very small), which we want to replicate to a slave.

The easiest way to do this, is simply when commissioning the slave, dump all required databases from the master in one big transaction, set slave’s binlog position, and roll from there. That’s all good. However if any of the following become true, this is no longer feasible:

  • Databases are too large to easily dump/move all in one hit
  • Slave databases get significantly out of sync (yes could use maatkit), or destroyed for some reason
  • Probably most likely — after commissioning the slave, a new database is required to be added to running replication

There are probably other reasons too. Anyway, what I needed was a way to add a database to existing, running replication, without massive interruptions.

My solution was basically to use functionality MySQL already provides, tied together in a script, which now enables me to painlessly and accurately begin replication of a new database from master to slave.

The high-level steps are:

1. Dump from Master

with options — flush-logs — opt — routines — single-transaction — master-data=2

flush-logs to flush all we have to current binlog and start a new one

master-data=2 means we include the binlog file & position information in the dump, but commented out. This means we can restore this dump to the slave with running replication, without inadvertently changing (losing) the slave’s position (yes, I did this a couple of times while developing this, not fun)

2. Restore the backup to the slave

Pretty self-explanatory.

3. Stop slave, flush & retrieve master’s logs

We will now have complete binlogs, including the period between point at which dump was taken, and slave’s current point in replication.

4. Roll the restored database forward to current point in replication

From grepping the backup for the commented-out “change master to master_log_file …” line, we can get the logfile & position, from where binlogs should be executed on the restored database.

From the “show slave status” command, we can get slave’s current executed position in replication, which is the point to which these binlogs should be executed.

After doing this, our restored database will be as it was on the master, at the point replication was stopped.

5. Add new database to replication in my.cnf & restart slave

And smile at your achievement!

So now the useful bit, the code.

In order to use it you’ll need to actually understand it all =) change usernames/passwords, populate the list of DBs you want to add to replication etc

It’s not super-beautiful, but it works for me.

#!/bin/bash
#set -x
# Steps:
# - stop slave
# - flush master logs (starts new log)
# - copy all master logs to (binlogdir) (excluding current/newest one)
# - run this script, it will:
# restore backups
# check backups for when in the master log file/position they were taken
# run binlogs for that db, from the binlogs copied before, from the backup's startpos up to slave's current position
# - start slave
# FOR THE FIRST TIME WE INITIALIZE REPLICATION,
# change master to MASTER_LOG_FILE='';
# script will roll forward the DB to latest from binlogs
#set -xmysqlcmd="mysql -uxxxxx -pxxxxx"
mysqlcmd_master="mysql -uxxxxx -pxxxxx --host=master.yourdomain.com"
gunzipcmd="gunzip"
backupdir="/backup"
# gotta make sure these match ....
binlogdir="/backup/master/binlogs"
#binlogdir=$binlogdir/$binlogsubdir
binlogsubdir="binlogs"
binlogdir_master="root@master.yourdomain.com:/var/lib/mysql/${binlogsubdir}"
rsynccmd="rsync --compress --progress --rsh=ssh --recursive --times"scpcmd="scp root@master.yourdomain.com:"dbs="
DATABASE_NAME_1
DATABASE_NAME_2
DATABASE_NAME_3
"
if [ -n "$1" ] ; then
dbs=$1
fi
# check we have all backups
for db in $dbs ; do
backupfile="${backupdir}/${db}.sql.gz"
#${scpcmd}${backupfile} ${backupfile}
if [ ! -f $backupfile ] ; then
echo "Backup file $backupfile does not exist, trying to get from master ..."
${scpcmd}${backupfile} ${backupfile}
if [ ! -f $backupfile ] ; then
echo "Couldn't get backup file from master, will skip"
#exit
else
echo "... got backup '$backupfile' from master"
fi
fi
done
# get the (stopped) slave's current position,
# and check we've got the binlog for it
slave_status=`$mysqlcmd -e 'show slave status\G'`
if [ -n "`echo "$slave_status" | grep "Running: Yes"`" ] ; then
echo "Slave still running, stopping ..."
$mysqlcmd -e 'stop slave'
echo "... stopped"
fi
# tell master to start a new log, so we can run to the most-recent
echo
echo "Telling master to flush logs ...."
$mysqlcmd_master -e 'flush logs'
echo "... flushed"
echo
# make sure we have master's logs ...
echo
echo "Getting master's logs ...."
$rsynccmd $binlogdir_master/\* $binlogdir
# remove most-recent (i.e. in-progress / newly-created-when-flushed)
#most_recent_binlog=`ls -1 ${binlogdir}/mysql-bin.* | grep -v index | sort -n | tail -n1`
#rm "$most_recent_binlog"
echo "... gottem"
echo
# This is where we want to replay the binlogs to,
# so our newly restored & rolled-forward db is at the same position
# as the entire slave server
# FOR THE FIRST TIME WE INITIALIZE REPLICATION,
# change master to MASTER_LOG_FILE='';
# we'll check for this, and
# If we've reset the slave's position,
# replay to the latest complete binlog we have
slave_master_log_file=`echo "$slave_status" | grep -m1 Relay_Master_Log_File | sed "s/.*Relay_Master_Log_File: \(.*\).*/\1/g"`
if [ -z "$slave_master_log_file" ] ; then
echo "Slave master log file was empty, rolling forward to most recent position we have in binlogs"
slave_master_log_file=`ls -t ${binlogdir}/mysql-bin.* -1 | grep -v index | head -n1`
slave_position_executed=`mysqlbinlog "$slave_master_log_file" | grep "^# at " | tail -n1 | sed 's/^# at \([0-9]\+\).*/\1/g'`
if [ -z "$slave_position_executed" ] ; then
echo "Could not auto determine most-recent position in binlogs, bailing"
exit
fi
slave_master_log_file=`basename "$slave_master_log_file"`
$mysqlcmd -e "change master to MASTER_LOG_FILE='$slave_master_log_file', MASTER_LOG_POS=$slave_position_executed"
echo "Changed master to log file '$slave_master_log_file', pos '$slave_position_executed'"
else
slave_position_executed=`echo "$slave_status" | grep -m1 Exec_Master_Log_Pos | sed "s/.*Exec_Master_Log_Pos: \([0-9]\+\).*/\1/g"`
fi
# get the binlog position at which the backup was taken
# after restoring each DB, we will selectively roll each one forward
# from the time its backup was made, to this most-recent-logged point
for db in $dbs ; do
echo
echo ###########################################################
echo $db
echo ###########################################################
backupfile="${backupdir}/${db}.sql.gz"
if [ ! -f $backupfile ] ; then
echo "No backup file, skipping"
break
fi
# check we have the db on slave
slave_db_check=`$mysqlcmd -e "show databases like '$db'"`
if [ -z "$slave_db_check" ] ; then
echo "Slave DB '$db' not found, creating"
slave_create_db=`$mysqlcmd -e "create database $db"`
if [ -n "$slave_create_db" ] ; then
echo "Couldn't create target db $db, bailing"
exit
fi
fi
backupfile="${backupdir}/${db}.sql.gz"
echo "Restoring DB $db from $backupfile"
slavecmd=`zgrep -m1 -i "^-- change master to master_log_file" $backupfile`
if [ -z "$slavecmd" ] ; then
echo "Could not find master/replication position in backup file. Perhaps the dump was done without --master-data=2. BAILING."
exit
fi
echo "Got slavecmd $slavecmd"
logfile=`echo ${slavecmd} | sed "s/.*MASTER_LOG_FILE='\(.*\)'.*/\1/g"`
echo " Got logfile $logfile"
startpos=`echo ${slavecmd} | sed "s/.*MASTER_LOG_POS=\(.*\);/\1/g"`
#echo " Got startpos $startpos"
# get all logs available and check they're there ...
# UP TO the slave's current position ...
lognum=`echo ${logfile} | sed "s/^[^0-9]\+\.\([0-9]\+\).*/\1/g"`
prefix=`echo $logfile | sed "s/^\([^0-9]\+\.\)[0-9]\+.*/\1/g"`
slave_master_log_file_num=`echo $slave_master_log_file | sed 's/[^0-9]//g'`
# If we don't have the first logfile, bail
num=0000000${i}
num=${lognum:(-6)}
fn=${prefix}${lognum}
if [ ! -e ${binlogdir}/${fn} ] ; then
echo "Hey we don't even have the first binlog required for this backup, bailing"
exit
break
fi
for i in `seq $lognum 99999` ; do
# if we're beyond the slave's current binlog position, exit
if [ $i -gt $slave_master_log_file_num ] ; then
break
fi
num=0000000${i}
num=${num:(-6)}
fn=${prefix}${num}
if [ -e ${binlogdir}/${fn} ] ; then
#echo "found binlog ${binlogdir}/${fn}"
binlogs="${binlogs} ${binlogdir}/${fn}"
else
#echo "did not find ${binlogdir}/${fn}, stop looking"
break
fi
done
echo
echo "Found binlogs: ${binlogs}"
echo
# restore db
echo "Restoring backup from $backupfile to $db ..."
$gunzipcmd < $backupfile | $mysqlcmd $db
echo "... restored"
# roll forwards till at same position as rest of slave
echo "Rolling $db forward from binlogs, from binlog $lognum pos $startpos to binlog $slave_master_log_file_num pos $slave_position_executed ..."
mysqlbinlog --database=$db --start-position=$startpos --stop-position=$slave_position_executed $binlogs | $mysqlcmd $db
echo "... done"
# Add this DB to replication
if [ -z "`grep "replicate-do-db=$db" /etc/my.cnf`" ] ; then
sed "s/^# replicate-new-dbs-here/# replicate-new-dbs-here\nreplicate-do-db=$db/g" -i.bak /etc/my.cnf
echo "Added db $db to mysql replication, will come into effect on restart"
fi
done
# restart mysql, with our newly-added replication
echo "Restart mysql:"
/etc/init.d/mysqld restart
# start slave
echo "Start slave"
$mysqlcmd -e "start slave"
echo fuck yeah.

And that’s it.

Feedback, advice, success/disaster stories all welcome.

Good luck and happy replicating =)

Originally published at nickpdoyle.blogspot.com.

Computer Scientist. Agile Enthusiast. Past lives include Perl Hacker, Web Developer, DBA, Tech Lead, Motorcycle Instructor, Forensic Data Analyst, & Cloud Guy