Some time ago, we decided to run KeyChest on a database cluster. Once we actually did it, I started doubting wisdom of this decision and it took a few months to get beyond the stability of a single-node MySQL database.
I’ve written a blog post that the encryption should be just as simple as databases or networking. Nothing a full-stack developer could ever struggle with. Well, I would say that the encryption should be much simpler.
last summer I started experimenting with MYSQL clusters. There is a really nice technology based on a special NDB database format (the official web site is at https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster.html). It has some really nice properties, like really nice recovery, failover, etc.
From practical point of view though – “NDB is in-memory database”. It is possible to keep only indexes in RAM and keep the database data on disk but you get a significant performance hit. We managed to run the cluster with a smaller test database but when we did a cost calculation for cloud service with the amount of RAM needed, we deemed that infeasible.
There is also one more practical aspect – you need to convert the database from InnoDB to NDB, which will take a lot of time, especially in public cloud, where you have limits on disk IOs.
Fortunately, MySQL introduced version 8 of its database with new tools to manage InnoDB clusters. We have started with version 8.0.4 (I think) and now run 8.0.15. Subjectively I can see improvements in the stability and I’m quite happy with the current version
Before You Decide For Cluster
We had 2 reasons for going for a cluster:
fear of losing a database;
ability to recover from human mistakes like “drop database xxx” (oops, I thought I was logged in to a test server);
Frankly, using just a cluster without good backup doesn’t solve point 2 and only partially point 1. It works for things like a short-term lost of a virtual machine (how often that happens) or screwing up one server as a result of administration mistakes.
Soo – if you decide for a cluster, DO implement it with backups as well. And there are some interesting aspects of how to do (incremental) backups on a database cluster.
And one more point – establish monitoring on day 1! It really is a must if you want to run a reliable production database cluster.
At the end, we settled on Percona’s Xtrabackup. I find it really good – it can do backup without taking the database down. Like really, you run a production system and can create a backup (there are limits to it but it does work).
Now what this tool allows you is to convert a single-node MySQL database to a cluster without downtimes. The installation is simple – for Centos7 we use, we simply downloaded an RPM and ran yum install <file> (ok, we missed compat libraries for MySQL8 but once added, the installation was quick and easy).
Days You Wish Never Started
In the first months of running the cluster, we had 2 main types of problems:
converting single-node database into a cluster
recovering from cluster crashes – this happened from 2 main reasons so far: 1. network loss between cluster nodes 2. running out of disk space
We had more problems with the database during first 30 days of using a cluster than we had over 18 months of running a single-node database!
Recovering from a cluster-node crash can be simple or really painful. Overall, I think it’s getting better with the MySQL version number but you never know. Currently, we are able to solve most problems with:
mysqlsh – really good tool for cluster management – the main issue I have is that either everything works swimmingly with a handful of commands it offers or you’re on your own and have to do stuff with mysql commands.
xtrabackup – the backup tool to restore an older copy of the database.
Converting Single Node to Cluster
There are some good manuals how to create a cluster with an empty database – from scratch. They tend to assume you’re a slightly slower with your brainwaves so you end up with a couple of instructions after 15 minutes of video tutorial (ok, I’m stretching it a wee bit here).
If you already have a database, here is what you need to do:
upgrade to MySQL8 – it has to be done in steps, version by version – we started at 5.5 -> 5.6 -> 5.7 -> 8.0. Only the 5.6 -> 5.7 takes some time as it was rebuilding database formats with data storage changes.
Note: Mysql8 changed how passwords are hashed, it broke KeyChest as it uses Laravel but you can preserve the old hashing by adding “default_authentication_plugin=mysql_native_password” to /etc/my.cnf
When you install MySQL8, install also the cluster tools for MySQL8: mysqlsh, mysqlrouter.
create a user that can login remotely – the best approach is to create a subnet for your cluster, e.g. 10.42.1.0/24 and a new user ‘root’@’10.42.1.%’ or something like that, it will also need privileges that I had to add: “grant system_variables_admin,PERSIST_RO_VARIABLES_ADMIN on *.*to <your user>;
Optional but highly advisable – create a set of domain names for all the nodes: KeyChest has the first cluster “A”, so we have a1, a2, a3.keychest.net – these we have in “/etc/hosts” on all DB nodes.
Review/prepare /etc/my.cnf – my suggestion is to look at: relay_log_space_limit, binlog_expire_logs_seconds, relay-log, max_binlog_size, binlog_format
Note: making /etc/my.cnf writable for mysql allows MySQL tools automatically apply changes!
OK, now you should be running a single-node MySQL8 database and you can do:
mysqlsh \c [email protected] dba.configureLocalInstance() dba.createCluster('mycluster') a=dba.getCluster() a.status()
These commands will have created your first MySQL cluster with one node.
Adding New Nodes To Cluster
Now, here is when it became and remained a real pain for weeks on end. The main problem was that you need to take the database down to create a copy of the database. This is where xtrabackup comes in very handy.
You can keep your production system running and simply create a backup.
xtrabackup -uroot -p --backup --target-dir=/var/mysql/backup/
xtrabackup --prepare --target-dir=/var/mysql/backup/
The backup seems to take somewhere in the region of 2 mins / 10GB of database. The last step prepares the backup for restore. Now you want to copy the database to the new node:
make sure that the target folder is empty – first stop mysqld on the target server!
rsync -avrP /var/mysql/backup/ [email protected]:/var/mysql/db/
adjust ownership: chown -R mysql:mysql /var/mysql/db
and restart mysql server – it should work
As you can see, we have the database files in the “/var/mysql/db” folder. And we need to do a couple of things with mysqlsh on the target server:
\c [email protected]
dba.configureLocalInstance() # this is likely to show something is required, so confirm
\c [email protected] # it takes a few seconds as it restats mysqld
\sql # this switches to SQL mode and you can run SQL commands
SET GLOBAL read_only = OFF;
At this stage, I like to open log file on the target server as MYSQL reporting is pretty weak: “tail -F /var/log/mysqld.log”.
Get / connect back to the source server with mysqlsh (btw, you can use one mysqlsh to connect to any of the cluster nodes – just in case you don’t like too many shell windows).
If lucky, you will see zero errors in the mysqld.log file on the target and a new node in the cluster in the “RECOVERING” state – this should turn into “ONLINE” within a minute or so (it depends how long it took you between making a backup and connecting the target node to the cluster.
Putting Cluster Into Work
OK, you have a cluster up and running – hopefully 🙂 Let’s put it to work now.
bootstrap “mysqlrouter” on each of the nodes:mysqlrouter –bootstrap <db_cluster_admin>@local.your.cluster –user mysqlrouter –directory /etc/mysqlrouter/ –conf-use-sockets –force
systemctl enable mysqlrouter
systemctl status mysqlrouter
The beauty of this is that you can now connect to the local host on port 6446 and mysqlrouter will automatically find master that has RW flag. So if the master goes down, mysqlrouter will automatically redirect your applications to the new master.
The rest is operational – tighten-up firewall rules, regular backups, create and test scripts for recovery – it’s relatively painful to practice it on your new cluster without impacting your application(s), and setup a good monitoring.
Make sure you don’t run out of disk space! 🙂