Project

General

Profile

Actions

Feature #5951

open

Create a MariaDB Galera multi-master Cluster using Docker

Added by Neil Voss over 6 years ago. Updated over 5 years ago.

Status:
Assigned
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
08/16/2018
Due date:
% Done:

0%

Estimated time:

Description

MariaDB Galera Cluster is a synchronous multi-master cluster for MariaDB. It is available on Linux only, and only supports the XtraDB/InnoDB storage engines (although there is experimental support for MyISAM - see the wsrep_replicate_myisam system variable).

https://mariadb.com/kb/en/library/what-is-mariadb-galera-cluster/


Files


Related issues 1 (1 open0 closed)

Related to Leginon - Task #5223: migrate sinedon to be compatible with InnoDB sql engineAssignedAnchi Cheng09/20/2017

Actions
Actions #1

Updated by Neil Voss over 6 years ago

  • Related to Task #5223: migrate sinedon to be compatible with InnoDB sql engine added
Actions #3

Updated by Neil Voss over 6 years ago

For the Galera setup, it is recommended to use at least version 10 of MariaDB, which is not native to CentOS 7 (or 6), so it is not the best choice if we stick to CentOS, unless we wait for CentOS 8 to come out, anticipated in Summer 2019. I probably going to test with Debian 9 Stretch for the time being. As we will probably not implement this by the next year anyway.

Starting with MariaDB 10.1, the wsrep API for Galera Cluster is included by default. This is available as a separate download for MariaDB 10.0 and MariaDB 5.5.

Pretty much every guide to setpu Galera on CentOS 7, requires deleting the native MariaDB and installing a different repo. Always concerning with broken updates, etc.

Actions #4

Updated by Neil Voss over 6 years ago

Some interesting notes:

Gaps in auto-increment values:

Do not rely on auto-increment values to be sequential. Galera uses a mechanism based on autoincrement increment to produce unique non-conflicting sequences, so on every single node the sequence will have gaps. See http://codership.blogspot.com/2009/02/managing-auto-increments-with-multi.html

Load balancing: http://galeracluster.com/documentation-webpages/loadbalancing.html

Commit delays:

Synchronous replication works fine over the WAN network. There will be a delay, which is proportional to the network round trip time (RTT), but it only affects the commit operation.

Not sure what this means: https://mariadb.com/kb/en/library/tips-on-converting-to-galera/

There is one latency hit per COMMIT or autocommit. So, combining statements will decrease those hits. On the other hand, it is unwise (for other reasons) to make huge transactions, such as inserting/modifying millions of rows in a single transaction.

Hmm, this is interesting. If I make the same query twice, does it slow down?

Query cache cannot be used in a Galera context.

Query cache is available on newer versions: https://mariadb.com/kb/en/library/query-cache/

The query cache also needs to be disabled for MariaDB Galera cluster versions prior to "5.5.40-galera", "10.0.14-galera" and "10.1.2".

Prior to MariaDB Galera Cluster versions 5.5.40-galera and 10.0.14-galera, the query cache needed to be disabled.

Adding Another Node to a Cluster: https://mariadb.com/kb/en/library/getting-started-with-mariadb-galera-cluster/

Once you have a cluster running and you want to add/reconnect another node to it, you must supply an address of one of the cluster members in the cluster address URL. The new node only needs to connect to one of the existing members. It will automatically retrieve the cluster map and reconnect to the rest of the nodes, of course it's better to list all nodes of the cluster so that any node can join a cluster connecting to any other node, even if one or more are down.

Once all members agree on the membership, state exchange will be initiated during which the new node will be informed of the cluster state. If its state is different from that of the cluster (which is normally the case) it will request a snapshot of the state from the cluster1) and install it before becoming ready for use.

Performance: https://mariadb.com/kb/en/library/mariadb-galera-cluster-known-limitations/

by design performance of the cluster cannot be higher than performance of the slowest node; however, even if you have only one node, its performance can be considerably lower comparing to running the same server in a standalone mode (without wsrep provider). It is particularly true for big enough transactions.

Actions #5

Updated by Neil Voss over 6 years ago

Run a test case, modified from: http://galeracluster.com/2015/05/getting-started-galera-with-docker-part-1/

docker pull erkules/galera

docker run --detach=true --name node1 -h node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://

docker run --detach=true --name node2 -h node2 --link node1:node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1

docker run --detach=true --name node3 -h node3 --link node1:node1 erkules/galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1

docker exec -ti node1 mysql -e "SHOW STATUS LIKE 'wsrep_cluster_%'" 

I managed to log into node1:

docker exec -t -i node1 bash

apt update
apt install wget

wget -c \
'http://emg.nysbc.org/projects/appion/repository/revisions/trunk/raw/appion/docker/centos7/docker-innodb.sql'

mysql -u root < docker-innodb.sql

exit;

log into node2 and check databases:

docker exec -ti node2 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'

docker exec -ti node3 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'
Actions #6

Updated by Neil Voss over 6 years ago

Run a test case, modified from: http://galeracluster.com/2015/05/getting-started-galera-with-docker-part-1/

docker build -t galera -f galera.Dockerfile .

docker network create my-net

docker run --detach=true --name node1 -h node1 --network my-net galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://

for i in {2..5}; do \
docker run --detach=true --name node${i} -h node${i} --network my-net galera \
   --wsrep-cluster-name=local-test --wsrep-cluster-address=gcomm://node1 ; \
sleep 1; done

#wait a few seconds for boot up
docker exec -ti node1 mysql -e "SHOW STATUS LIKE 'wsrep_cluster_%'" 
+--------------------------+--------------------------------------+
| Variable_name            | Value                                |
+--------------------------+--------------------------------------+
| wsrep_cluster_size       | 5                                    |
+--------------------------+--------------------------------------+

I managed to log into node1:

docker exec -t -i node1 bash

time mysql -u root < /docker-innodb.sql

exit;

log into node2 and check databases:

docker exec -ti node2 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'

docker exec -ti node3 mysql -u usr_object --password=Phys-554 -e 'SHOW DATABASES;'
Actions #7

Updated by Neil Voss over 6 years ago

database creation/insertion time:

  • 3 nodes: 0m4.269s
  • 5 nodes: 0m6.739s
  • 7 nodes: 0m10.384s
  • 9 nodes: 0m28.149s

see full timing in attached image. Database schema creation takes the most time.

Actions #9

Updated by Neil Voss over 6 years ago

I am confident, I could get this working. We do need to implement the InnoDB style of table and make sure that FOREIGN KEY for REF|* columns are setup correctly.

Actions #10

Updated by Neil Voss over 5 years ago

Note: This distributed database, requires at least MariaDB 5.5, which is available on CentOS 7. You cannot set this up easily on CentOS 6.

Actions

Also available in: Atom PDF