KATO Tomoyuki d1fe08b4a7 [ha-guide] Reorganize basic services content

To sync with installation guides.

Change-Id: Ia4df2bdb1f058bb3d8bcf035160463134d115384
Implements: blueprint improve-ha-guide

2016-05-24 07:17:39 +09:00

7.7 KiB


Management

When you finish the installation and configuration process on each cluster node in your OpenStack database, you can initialize Galera Cluster.

Before you attempt this, verify that you have the following ready:

Database hosts with Galera Cluster installed. You need a minimum of three hosts;
No firewalls between the hosts;
SELinux and AppArmor set to permit access to mysqld;
The correct path to libgalera_smm.so given to the wsrep_provider parameter.
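
The last prerequisite can be checked mechanically. The following is a minimal sketch, not part of the original guide: the function names are hypothetical, and the path of the configuration file is passed in by the caller because it differs between distributions.

```shell
#!/bin/sh
# Print the value of wsrep_provider from a my.cnf-style file.
# If the option appears more than once, the last occurrence wins.
wsrep_provider_path() {
    awk -F= '
        /^[ \t]*wsrep[_-]provider[ \t]*=/ {
            value = $2
            gsub(/^[ \t"]+|[ \t"]+$/, "", value)
        }
        END { if (value != "") print value }
    ' "$1"
}

# Succeed only if the configured provider library exists on disk.
check_wsrep_provider() {
    provider=$(wsrep_provider_path "$1")
    if [ -z "$provider" ]; then
        echo "wsrep_provider is not set in $1" >&2
        return 1
    fi
    if [ ! -f "$provider" ]; then
        echo "wsrep_provider points to a missing file: $provider" >&2
        return 1
    fi
    echo "wsrep_provider OK: $provider"
}
```

Call it as check_wsrep_provider followed by the path to the configuration file that holds your wsrep settings.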

Initializing the cluster

In Galera Cluster, the Primary Component is the cluster of database servers that replicate into each other. In the event that a cluster node loses connectivity with the Primary Component, it defaults into a non-operational state, to avoid creating or serving inconsistent data.

By default, cluster nodes do not start as part of a Primary Component. Instead, they assume that one exists somewhere and attempt to establish a connection with it. To create a Primary Component, you must start one cluster node using the --wsrep-new-cluster option. You can do this on any cluster node; it does not matter which one you choose. In the Primary Component, replication and state transfers bring all databases to the same state.

To start the cluster, complete the following steps:

Initialize the Primary Component on one cluster node. For servers that use init, run the following command:
```
# service mysql start --wsrep-new-cluster
```
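For servers that use systemd, note that systemctl does not pass extra options through to the service, so the init-style invocation does not work. MariaDB Galera packages (10.1 and later) ship a galera_new_cluster helper for this purpose; other distributions may provide a different bootstrap mechanism, so check your package documentation:
```
# galera_new_cluster
```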

Once the database server starts, check the cluster status using the wsrep_cluster_size status variable. From the database client, run the following command:

SHOW STATUS LIKE 'wsrep_cluster_size';

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+

Start the database server on all other cluster nodes. For servers that use init, run the following command:
```
# service mysql start
```
For servers that use systemd, instead run this command:
```
# systemctl start mysql
```

When you have started all cluster nodes, log in to the database client on one of them and check the wsrep_cluster_size status variable again:

SHOW STATUS LIKE 'wsrep_cluster_size';

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+
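
The same check can be scripted. This is a sketch under assumptions: the function names are hypothetical, and it expects the tab-separated rows that the mysql client prints when run with --batch and --skip-column-names.

```shell
#!/bin/sh
# Read "name<TAB>value" rows on stdin and print the value for one
# status variable.
status_value() {
    awk -F'\t' -v name="$1" '$1 == name { print $2 }'
}

# Succeed when the cluster size reported on stdin equals the expected
# node count given as the first argument.
cluster_has_size() {
    size=$(status_value wsrep_cluster_size)
    [ "$size" = "$1" ]
}
```

For example, piping the output of mysql --batch --skip-column-names -e "SHOW STATUS LIKE 'wsrep_cluster_size'" into cluster_has_size 3 succeeds only once all three nodes have joined.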

When each cluster node starts, it checks the IP addresses given to the wsrep_cluster_address parameter and attempts to establish network connectivity with a database server running there. Once it establishes a connection, it attempts to join the Primary Component, requesting a state transfer as needed to bring itself into sync with the cluster.
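
To see which peers a node will try to contact, you can split the wsrep_cluster_address value into its members. The sketch below is illustrative only: the function name is hypothetical, it assumes the usual gcomm:// form with comma-separated IPv4 addresses or host names, and it fills in 4567 (the default Galera replication port) when an entry has no explicit port.

```shell
#!/bin/sh
# Print one "host port" pair per line for a gcomm:// cluster address.
# IPv6 literals are not handled in this sketch.
cluster_members() {
    addr=${1#gcomm://}     # drop the URL scheme
    addr=${addr%%\?*}      # drop any ?option=value suffix
    echo "$addr" | tr ',' '\n' | while IFS= read -r member; do
        [ -n "$member" ] || continue
        case $member in
            *:*) echo "${member%%:*} ${member##*:}" ;;
            *)   echo "$member 4567" ;;
        esac
    done
}
```

Each printed pair can then be handed to whatever TCP probe you prefer to confirm that the replication port is reachable from the node you are starting.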

If you need to restart any cluster node, you can do so. When the database server comes back up, it establishes connectivity with the Primary Component and applies any changes it missed while it was down.

Restarting the cluster

Individual cluster nodes can stop and be restarted without issue. When a database loses its connection or restarts, Galera Cluster brings it back into sync once it reestablishes connection with the Primary Component. In the event that you need to restart the entire cluster, identify the most advanced cluster node and initialize the Primary Component on that node.

To find the most advanced cluster node, check the sequence number, or seqno, of the last committed transaction on each node. You can find it by viewing the grastate.dat file in the database directory:

$ cat /path/to/datadir/grastate.dat

# Galera saved state
version: 3.8
uuid:    5ee99582-bb8d-11e2-b8e3-23de375c1d30
seqno:   8204503945773

Alternatively, if the database server is running, use the wsrep_last_committed status variable:

SHOW STATUS LIKE 'wsrep_last_committed';

+----------------------+--------+
| Variable_name        | Value  |
+----------------------+--------+
| wsrep_last_committed | 409745 |
+----------------------+--------+

This value increments with each transaction, so the most advanced node has the highest sequence number, and therefore is the most up to date.
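
The comparison can be sketched as follows. The function names are hypothetical, and collecting the values from each host is left to you. Note that a seqno of -1 in grastate.dat generally means the node did not record its position (for example, it is still running or it crashed), so such a node cannot be ranked from this file alone.

```shell
#!/bin/sh
# Print the seqno recorded in a grastate.dat file.
grastate_seqno() {
    awk '$1 == "seqno:" { print $2 }' "$1"
}

# Read "node seqno" lines on stdin and print the node with the highest
# seqno. awk compares the values as doubles, which stay exact up to
# 2^53 and so cover realistic sequence numbers.
most_advanced_node() {
    awk 'NR == 1 || $2 + 0 > best { best = $2 + 0; node = $1 }
         END { if (NR > 0) print node }'
}
```

Feed most_advanced_node one line per cluster node, then initialize the Primary Component on the node it prints.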

Configuration tips

Deployment strategies

Galera can be configured using one of the following strategies:

- Each instance has its own IP address.

  OpenStack services are configured with the list of these IP addresses so they can select one of the addresses from those available.

- Galera runs behind HAProxy.

  HAProxy load balances incoming requests and exposes just one IP address for all the clients.

  Galera synchronous replication guarantees a zero slave lag. The failover procedure completes once HAProxy detects that the active back end has gone down and switches to the backup one, which is then marked as 'UP'. If no back ends are up (in other words, the Galera cluster is not ready to accept connections), the failover procedure finishes only when the Galera cluster has been successfully reassembled. The SLA is normally no more than 5 minutes.

- Use MySQL/Galera in active/passive mode to avoid deadlocks on SELECT ... FOR UPDATE type queries (used, for example, by nova and neutron). This issue is discussed more in the following:

  - http://lists.openstack.org/pipermail/openstack-dev/2014-May/035264.html
  - http://www.joinfu.com/

Of these options, the second one is highly recommended. Although Galera supports active/active configurations, we recommend active/passive (enforced by the load balancer) in order to avoid lock contention.
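
One way a load balancer can enforce active/passive is HAProxy's backup server keyword: only the first server receives traffic, and the others take over when it fails its check. The sketch below generates such a section. Everything concrete in it is an assumption for illustration: the function name, the galera_cluster section name, and the server names and addresses in the usage note.

```shell
#!/bin/sh
# Emit an HAProxy "listen" section in which only the first server takes
# traffic; every later server carries the "backup" keyword.
# Usage: galera_listen_section <bind_addr> <name:ip> [<name:ip> ...]
galera_listen_section() {
    bind_addr=$1
    shift
    echo "listen galera_cluster"
    echo "  bind ${bind_addr}:3306"
    echo "  mode tcp"
    role=""
    for entry in "$@"; do
        name=${entry%%:*}   # text before the first colon
        ip=${entry#*:}      # text after the first colon
        echo "  server ${name} ${ip}:3306 check${role}"
        role=" backup"
    done
}
```

For instance, galera_listen_section 10.0.0.11 controller1:10.0.0.12 controller2:10.0.0.13 controller3:10.0.0.14 prints a section with controller1 active and the other two marked backup. Review the output before adding it to your HAProxy configuration.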

Configuring HAProxy

If you use HAProxy to load-balance client access to Galera Cluster, as described in controller-ha-haproxy, you can use the clustercheck utility to improve health checks.

Create a configuration file for clustercheck at /etc/sysconfig/clustercheck:

MYSQL_USERNAME="clustercheck_user"
MYSQL_PASSWORD="my_clustercheck_password"
MYSQL_HOST="localhost"
MYSQL_PORT="3306"

Log in to the database client and grant the clustercheck user PROCESS privileges.
```
GRANT PROCESS ON *.* TO 'clustercheck_user'@'localhost'
IDENTIFIED BY 'my_clustercheck_password';

FLUSH PRIVILEGES;
```
You only need to do this on one cluster node. Galera Cluster replicates the user to all the others.

Create a configuration file for the HAProxy monitor service, at /etc/xinetd.d/galera-monitor:

service galera-monitor
{
   port = 9200
   disable = no
   socket_type = stream
   protocol = tcp
   wait = no
   user = root
   group = root
   groups = yes
   server = /usr/bin/clustercheck
   type = UNLISTED
   per_source = UNLIMITED
   log_on_success =
   log_on_failure = HOST
   flags = REUSE
}

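Start the xinetd daemon for clustercheck. For servers that use init, run the following commands (chkconfig is the Red Hat-style tool, matching the /etc/sysconfig path used above; on Debian-style systems, use update-rc.d xinetd enable instead):
```
# chkconfig xinetd on
# service xinetd start
```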
For servers that use systemd, instead run these commands:
```
# systemctl daemon-reload
# systemctl enable xinetd
# systemctl start xinetd
```
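
Once xinetd is running, you can probe port 9200 and interpret the reply. The sketch below assumes the behaviour of the commonly distributed clustercheck script, which answers with HTTP status 200 when the node is synced and 503 otherwise; the function name is hypothetical.

```shell
#!/bin/sh
# Classify the first line of an HTTP response read on stdin.
# Prints "synced" (exit 0), "unavailable" (exit 1), or "unknown" (exit 2).
health_from_response() {
    status_line=""
    IFS= read -r status_line || true
    case $status_line in
        "HTTP/1."[01]" 200"*) echo "synced" ;;
        "HTTP/1."[01]" 503"*) echo "unavailable"; return 1 ;;
        *)                    echo "unknown"; return 2 ;;
    esac
}
```

For example, curl -s -i http://localhost:9200/ | health_from_response reports the state of the local node, assuming curl is installed.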
