openstack-manuals/doc/ha-guide/source/storage-ha-block.rst
KATO Tomoyuki ca2e0298ad [ha-guide] consolidate service names
Change-Id: I12c56383775c2d679286ef2deae93836b407aa35
Implements: blueprint improve-ha-guide
2016-05-23 06:44:15 +00:00

Highly available Block Storage API

Cinder provides 'block storage as a service' suitable for performance-sensitive scenarios such as databases, expandable file systems, or providing a server with access to raw block-level storage.

Persistent block storage can survive instance termination and can also be moved across instances like any external storage device. Cinder also supports volume snapshots for backing up volumes.

Making this Block Storage API service highly available in active/passive mode involves:

  • Add Block Storage API resource to Pacemaker
  • Configure Block Storage API service
  • Configure OpenStack services to use highly available Block Storage API

In theory, you can run the Block Storage service as active/active. However, because of the concerns described below, we recommend running the volume component as active/passive only.

Jon Bernard writes:

Requests are first seen by Cinder in the API service, and we have a
fundamental problem there - a standard test-and-set race condition
exists for many operations where the volume status is first checked
for an expected status and then (in a different operation) updated to
a pending status. The pending status indicates to other incoming
requests that the volume is undergoing a current operation; however, it
is possible for two simultaneous requests to race here, which leads to
undefined results.

Later, the manager/driver will receive the message and carry out the
operation. At this stage there is a question of the synchronization
techniques employed by the drivers and what guarantees they make.

If cinder-volume processes exist as different processes, then the
'synchronized' decorator from the lockutils package will not be
sufficient. In this case the programmer can pass an argument to
synchronized() 'external=True'. If external is enabled, then the
locking will take place on a file located on the filesystem. By
default, this file is placed in Cinder's 'state directory' in
/var/lib/cinder so won't be visible to cinder-volume instances running
on different machines.

However, the location for file locking is configurable. So an
operator could configure the state directory to reside on shared
storage. If the shared storage in use implements unix file locking
semantics, then this could provide the requisite synchronization
needed for an active/active HA configuration.

The remaining issue is that not all drivers use the synchronization
methods, and even fewer of those use the external file locks. A
sub-concern would be whether they use them correctly.
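
The difference between an in-process lock and an external file lock can be sketched with the standard library alone. The example below is a minimal illustration, not the lockutils implementation: the helpers `external_lock` and `try_lock` and the lock-file name are hypothetical, and it relies on POSIX `fcntl.flock`, so it assumes a Unix host. Whether the same call would synchronize processes on different machines depends entirely on the shared file system honoring these lock semantics.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def external_lock(lock_dir, name):
    """Hold an exclusive advisory lock on a file in lock_dir (hypothetical helper)."""
    path = os.path.join(lock_dir, name + ".lock")
    with open(path, "a") as handle:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(handle, fcntl.LOCK_EX)
        try:
            yield path
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def try_lock(lock_dir, name):
    """Return True if the lock can be taken without blocking, then release it."""
    path = os.path.join(lock_dir, name + ".lock")
    with open(path, "a") as handle:
        try:
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        fcntl.flock(handle, fcntl.LOCK_UN)
        return True


if __name__ == "__main__":
    state_dir = tempfile.mkdtemp()  # stands in for the state directory
    with external_lock(state_dir, "volume-1234"):
        # A second attempt cannot take the lock while it is held.
        print(try_lock(state_dir, "volume-1234"))
    # Once released, the lock is available again.
    print(try_lock(state_dir, "volume-1234"))
```

Because the lock lives on a file rather than in process memory, any process that opens the same path contends for it, which is why the location of the lock directory matters for an active/active layout.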

You can read more about these concerns on the Red Hat Bugzilla, and there is a pseudo roadmap for addressing them upstream.
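
One common way to close a test-and-set race like the one described in the quotation above is to fold the check and the update into a single conditional statement, so that the database decides which request wins. The sketch below illustrates the idea with an in-memory SQLite table. The `volumes` schema, the status names, and the helper functions are assumptions made for illustration and do not reproduce Cinder's code.

```python
import sqlite3


def racy_begin_delete(conn, volume_id):
    """Check, then update in a separate statement: two callers can both pass the check."""
    row = conn.execute(
        "SELECT status FROM volumes WHERE id = ?", (volume_id,)
    ).fetchone()
    if row is None or row[0] != "available":
        return False
    # Another request may change the status between the SELECT and this UPDATE.
    conn.execute("UPDATE volumes SET status = 'deleting' WHERE id = ?", (volume_id,))
    conn.commit()
    return True


def atomic_begin_delete(conn, volume_id):
    """Check and update in one statement: only one caller sees rowcount == 1."""
    cursor = conn.execute(
        "UPDATE volumes SET status = 'deleting' "
        "WHERE id = ? AND status = 'available'",
        (volume_id,),
    )
    conn.commit()
    return cursor.rowcount == 1


def make_db():
    """Create a throwaway database holding one available volume."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO volumes VALUES ('vol-1', 'available')")
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_db()
    print(atomic_begin_delete(conn, "vol-1"))  # first request wins
    print(atomic_begin_delete(conn, "vol-1"))  # second request is rejected
```

The conditional form does not need a lock shared between API processes, because the row is only updated when it is still in the expected status.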

Add Block Storage API resource to Pacemaker

On RHEL-based systems, you should create resources for cinder's systemd agents and create constraints to enforce startup/shutdown ordering:

pcs resource create openstack-cinder-api systemd:openstack-cinder-api --clone interleave=true
pcs resource create openstack-cinder-scheduler systemd:openstack-cinder-scheduler --clone interleave=true
pcs resource create openstack-cinder-volume systemd:openstack-cinder-volume

pcs constraint order start openstack-cinder-api-clone then openstack-cinder-scheduler-clone
pcs constraint colocation add openstack-cinder-scheduler-clone with openstack-cinder-api-clone
pcs constraint order start openstack-cinder-scheduler-clone then openstack-cinder-volume
pcs constraint colocation add openstack-cinder-volume with openstack-cinder-scheduler-clone

If the Block Storage service runs on the same nodes as the other services, then it is advisable to also include:

pcs constraint order start openstack-keystone-clone then openstack-cinder-api-clone
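
To see how the ordering constraints above compose, the start order they imply can be computed as a topological sort. This is only an illustration of the dependency chain, not a description of how Pacemaker evaluates constraints. The `ORDER_CONSTRAINTS` list simply restates the `pcs constraint order` commands, `start_order` is a hypothetical helper, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# (first, then) pairs copied from the "pcs constraint order start ... then ..." commands.
ORDER_CONSTRAINTS = [
    ("openstack-keystone-clone", "openstack-cinder-api-clone"),
    ("openstack-cinder-api-clone", "openstack-cinder-scheduler-clone"),
    ("openstack-cinder-scheduler-clone", "openstack-cinder-volume"),
]


def start_order(constraints):
    """Return one start order that satisfies every (first, then) pair."""
    sorter = TopologicalSorter()
    for first, then in constraints:
        # "then" depends on "first", so "first" is registered as its predecessor.
        sorter.add(then, first)
    return list(sorter.static_order())


if __name__ == "__main__":
    for resource in start_order(ORDER_CONSTRAINTS):
        print(resource)
```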

Alternatively, instead of using systemd agents, download and install the OCF resource agent:

# cd /usr/lib/ocf/resource.d/openstack
# wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/cinder-api
# chmod a+rx *

You can now add the Pacemaker configuration for the Block Storage API resource. Connect to the Pacemaker cluster with the crm configure command and add the following cluster resources:

primitive p_cinder-api ocf:openstack:cinder-api \
   params config="/etc/cinder/cinder.conf" \
   os_password="secretsecret" \
   os_username="admin" \
   os_tenant_name="admin" \
   keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
   op monitor interval="30s" timeout="30s"

This configuration creates p_cinder-api, a resource for managing the Block Storage API service.

The crm configure command supports batch input, so you may copy and paste the lines above into your live Pacemaker configuration and then make changes as required. For example, you may enter edit p_ip_cinder-api from the crm configure menu and edit the resource to match your preferred virtual IP address.

Once completed, commit your configuration changes by entering commit from the crm configure menu. Pacemaker then starts the Block Storage API service and its dependent resources on one of your nodes.

Configure Block Storage API service

Edit the /etc/cinder/cinder.conf file. On a RHEL-based system, it should look something like:

[DEFAULT]
# This is the name which we should advertise ourselves as and for
# A/P installations it should be the same everywhere
host = cinder-cluster-1

# Listen on the Block Storage VIP
osapi_volume_listen = 10.0.0.11

auth_strategy = keystone
control_exchange = cinder

volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_exports
nfs_sparsed_volumes = true
nfs_mount_options = v3

[database]
sql_connection = mysql://cinder:CINDER_DBPASS@10.0.0.11/cinder
max_retries = -1

[keystone_authtoken]
# 10.0.0.11 is the Keystone VIP
identity_uri = http://10.0.0.11:35357/
auth_uri = http://10.0.0.11:5000/
admin_tenant_name = service
admin_user = cinder
admin_password = CINDER_PASS

[oslo_messaging_rabbit]
# Explicitly list the rabbit hosts as RabbitMQ doesn't play well with HAProxy
rabbit_hosts = 10.0.0.12,10.0.0.13,10.0.0.14
# As a consequence, we also need HA queues
rabbit_ha_queues = True
heartbeat_timeout_threshold = 60
heartbeat_rate = 2

Replace CINDER_DBPASS with the password you chose for the Block Storage database. Replace CINDER_PASS with the password you chose for the cinder user in the Identity service.

This example assumes that you are using NFS for the physical storage, which will almost never be true in a production installation.
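
Because the `host` value should be identical on every node in an active/passive installation, it can be worth checking the deployed files for drift. The following is a minimal sketch using `configparser`. The function names `host_values` and `hosts_consistent` and the inline sample configurations are hypothetical, and a real check would read each node's `/etc/cinder/cinder.conf` instead of literal strings.

```python
import configparser


def host_values(configs):
    """Map node name -> the [DEFAULT] host option found in that node's config text."""
    values = {}
    for node, text in configs.items():
        # interpolation=None keeps literal '%' characters in values intact.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(text)
        values[node] = parser.defaults().get("host")
    return values


def hosts_consistent(configs):
    """True when every node sets the same, non-empty host name."""
    found = set(host_values(configs).values())
    return len(found) == 1 and not ({None, ""} & found)


if __name__ == "__main__":
    sample = {
        "node1": "[DEFAULT]\nhost = cinder-cluster-1\n",
        "node2": "[DEFAULT]\nhost = cinder-cluster-1\n",
        "node3": "[DEFAULT]\nhost = node3.example.com\n",
    }
    print(host_values(sample))
    print(hosts_consistent(sample))
```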

If you are using the Block Storage service OCF agent, some settings will be filled in for you, resulting in a shorter configuration file:

# Use a MySQL connection to store data:
sql_connection = mysql://cinder:CINDER_DBPASS@10.0.0.11/cinder
# Alternatively, you can switch to pymysql,
# a new Python 3 compatible library and use
# sql_connection = mysql+pymysql://cinder:CINDER_DBPASS@10.0.0.11/cinder
# and be ready when everything moves to Python 3.
# Ref: https://wiki.openstack.org/wiki/PyMySQL_evaluation

# We bind Block Storage API to the VIP:
osapi_volume_listen = 10.0.0.11

# We send notifications to the highly available RabbitMQ:
notifier_strategy = rabbit
rabbit_host = 10.0.0.11

Replace CINDER_DBPASS with the password you chose for the Block Storage database.

Configure OpenStack services to use highly available Block Storage API

Your OpenStack services must now point their Block Storage API configuration to the highly available, virtual cluster IP address rather than a Block Storage API server's physical IP address as you would for a non-HA environment.

You must create the Block Storage API endpoint with this IP.

If you are using both private and public IP addresses, you should create two virtual IPs and define your endpoint like this:

$ openstack endpoint create volume --region $KEYSTONE_REGION \
--publicurl 'http://PUBLIC_VIP:8776/v1/%(tenant_id)s' \
--adminurl 'http://10.0.0.11:8776/v1/%(tenant_id)s' \
--internalurl 'http://10.0.0.11:8776/v1/%(tenant_id)s'
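
The `%(tenant_id)s` placeholder in these URLs follows Python's mapping-style string formatting and is replaced with the project ID when the endpoint is handed to a client. The sketch below shows only that substitution. The `expand` helper and the sample tenant ID are hypothetical, and the Identity service's own substitution logic may differ in detail.

```python
INTERNAL_URL_TEMPLATE = "http://10.0.0.11:8776/v1/%(tenant_id)s"


def expand(template, tenant_id):
    """Fill the %(tenant_id)s placeholder using %-style mapping formatting."""
    return template % {"tenant_id": tenant_id}


if __name__ == "__main__":
    # "3f2a" is a made-up tenant ID used only for this example.
    print(expand(INTERNAL_URL_TEMPLATE, "3f2a"))
```

The single quotes around each URL in the command above keep the shell from interpreting the parentheses, so the placeholder reaches the Identity service unchanged.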