Update replication v2.1 devref
This patch updates the replication developer's documentation to reflect the latest code changes, including changes agreed at the Barcelona Design Summit. All drivers implementing replication from now on should follow these guidelines to ensure consistency among all the drivers. Change-Id: I180b89c1ceaeea6d4da8e995e46181990d52825f
parent 5762efb5b3
commit 30261ec485
@@ -1,64 +1,88 @@
|
|||||||
Replication
|
Replication
|
||||||
============
|
===========
|
||||||
|
|
||||||
How to implement replication features in a backend driver.
|
For backend devices that offer replication features, Cinder provides a common
|
||||||
|
mechanism for exposing that functionality on a per volume basis while still
|
||||||
|
trying to allow flexibility for the varying implementation and requirements of
|
||||||
|
all the different backend devices.
|
||||||
|
|
||||||
For backend devices that offer replication features, Cinder
|
There are 2 sides to Cinder's replication feature, the core mechanism and the
|
||||||
provides a common mechanism for exposing that functionality
|
driver specific functionality, and in this document we'll only be covering the
|
||||||
on a volume per volume basis while still trying to allow
|
driver side of things aimed at helping vendors implement this functionality in
|
||||||
flexibility for the varying implementation and requirements
|
their drivers in a way consistent with all other drivers.
|
||||||
of all the different backend devices.
|
|
||||||
|
|
||||||
Most of the configuration is done via the cinder.conf file
|
Although we'll be focusing on the driver implementation there will also be some
|
||||||
under the driver section and through the use of volume types.
|
mentions of deployment configurations to provide a clear picture to developers
|
||||||
|
and help them avoid implementing custom solutions to solve things that were
|
||||||
|
meant to be done via the cloud configuration.
|
||||||
|
|
||||||
NOTE:
|
Overview
|
||||||
This implementation is intended to solve a specific use case.
|
--------
|
||||||
It's critical that you read the Use Cases section of the spec
|
|
||||||
here:
|
As a general rule replication is enabled and configured via the cinder.conf
|
||||||
|
file under the driver's section, and volume replication is requested through
|
||||||
|
the use of volume types.
|
||||||
|
|
||||||
|
*NOTE*: Current replication implementation is v2.1 and it's meant to solve a
|
||||||
|
very specific use case, the "smoking hole" scenario. It's critical that you
|
||||||
|
read the Use Cases section of the spec here:
|
||||||
https://specs.openstack.org/openstack/cinder-specs/specs/mitaka/cheesecake.html
|
https://specs.openstack.org/openstack/cinder-specs/specs/mitaka/cheesecake.html
|
||||||
|
|
||||||
Config file examples
|
From a user's perspective volumes will be created using specific volume types,
|
||||||
--------------------
|
even if it is the default volume type, and they will either be replicated or
|
||||||
|
not, which will be reflected on the ``replication_status`` field of the volume.
|
||||||
|
So in order to know if a snapshot is replicated we'll have to check its volume.
|
||||||
|
|
||||||
The cinder.conf file is used to specify replication config info
|
After the loss of the primary storage site all operations on the resources will
|
||||||
for a specific driver. There is no concept of managed vs unmanaged,
|
fail and VMs will no longer have access to the data. That is when the Cloud
|
||||||
ALL replication configurations are expected to work by using the same
|
Administrator will issue the ``failover-host`` command to make the
|
||||||
driver. In other words, rather than trying to perform any magic
|
cinder-volume service perform the failover.
|
||||||
by changing host entries in the DB for a Volume etc, all replication
|
|
||||||
targets are considered "unmanaged" BUT if a failover is issued, it's
|
|
||||||
the drivers responsibility to access replication volumes on the replicated
|
|
||||||
backend device.
|
|
||||||
|
|
||||||
This results in no changes for the end-user. For example, He/She can
|
After the failover is completed, the Cinder volume service will start using the
|
||||||
still issue an attach call to a replicated volume that has been failed
|
failed-over secondary storage site for all operations and the user will once
|
||||||
over, and the driver will still receive the call BUT the driver will
|
again be able to perform actions on all resources that were replicated, while
|
||||||
need to figure out if it needs to redirect the call to the a different
|
all other resources will be in error status since they are no longer available.
|
||||||
backend than the default or not.
|
|
||||||
|
|
||||||
Information regarding if the backend is in a failed over state should
|
Storage Device configuration
|
||||||
be stored in the driver, and in the case of a restart, the service
|
----------------------------
|
||||||
entry in the DB will have the replication status info and pass it
|
|
||||||
in during init to allow the driver to be set in the correct state.
|
|
||||||
|
|
||||||
In the case of a failover event, and a volume was NOT of type
|
Most storage devices will require configuration changes to enable the
|
||||||
replicated, that volume will now be UNAVAILABLE and any calls
|
replication functionality, and this configuration process is vendor and storage
|
||||||
to access that volume should return a VolumeNotFound exception.
|
device specific so it is not contemplated by the Cinder core replication
|
||||||
|
functionality.
|
||||||
|
|
||||||
**replication_device**
|
It is up to the vendors whether they want to handle this device configuration
|
||||||
|
in the Cinder driver or as a manual process, but the most common approach is to
|
||||||
|
avoid including this configuration logic into Cinder and having the Cloud
|
||||||
|
Administrators do a manual process following a specific guide to enable
|
||||||
|
replication on the storage device before configuring the cinder volume service.
|
||||||
|
|
||||||
Is a multi-dict opt, that should be specified
|
Service configuration
|
||||||
for each replication target device the admin would
|
---------------------
|
||||||
like to configure.
|
|
||||||
|
|
||||||
*NOTE:*
|
The way to enable and configure replication is common to all drivers and it is
|
||||||
|
done via the ``replication_device`` configuration option that goes in the
|
||||||
|
driver's specific section in the ``cinder.conf`` configuration file.
|
||||||
|
|
||||||
There is one standardized and REQUIRED key in the config
|
``replication_device`` is a multi dictionary option, that should be specified
|
||||||
entry, all others are vendor-unique:
|
for each replication target device the admin wants to configure.
|
||||||
|
|
||||||
* backend_id:<vendor-identifier-for-rep-target>
|
While it is true that all drivers use the same ``replication_device``
|
||||||
|
configuration option this doesn't mean that they will all have the same data,
|
||||||
|
as there is only one standardized and **REQUIRED** key in the configuration
|
||||||
|
entry, all others are vendor specific:
|
||||||
|
|
||||||
An example driver config for a device with multiple replication targets
|
- backend_id:<vendor-identifier-for-rep-target>
|
||||||
|
|
||||||
|
Values of ``backend_id`` keys are used to uniquely identify within the driver
|
||||||
|
each of the secondary sites, although they can be reused on different driver
|
||||||
|
sections.
|
||||||
|
|
||||||
|
These unique identifiers will be used by the failover mechanism as well as in
|
||||||
|
the driver initialization process, and the only requirement is that it must
|
||||||
|
never have the value "default".
|
||||||
|
|
||||||
|
An example driver configuration for a device with multiple replication targets
|
||||||
is shown below::
|
is shown below::
|
||||||
|
|
||||||
.....
|
.....
|
||||||
@@ -76,95 +100,376 @@ is shown below::
|
|||||||
replication_device = backend_id:vendor-id-1,unique_key:val....
|
replication_device = backend_id:vendor-id-1,unique_key:val....
|
||||||
replication_device = backend_id:vendor-id-2,unique_key:val....
|
replication_device = backend_id:vendor-id-2,unique_key:val....
|
||||||
|
|
||||||
In this example the result is self.configuration.get('replication_device) with the list::
|
In this example the result of calling
|
||||||
|
``self.configuration.safe_get('replication_device')`` within the driver is the
|
||||||
|
following list::
|
||||||
|
|
||||||
[{backend_id: vendor-id-1, unique_key: val1},
|
[{backend_id: vendor-id-1, unique_key: val1},
|
||||||
{backend_id: vendor-id-2, unique_key: val1}]
|
{backend_id: vendor-id-2, unique_key: val2}]
|
||||||
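As a rough illustration of how a driver might sanity-check the parsed list shown above, the sketch below verifies that every entry carries a ``backend_id``, that no identifier is repeated within the driver section, and that none uses the reserved value ``default``. The helper name ``validate_replication_devices`` and the use of ``ValueError`` are assumptions made for this example; they are not part of Cinder.

```python
def validate_replication_devices(devices):
    """Return the backend_ids found in a parsed replication_device list.

    ``devices`` is assumed to be a list of dictionaries such as
    [{'backend_id': 'vendor-id-1', 'unique_key': 'val1'}, ...].
    """
    seen = []
    for device in devices or []:
        backend_id = device.get('backend_id')
        if not backend_id:
            # backend_id is the only standardized, required key.
            raise ValueError('replication_device entry lacks backend_id')
        if backend_id == 'default':
            # "default" is reserved for requesting a failback.
            raise ValueError('backend_id must never be "default"')
        if backend_id in seen:
            # Identifiers must be unique within one driver section.
            raise ValueError('duplicate backend_id: %s' % backend_id)
        seen.append(backend_id)
    return seen
```

A driver could run a check like this during setup so that a misconfigured target is reported before any failover is attempted.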
|
|
||||||
|
It is expected that if a driver is configured with multiple replication
|
||||||
|
targets, that replicated volumes are actually replicated on **all targets**.
|
||||||
|
|
||||||
|
Besides specific replication device keys defined in the ``replication_device``,
|
||||||
Volume Types / Extra Specs
|
a driver may also have additional normal configuration options in the driver
|
||||||
---------------------------
|
section related to replication to allow Cloud Administrators to configure
|
||||||
In order for a user to specify they'd like a replicated volume, there needs to be
|
things like timeouts.
|
||||||
a corresponding Volume Type created by the Cloud Administrator.
|
|
||||||
|
|
||||||
There's a good deal of flexibility by using volume types. The scheduler can
|
|
||||||
send the create request to a backend that provides replication by simply
|
|
||||||
providing the replication=enabled key to the extra-specs of the volume type.
|
|
||||||
|
|
||||||
For example, if the type was set to simply create the volume on any (or if you only had one)
|
|
||||||
backend that supports replication, the extra-specs entry would be::
|
|
||||||
|
|
||||||
{replication: enabled}
|
|
||||||
|
|
||||||
Additionally you could provide additional details using scoped keys::
|
|
||||||
|
|
||||||
{replication: enabled, replication_type: async, replication_count: 2,
|
|
||||||
replication_targets: [fake_id1, fake_id2]}
|
|
||||||
|
|
||||||
It's up to the driver to parse the volume type info on create and set things up
|
|
||||||
as requested. While the scoping key can be anything, it's strongly recommended that all
|
|
||||||
backends utilize the same key (replication) for consistency and to make things easier for
|
|
||||||
the Cloud Administrator.
|
|
||||||
|
|
||||||
Additionally it's expected that if a backend is configured with 3 replication
|
|
||||||
targets, that if a volume of type replication=enabled is issued against that
|
|
||||||
backend then it will replicate to ALL THREE of the configured targets.
|
|
||||||
|
|
||||||
Capabilities reporting
|
Capabilities reporting
|
||||||
----------------------
|
----------------------
|
||||||
The following entries are expected to be added to the stats/capabilities update for
|
|
||||||
replication configured devices::
|
There are 2 new replication stats/capability keys that drivers supporting
|
||||||
|
replication v2.1 should be reporting: ``replication_enabled`` and
|
||||||
|
``replication_targets``::
|
||||||
|
|
||||||
stats["replication_enabled"] = True|False
|
stats["replication_enabled"] = True|False
|
||||||
stats["replication_targets"] = [<backend-id_1>, <backend-id_2>...]
|
stats["replication_targets"] = [<backend-id_1>, <backend-id_2>...]
|
||||||
|
|
||||||
NOTICE, we report configured replication targets via volume stats_update
|
If a driver is behaving correctly we can expect the ``replication_targets``
|
||||||
This information is added to the get_capabilities admin call.
|
field to be populated whenever ``replication_enabled`` is set to ``True``, and
|
||||||
|
it is expected to either be set to ``[]`` or be missing altogether when
|
||||||
|
``replication_enabled`` is set to ``False``.
|
||||||
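The relationship between the two keys can be sketched as follows. This is only an illustration: the helper name ``replication_stats`` is hypothetical, and a real driver may decide whether replication is enabled using more information than the configured targets alone.

```python
def replication_stats(replication_devices):
    """Build the two replication keys for a driver's stats dictionary.

    ``replication_devices`` is assumed to be the parsed
    ``replication_device`` list (or None when nothing is configured).
    """
    targets = [dev['backend_id'] for dev in replication_devices or []]
    return {
        # True only when at least one target is configured.
        'replication_enabled': bool(targets),
        # Empty list when replication is not enabled.
        'replication_targets': targets,
    }
```

The returned dictionary could then be merged into the stats the driver already reports.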
|
|
||||||
Required methods
|
The purpose of the ``replication_enabled`` field is to be used by the scheduler
|
||||||
-----------------
|
in volume types for creation and migrations.
|
||||||
The number of API methods associated with replication is intentionally very limited,
|
|
||||||
|
|
||||||
Admin only methods.
|
As for the ``replication_targets`` field it is only provided for informational
|
||||||
|
purposes so it can be retrieved through the ``get_capabilities`` call using the
|
||||||
|
admin REST API, but it will not be used for validation at the API layer. That
|
||||||
|
way Cloud Administrators will be able to know available secondary sites where
|
||||||
|
they can failover.
|
||||||
|
|
||||||
They include::
|
Volume Types / Extra Specs
|
||||||
|
---------------------------
|
||||||
|
|
||||||
replication_failover(self, context, volumes)
|
The way to control the creation of volumes on a cloud with backends that have
|
||||||
|
replication enabled is, like with many other features, through the use of
|
||||||
|
volume types.
|
||||||
|
|
||||||
Additionally we have freeze/thaw methods that will act on the scheduler
|
We won't go into the details of volume type creation, but suffice it to say that
|
||||||
but may or may not require something from the driver::
|
you will most likely want to use volume types to discriminate between
|
||||||
|
replicated and non replicated volumes and be explicit about it so that non
|
||||||
|
replicated volumes won't end up in a replicated backend.
|
||||||
|
|
||||||
|
Since the driver is reporting the ``replication_enabled`` key, we just need to
|
||||||
|
require it for replication volume types adding ``replication_enabled='<is>
|
||||||
|
True'`` and also specifying it for all non replicated volume types
|
||||||
|
``replication_enabled='<is> False'``.
|
||||||
|
|
||||||
|
It's up to the driver to parse the volume type info on create and set things up
|
||||||
|
as requested. While the scoping key can be anything, it's strongly recommended
|
||||||
|
that all backends utilize the same key (replication) for consistency and to
|
||||||
|
make things easier for the Cloud Administrator.
|
||||||
|
|
||||||
|
Additional replication parameters can be supplied to the driver using vendor
|
||||||
|
specific properties through the volume type's extra-specs so they can be used
|
||||||
|
by the driver at volume creation time, or retype.
|
||||||
|
|
||||||
|
It is up to the driver to parse the volume type info on create and retype to
|
||||||
|
set things up as requested. A good pattern to get a custom parameter from a
|
||||||
|
given volume instance is this::
|
||||||
|
|
||||||
|
extra_specs = getattr(volume.volume_type, 'extra_specs', {})
|
||||||
|
custom_param = extra_specs.get('custom_param', 'default_value')
|
||||||
|
|
||||||
|
It may seem convoluted, but we must be careful when retrieving the
|
||||||
|
``extra_specs`` from the ``volume_type`` field as it could be ``None``.
|
||||||
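To see why the ``getattr`` pattern matters, the following sketch exercises it against a volume with a type and a volume without one. ``SimpleNamespace`` objects stand in for Cinder's versioned objects, and both the helper name ``get_custom_param`` and the extra ``or {}`` guard are assumptions added for this example.

```python
from types import SimpleNamespace


def get_custom_param(volume, name, default=None):
    """Read an extra spec from a volume, tolerating a missing volume type."""
    # getattr on None falls back to {}; "or {}" also covers extra_specs=None.
    extra_specs = getattr(volume.volume_type, 'extra_specs', {}) or {}
    return extra_specs.get(name, default)


# Stand-ins for a typed and an untyped volume.
typed = SimpleNamespace(
    volume_type=SimpleNamespace(extra_specs={'custom_param': '42'}))
untyped = SimpleNamespace(volume_type=None)

print(get_custom_param(typed, 'custom_param', 'default_value'))
print(get_custom_param(untyped, 'custom_param', 'default_value'))
```

The second call falls back to the default instead of raising ``AttributeError``, which is the failure the pattern is meant to avoid.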
|
|
||||||
|
Vendors should try to avoid obfuscating their custom properties and expose them
|
||||||
|
using the ``_init_vendor_properties`` method so they can be checked by the
|
||||||
|
Cloud Administrator using the ``get_capabilities`` REST API.
|
||||||
|
|
||||||
|
*NOTE*: For storage devices doing per backend/pool replication the use of
|
||||||
|
volume types is also recommended.
|
||||||
|
|
||||||
|
Volume creation
|
||||||
|
---------------
|
||||||
|
|
||||||
|
Drivers are expected to honor the replication parameters set in the volume type
|
||||||
|
during creation, retyping, or migration.
|
||||||
|
|
||||||
|
When implementing the replication feature there are some driver methods that
|
||||||
|
will most likely need modifications -if they are implemented in the driver
|
||||||
|
(since some are optional)- to make sure that the backend is replicating volumes
|
||||||
|
that need to be replicated and not replicating those that don't need to be:
|
||||||
|
|
||||||
|
- ``create_volume``
|
||||||
|
- ``create_volume_from_snapshot``
|
||||||
|
- ``create_cloned_volume``
|
||||||
|
- ``retype``
|
||||||
|
- ``clone_image``
|
||||||
|
- ``migrate_volume``
|
||||||
|
|
||||||
|
In these methods the driver will have to check the volume type to see if the
|
||||||
|
volumes need to be replicated. We could use the same pattern described in the
|
||||||
|
`Volume Types / Extra Specs`_ section::
|
||||||
|
|
||||||
|
    def _is_replicated(self, volume):
|
||||||
|
        specs = getattr(volume.volume_type, 'extra_specs', {})
|
||||||
|
        return specs.get('replication_enabled') == '<is> True'
|
||||||
|
|
||||||
|
But it is **not** the recommended mechanism, and the ``is_replicated`` method
|
||||||
|
available in volumes and volume types versioned objects instances should be
|
||||||
|
used instead.
|
||||||
|
|
||||||
|
Drivers are expected to keep the ``replication_status`` field up to date and in
|
||||||
|
sync with reality, usually as specified in the volume type. To do so, the
|
||||||
|
implementations of the above-mentioned methods should use the model update mechanism
|
||||||
|
provided for each one of those methods. One must be careful since the update
|
||||||
|
mechanism may be different from one method to another.
|
||||||
|
|
||||||
|
What this means is that most of these methods should be returning a
|
||||||
|
``replication_status`` key with the value set to ``enabled`` in the model
|
||||||
|
update dictionary if the volume type is enabling replication. There is no need
|
||||||
|
to return the key with the value of ``disabled`` if it is not enabled since
|
||||||
|
that is the default value.
|
||||||
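A minimal sketch of the model update described above, assuming a stand-in ``ReplicationStatus`` class (a real driver would use the one from ``cinder/objects/fields.py``) and a hypothetical helper name:

```python
class ReplicationStatus(object):
    """Minimal stand-in for the class in cinder/objects/fields.py."""
    ENABLED = 'enabled'
    DISABLED = 'disabled'


def replication_model_update(is_replicated):
    """Return the model update fragment for a newly created volume."""
    if is_replicated:
        return {'replication_status': ReplicationStatus.ENABLED}
    # Nothing to report: disabled is already the default value.
    return {}
```

The fragment would be merged into whatever model update the method already returns.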
|
|
||||||
|
In the case of the ``create_volume`` and ``retype`` methods there is no need to
|
||||||
|
return the ``replication_status`` in the model update since it has already been
|
||||||
|
set by the scheduler on creation using the extra spec from the volume type. And
|
||||||
|
on ``migrate_volume`` there is no need either since there is no change to the
|
||||||
|
``replication_status``.
|
||||||
|
|
||||||
|
*NOTE*: For storage devices doing per backend/pool replication it is not
|
||||||
|
necessary to check the volume type for the ``replication_enabled`` key since
|
||||||
|
all created volumes will be replicated, but they are expected to return the
|
||||||
|
``replication_status`` in all those methods, including the ``create_volume``
|
||||||
|
method since the driver may receive a volume creation request without the
|
||||||
|
replication enabled extra spec and therefore the driver will not have set the
|
||||||
|
right ``replication_status`` and the driver needs to correct this.
|
||||||
|
|
||||||
|
Besides the ``replication_status`` field that drivers need to update there are
|
||||||
|
other fields in the database related to the replication mechanism that the
|
||||||
|
drivers can use:
|
||||||
|
|
||||||
|
- ``replication_extended_status``
|
||||||
|
- ``replication_driver_data``
|
||||||
|
|
||||||
|
These fields are string type fields with a maximum size of 255 characters and
|
||||||
|
they are available for drivers to use internally as they see fit for their
|
||||||
|
normal replication operation. So they can be assigned in the model update and
|
||||||
|
later on used by the driver, for example during the failover.
|
||||||
|
|
||||||
|
To avoid using magic strings drivers must use values defined by the
|
||||||
|
``ReplicationStatus`` class in the ``cinder/objects/fields.py`` file, and
|
||||||
|
these are:
|
||||||
|
|
||||||
|
- ``ERROR``: When setting the replication failed on creation, retype, or
|
||||||
|
migrate. This should be accompanied by the volume status ``error``.
|
||||||
|
- ``ENABLED``: When the volume is being replicated.
|
||||||
|
- ``DISABLED``: When the volume is not being replicated.
|
||||||
|
- ``FAILED_OVER``: After a volume has been successfully failed over.
|
||||||
|
- ``FAILOVER_ERROR``: When there was an error during the failover of this
|
||||||
|
volume.
|
||||||
|
- ``NOT_CAPABLE``: When we failed-over but the volume was not replicated.
|
||||||
|
|
||||||
|
The first 3 statuses revolve around the volume creation and the last 3 around
|
||||||
|
the failover mechanism.
|
||||||
|
|
||||||
|
The only status that should not be used for the volume's ``replication_status``
|
||||||
|
is the ``FAILING_OVER`` status.
|
||||||
|
|
||||||
|
Whenever we are referring to values of the ``replication_status`` in this
|
||||||
|
document we will be referring to the ``ReplicationStatus`` attributes and not a
|
||||||
|
literal string, so ``ERROR`` means
|
||||||
|
``cinder.objects.fields.ReplicationStatus.ERROR`` and not the string "ERROR".
|
||||||
|
|
||||||
|
Failover
|
||||||
|
--------
|
||||||
|
|
||||||
|
This is the mechanism used to instruct the cinder volume service to fail over
|
||||||
|
to a secondary/target device.
|
||||||
|
|
||||||
|
Keep in mind the use case is that the primary backend has died a horrible death
|
||||||
|
and is no longer valid, so any volumes that were on the primary and were not
|
||||||
|
being replicated will no longer be available.
|
||||||
|
|
||||||
|
The method definition required from the driver to implement the failover
|
||||||
|
mechanism is as follows::
|
||||||
|
|
||||||
|
def failover_host(self, context, volumes, secondary_id=None):
|
||||||
|
|
||||||
|
There are several things that are expected of this method:
|
||||||
|
|
||||||
|
- Promotion of a secondary storage device to primary
|
||||||
|
- Generating the model updates
|
||||||
|
- Changing internally to access the secondary storage device for all future
|
||||||
|
requests.
|
||||||
|
|
||||||
|
If no secondary storage device is provided to the driver via the ``secondary_id``
|
||||||
|
argument (it is equal to ``None``), then it is up to the driver to choose which
|
||||||
|
storage device to failover to. In this regard it is important that the driver
|
||||||
|
takes into consideration that it could be failing over from a secondary (there
|
||||||
|
was a prior failover request), so it should discard the current target from the
|
||||||
|
selection.
|
||||||
|
|
||||||
|
If the ``secondary_id`` is not a valid one the driver is expected to raise
|
||||||
|
``InvalidReplicationTarget``; for any other non recoverable errors during a
|
||||||
|
failover the driver should raise ``UnableToFailOver`` or any child of
|
||||||
|
``VolumeDriverException`` class and revert to a state where the previous
|
||||||
|
backend is in use.
|
||||||
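The target selection rules above could be sketched roughly as follows. The exception class is a local stand-in for the ``InvalidReplicationTarget`` exception mentioned above, ``choose_target`` is a hypothetical helper, and treating the currently active target as invalid is an assumption of this example. Failback handling is deliberately left out.

```python
class InvalidReplicationTarget(Exception):
    """Local stand-in for the Cinder exception of the same name."""


def choose_target(configured_ids, active_id=None, secondary_id=None):
    """Pick the backend_id to fail over to.

    configured_ids: backend_ids from the replication_device entries.
    active_id: target currently in use, or None when on the primary.
    secondary_id: target requested by the administrator, or None.
    """
    # Never select the target we are already running on.
    candidates = [bid for bid in configured_ids if bid != active_id]
    if secondary_id is None:
        if not candidates:
            raise InvalidReplicationTarget('no replication target available')
        # The driver is free to choose; this sketch takes the first one.
        return candidates[0]
    if secondary_id not in candidates:
        raise InvalidReplicationTarget('invalid target: %s' % secondary_id)
    return secondary_id
```

Keeping the selection in a small pure function like this makes it easy to test without a storage device.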
|
|
||||||
|
The failover method in the driver will receive a list of replicated volumes
|
||||||
|
that need to be failed over. Replicated volumes passed to the driver may have
|
||||||
|
diverse ``replication_status`` values, but they will always be one of:
|
||||||
|
``ENABLED``, ``FAILED_OVER``, or ``FAILOVER_ERROR``.
|
||||||
|
|
||||||
|
The driver must return a 2-tuple with the new storage device target id as the
|
||||||
|
first element and a list of dictionaries with the model updates required for
|
||||||
|
the volumes so that the driver can perform future actions on those volumes now
|
||||||
|
that they need to be accessed on a different location.
|
||||||
|
|
||||||
|
It's not a requirement for the driver to return model updates for all the
|
||||||
|
volumes, or for any for that matter as it can return ``None`` or an empty list
|
||||||
|
if there's no update necessary. But if elements are returned in the model
|
||||||
|
update list then it is a requirement that each of the dictionaries contains 2
|
||||||
|
key-value pairs, ``volume_id`` and ``updates`` like this::
|
||||||
|
|
||||||
|
    [{
|
||||||
|
        'volume_id': volumes[0].id,
|
||||||
|
        'updates': {
|
||||||
|
            'provider_id': new_provider_id1,
|
||||||
|
            ...
|
||||||
|
        },
|
||||||
|
    }, {
|
||||||
|
        'volume_id': volumes[1].id,
|
||||||
|
        'updates': {
|
||||||
|
            'provider_id': new_provider_id2,
|
||||||
|
            'replication_status': fields.ReplicationStatus.FAILOVER_ERROR,
|
||||||
|
            ...
|
||||||
|
        },
|
||||||
|
    }]
|
||||||
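One possible way to assemble such a list is sketched below. The function name, the shape of the ``results`` argument, and the ``FAILOVER_ERROR`` placeholder are assumptions for illustration; a real driver would use ``fields.ReplicationStatus.FAILOVER_ERROR`` rather than a local constant.

```python
# Placeholder standing in for fields.ReplicationStatus.FAILOVER_ERROR.
FAILOVER_ERROR = 'failover-error'


def build_model_updates(results):
    """Build the model update list returned by the failover method.

    results: iterable of (volume_id, new_provider_id, succeeded) tuples
    describing the outcome of the failover for each replicated volume.
    """
    model_updates = []
    for volume_id, provider_id, succeeded in results:
        updates = {'provider_id': provider_id}
        if not succeeded:
            # Failures must be reported explicitly by the driver.
            updates['replication_status'] = FAILOVER_ERROR
        # Each entry carries exactly two keys: volume_id and updates.
        model_updates.append({'volume_id': volume_id, 'updates': updates})
    return model_updates
```

Successful volumes carry no ``replication_status`` here, which leaves that field to the default handling.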
|
|
||||||
|
In these updates there is no need to set the ``replication_status`` to
|
||||||
|
``FAILED_OVER`` if the failover was successful, as this will be performed by
|
||||||
|
the manager by default, but it won't create additional DB queries if it is
|
||||||
|
returned. It is however necessary to set it to ``FAILOVER_ERROR`` for those
|
||||||
|
volumes that had errors during the failover.
|
||||||
|
|
||||||
|
Drivers don't have to worry about snapshots or non replicated volumes, since
|
||||||
|
the manager will take care of those in the following manner:
|
||||||
|
|
||||||
|
- All non replicated volumes will have their current ``status`` field saved in
|
||||||
|
the ``previous_status`` field, the ``status`` field changed to ``error``, and
|
||||||
|
their ``replication_status`` set to ``NOT_CAPABLE``.
|
||||||
|
- All snapshots from non replicated volumes will have their statuses changed to
|
||||||
|
``error``.
|
||||||
|
- All replicated volumes that failed on the failover will get their ``status``
|
||||||
|
changed to ``error``, their current ``status`` preserved in
|
||||||
|
``previous_status``, and their ``replication_status`` set to
|
||||||
|
``FAILOVER_ERROR``.
|
||||||
|
- All snapshots from volumes that had errors during the failover will have
|
||||||
|
their statuses set to ``error``.
|
||||||
|
|
||||||
|
Any model update request from the driver that changes the ``status`` field will
|
||||||
|
trigger a change in the ``previous_status`` field to preserve the current
|
||||||
|
status.
|
||||||
|
|
||||||
|
Once the failover is completed the driver should be pointing to the secondary
|
||||||
|
and should be able to create and destroy volumes and snapshots as usual, and it
|
||||||
|
is left to the Cloud Administrator's discretion whether resource modifying
|
||||||
|
operations are allowed or not.
|
||||||
|
|
||||||
|
Failback
|
||||||
|
--------
|
||||||
|
|
||||||
|
Drivers are not required to support failback, but they are required to raise a
|
||||||
|
``InvalidReplicationTarget`` exception if the failback is requested but not
|
||||||
|
supported.
|
||||||
|
|
||||||
|
The way to request the failback is quite simple: the driver will receive the
|
||||||
|
argument ``secondary_id`` with the value of ``default``. That is why it was
|
||||||
|
forbidden to use the ``default`` on the target configuration in the cinder
|
||||||
|
configuration file.
|
||||||
|
|
||||||
|
Expected driver behavior is the same as the one explained in the `Failover`_
|
||||||
|
section:
|
||||||
|
|
||||||
|
- Promotion of the original primary to primary
|
||||||
|
- Generating the model updates
|
||||||
|
- Changing internally to access the original primary storage device for all
|
||||||
|
future requests.
|
||||||
|
|
||||||
|
If the failback of any of the volumes fails, the driver must return
|
||||||
|
``replication_status`` set to ``ERROR`` in the volume updates for those
|
||||||
|
volumes. If they succeed it is not necessary to change the
|
||||||
|
``replication_status`` since the default behavior will be to set them to
|
||||||
|
``ENABLED``, but it won't create additional DB queries if it is set.
|
||||||
|
|
||||||
|
The manager will update resources in a slightly different way than in the
|
||||||
|
failover case:
|
||||||
|
|
||||||
|
- All non replicated volumes will not have any model modifications.
|
||||||
|
- All snapshots from non replicated volumes will not have any model
|
||||||
|
modifications.
|
||||||
|
- All replicated volumes that failed on the failback will get their ``status``
|
||||||
|
changed to ``error``, have their current ``status`` preserved in the
|
||||||
|
``previous_status`` field, and their ``replication_status`` set to
|
||||||
|
``FAILOVER_ERROR``.
|
||||||
|
- All snapshots from volumes that had errors during the failback will have
|
||||||
|
their statuses set to ``error``.
|
||||||
|
|
||||||
|
We can avoid using the "default" magic string by using the
|
||||||
|
``FAILBACK_SENTINEL`` class attribute from the ``VolumeManager`` class.
|
||||||
|
|
||||||
|
Initialization
|
||||||
|
--------------
|
||||||
|
|
||||||
|
It stands to reason that a failed over Cinder volume service may be restarted,
|
||||||
|
so there needs to be a way for a driver to know on start which storage device
|
||||||
|
should be used to access the resources.
|
||||||
|
|
||||||
|
So, to let drivers know which storage device they should use the manager passes
|
||||||
|
drivers the ``active_backend_id`` argument to their ``__init__`` method during
|
||||||
|
the initialization phase of the driver. Default value is ``None`` when the
|
||||||
|
default (primary) storage device should be used.
|
||||||
|
|
||||||
|
Drivers should store this value if they will need it, as the base driver is not
|
||||||
|
storing it, for example to determine the current storage device when a failover
|
||||||
|
is requested and we are already in a failover state, as mentioned above.
|
||||||
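A minimal sketch of a driver keeping track of this value, assuming ``active_backend_id`` arrives as a keyword argument. ``ExampleDriver`` is hypothetical; a real driver would subclass one of Cinder's base driver classes and call its ``__init__``.

```python
class ExampleDriver(object):
    """Hypothetical driver that remembers which storage device is active."""

    def __init__(self, *args, **kwargs):
        # None means the default (primary) storage device is in use.
        self._active_backend_id = kwargs.get('active_backend_id')

    @property
    def failed_over(self):
        """True when the service was started in a failed-over state."""
        return self._active_backend_id is not None


primary = ExampleDriver()
secondary = ExampleDriver(active_backend_id='vendor-id-1')
print(primary.failed_over, secondary.failed_over)
```

Storing the identifier rather than a boolean lets the driver exclude the current target when a further failover is requested.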
|
|
||||||
|
Freeze / Thaw
|
||||||
|
-------------
|
||||||
|
|
||||||
|
In many cases, after a failover has been completed we'll want to allow changes
|
||||||
|
to the data in the volumes as well as some operations like attach and detach
|
||||||
|
while other operations that modify the number of existing resources, like
|
||||||
|
delete or create, are not allowed.
|
||||||
|
|
||||||
|
And that is where the freezing mechanism comes in; freezing a backend puts the
|
||||||
|
control plane of the specific Cinder volume service into a read only state, or
|
||||||
|
at least most of it, while allowing the data plane to proceed as usual.
|
||||||
|
|
||||||
|
While this will mostly be handled by the Cinder core code, drivers are informed
|
||||||
|
when the freezing mechanism is enabled or disabled via these 2 calls::
|
||||||
|
|
||||||
freeze_backend(self, context)
|
freeze_backend(self, context)
|
||||||
thaw_backend(self, context)
|
thaw_backend(self, context)
|
||||||
|
|
||||||
**replication_failover**
|
In most cases the driver may not need to do anything, and then it doesn't need
|
||||||
|
to define any of these methods as long as it's a child class of the ``BaseVD``
|
||||||
|
class that already implements them as noops.
|
||||||
|
|
||||||
Used to instruct the backend to fail over to the secondary/target device.
|
Raising a ``VolumeDriverException`` exception in any of these methods will result
|
||||||
If not secondary is specified (via backend_id argument) it's up to the driver
|
in a 500 status code response being returned to the caller and the manager will
|
||||||
to choose which device to failover to. In the case of only a single
|
not log the exception, so it's up to the driver to log the error if it is
|
||||||
replication target this argument should be ignored.
|
appropriate.
|
||||||
|
|
||||||
Note that ideally drivers will know how to update the volume reference properly so that Cinder is now
|
If the driver wants to give a more meaningful error response, then it can raise
|
||||||
pointing to the secondary. Also, while it's not required, at this time; ideally the command would
|
other exceptions that have different status codes.
|
||||||
act as a toggle, allowing to switch back and forth between primary and secondary and back to primary.
|
|
||||||
|
|
||||||
Keep in mind the use case is that the backend has died a horrible death and is
|
When creating the ``freeze_backend`` and ``thaw_backend`` driver methods we must
|
||||||
no longer valid. Any volumes that were on the primary and NOT of replication
|
remember that this is a Cloud Administrator operation, so we can return errors
|
||||||
type should now be unavailable.
|
that reveal internals of the cloud, for example the type of storage device, and
|
||||||
|
we must use the appropriate internationalization translation methods when
|
||||||
|
raising exceptions; for ``VolumeDriverException`` no translation is necessary
|
||||||
|
since the manager doesn't log it or return it to the user in any way, but any
|
||||||
|
other exception should use the ``_()`` translation method since it will be
|
||||||
|
returned to the REST API caller.
|
||||||
|
|
||||||
NOTE: We do not expect things like create requests to go to the driver and
|
For example, if a storage device doesn't support the thaw operation when failed
|
||||||
magically create volumes on the replication target. The concept is that the
|
over, then it should raise an ``Invalid`` exception::
|
||||||
backend is lost, and we're just providing a DR mechanism to preserve user data
|
|
||||||
for volumes that were specified as such via type settings.
|
|
||||||
|
|
||||||
**freeze_backend**
|
    def thaw_backend(self, context):
|
||||||
|
        if self.failed_over:
|
||||||
Puts a backend host/service into a R/O state for the control plane. For
|
            msg = _('Thaw is not supported by driver XYZ.')
|
||||||
example if a failover is issued, it is likely desirable that while data access
|
            raise exception.Invalid(msg)
|
||||||
to existing volumes is maintained, it likely would not be wise to continue
|
|
||||||
doing things like creates, deletes, extends etc.
|
|
||||||
|
|
||||||
**thaw_backend**
|
|
||||||
|
|
||||||
Clear frozen control plane on a backend.
|
|
||||||
|