b661d115f5
I just spent a while chasing phantom OOM crashes in the backup agent; I'm hoping this comment will help future users avoid my problem. Change-Id: I58ff10d212283b032cabf623021d3955111ff8f0
266 lines
11 KiB
ReStructuredText
266 lines
11 KiB
ReStructuredText
.. _volume_backups:
|
|
|
|
=========================================
|
|
Back up and restore volumes and snapshots
|
|
=========================================
|
|
|
|
The ``openstack`` command-line interface provides the tools for creating a
|
|
volume backup. You can restore a volume from a backup as long as the
|
|
backup's associated database information (or backup metadata) is intact
|
|
in the Block Storage database.
|
|
|
|
Run this command to create a backup of a volume:
|
|
|
|
.. code-block:: console
|
|
|
|
$ openstack volume backup create [--incremental] [--force] VOLUME
|
|
|
|
Where ``VOLUME`` is the name or ID of the volume, ``incremental`` is
|
|
a flag that indicates whether an incremental backup should be performed,
|
|
and ``force`` is a flag that allows or disallows backup of a volume
|
|
when the volume is attached to an instance.
|
|
|
|
Without the ``incremental`` flag, a full backup is created by default.
|
|
With the ``incremental`` flag, an incremental backup is created.
|
|
|
|
Without the ``force`` flag, the volume will be backed up only if its
|
|
status is ``available``. With the ``force`` flag, the volume will be
|
|
backed up whether its status is ``available`` or ``in-use``. A volume
|
|
is ``in-use`` when it is attached to an instance. The backup of an
|
|
``in-use`` volume means your data is crash consistent. The ``force``
|
|
flag is False by default.
|
|
|
|
.. note::
|
|
|
|
The ``force`` flag is new in OpenStack Liberty.
|
|
|
|
The incremental backup is based on a parent backup which is an existing
|
|
backup with the latest timestamp. The parent backup can be a full backup
|
|
or an incremental backup depending on the timestamp.
|
|
|
|
|
|
.. note::
|
|
|
|
The first backup of a volume has to be a full backup. Attempting to do
|
|
an incremental backup without any existing backups will fail.
|
|
There is an ``is_incremental`` flag that indicates whether a backup is
|
|
incremental when showing details on the backup.
|
|
Another flag, ``has_dependent_backups``, returned when showing backup
|
|
details, will indicate whether the backup has dependent backups.
|
|
If it is ``true``, attempting to delete this backup will fail.
|
|
|
|
A new configure option ``backup_swift_block_size`` is introduced into
|
|
``cinder.conf`` for the default Swift backup driver. This is the size in
|
|
bytes that changes are tracked for incremental backups. The existing
|
|
``backup_swift_object_size`` option, the size in bytes of Swift backup
|
|
objects, has to be a multiple of ``backup_swift_block_size``. The default
|
|
is 32768 for ``backup_swift_block_size``, and the default is 52428800 for
|
|
``backup_swift_object_size``.
|
|
|
|
The configuration option ``backup_swift_enable_progress_timer`` in
|
|
``cinder.conf`` is used when backing up the volume to Object Storage
|
|
back end. This option enables or disables the timer. It is enabled by default
|
|
to send the periodic progress notifications to the Telemetry service.
|
|
|
|
This command also returns a backup ID. Use this backup ID when restoring
|
|
the volume:
|
|
|
|
.. code-block:: console
|
|
|
|
$ openstack volume backup restore BACKUP_ID VOLUME_ID
|
|
|
|
When restoring from a full backup, it is a full restore.
|
|
|
|
When restoring from an incremental backup, a list of backups is built based
|
|
on the IDs of the parent backups. A full restore is performed based on the
|
|
full backup first, then restore is done based on the incremental backup,
|
|
laying on top of it in order.
|
|
|
|
You can view a backup list with the :command:`openstack volume backup list`
|
|
command. Optional arguments to clarify the status of your backups
|
|
include: running ``--name``, ``--status``, and
|
|
``--volume`` to filter through backups by the specified name,
|
|
status, or volume-id. Search with ``--all-projects`` for details of the
|
|
projects associated with the listed backups.
|
|
|
|
Because volume backups are dependent on the Block Storage database, you must
|
|
also back up your Block Storage database regularly to ensure data recovery.
|
|
|
|
.. note::
|
|
|
|
Alternatively, you can export and save the metadata of selected volume
|
|
backups. Doing so precludes the need to back up the entire Block Storage
|
|
database. This is useful if you need only a small subset of volumes to
|
|
survive a catastrophic database failure.
|
|
|
|
If you specify a UUID encryption key when setting up the volume
|
|
specifications, the backup metadata ensures that the key will remain valid
|
|
when you back up and restore the volume.
|
|
|
|
For more information about how to export and import volume backup metadata,
|
|
see the section called :ref:`volume_backups_export_import`.
|
|
|
|
By default, the swift object store is used for the backup repository.
|
|
|
|
If instead you want to use an NFS export as the backup repository, add the
|
|
following configuration options to the ``[DEFAULT]`` section of the
|
|
``cinder.conf`` file and restart the Block Storage services:
|
|
|
|
.. code-block:: ini
|
|
|
|
backup_driver = cinder.backup.drivers.nfs
|
|
backup_share = HOST:EXPORT_PATH
|
|
|
|
For the ``backup_share`` option, replace ``HOST`` with the DNS resolvable
|
|
host name or the IP address of the storage server for the NFS share, and
|
|
``EXPORT_PATH`` with the path to that share. If your environment requires
|
|
that non-default mount options be specified for the share, set these as
|
|
follows:
|
|
|
|
.. code-block:: ini
|
|
|
|
backup_mount_options = MOUNT_OPTIONS
|
|
|
|
``MOUNT_OPTIONS`` is a comma-separated string of NFS mount options as detailed
|
|
in the NFS man page.
|
|
|
|
There are several other options whose default values may be overridden as
|
|
appropriate for your environment:
|
|
|
|
.. code-block:: ini
|
|
|
|
backup_compression_algorithm = zlib
|
|
backup_sha_block_size_bytes = 32768
|
|
backup_file_size = 1999994880
|
|
|
|
The option ``backup_compression_algorithm`` can be set to ``zlib``, ``bz2``,
|
|
``zstd`` or ``none``. The value ``none`` can be a useful setting when the
|
|
server providing the share for the backup repository itself performs
|
|
deduplication or compression on the backup data.
|
|
|
|
The option ``backup_file_size`` must be a multiple of
|
|
``backup_sha_block_size_bytes``. It is effectively the maximum file size to be
|
|
used, given your environment, to hold backup data. Volumes larger than this
|
|
will be stored in multiple files in the backup repository. ``backup_file_size``
|
|
also determines the buffer size used to produce backup files; on smaller hosts
|
|
it may need to be scaled down to avoid OOM issues. The
|
|
``backup_sha_block_size_bytes`` option determines the size of blocks from the
|
|
cinder volume being backed up on which digital signatures are calculated in
|
|
order to enable incremental backup capability.
|
|
|
|
You also have the option of resetting the state of a backup. When creating or
|
|
restoring a backup, sometimes it may get stuck in the creating or restoring
|
|
states due to problems like the database or rabbitmq being down. In situations
|
|
like these resetting the state of the backup can restore it to a functional
|
|
status.
|
|
|
|
Run this command to restore the state of a backup:
|
|
|
|
.. code-block:: console
|
|
|
|
$ cinder backup-reset-state [--state STATE] BACKUP_ID-1 BACKUP_ID-2 ...
|
|
|
|
Run this command to create a backup of a snapshot:
|
|
|
|
.. code-block:: console
|
|
|
|
$ openstack volume backup create [--incremental] [--force] \
|
|
[--snapshot SNAPSHOT_ID] VOLUME
|
|
|
|
Where ``VOLUME`` is the name or ID of the volume, ``SNAPSHOT_ID`` is the ID of
|
|
the volume's snapshot.
|
|
|
|
Cancelling
|
|
----------
|
|
|
|
Since Liberty it is possible to cancel an ongoing backup operation on any of
|
|
the Chunked Backup type of drivers such as Swift, NFS, Google, GlusterFS, and
|
|
Posix.
|
|
|
|
To issue a backup cancellation on a backup we must request a force delete on
|
|
the backup.
|
|
|
|
.. code-block:: console
|
|
|
|
$ openstack volume backup delete --force BACKUP_ID
|
|
|
|
.. note::
|
|
|
|
The policy on force delete defaults to admin only.
|
|
|
|
Even if the backup is immediately deleted, and therefore no longer appears in
|
|
the listings, the cancellation may take a little bit longer, so please check
|
|
the status of the source resource to see when it stops being "backing-up".
|
|
|
|
.. note::
|
|
|
|
Before Pike the "backing-up" status would always be stored in the volume,
|
|
even when backing up a snapshot, so when backing up a snapshot any delete
|
|
operation on the snapshot that followed a cancellation could result in an
|
|
error if the snapshot was still mapped. Polling on the volume to stop being
|
|
"backing-up" prior to the deletion is required to ensure success.
|
|
|
|
Since Rocky it is also possible to cancel an ongoing restoring operation on any
|
|
of the Chunked Backup type of drivers.
|
|
|
|
To issue a backup restoration cancellation we need to alter its status to
|
|
anything other than `restoring`. We strongly recommend using the "error" state
|
|
to avoid any confusion on whether the restore was successful or not.
|
|
|
|
.. code-block:: console
|
|
|
|
$ openstack volume backup set --state error BACKUP_ID
|
|
|
|
.. warning::
|
|
|
|
After a restore operation has started, if it is then cancelled, the
|
|
destination volume is useless, as there is no way of knowing how much data,
|
|
or if any, was actually restored, hence our recommendation of using the
|
|
"error" state.
|
|
|
|
backup_max_operations
|
|
---------------------
|
|
|
|
With this configuration option will let us select the maximum number of
|
|
operations, backup and restore, that can be performed concurrently.
|
|
|
|
This option has a default value of 15, which means that we can have 15
|
|
concurrent backups, or 15 concurrent restores, or any combination of backups
|
|
and restores as long as the sum of the 2 operations don't exceed 15.
|
|
|
|
The concurrency limitation of this configuration option is also enforced when
|
|
we run multiple processes for the same backup service using the
|
|
``backup_workers`` configuration option. It is not a per process restriction,
|
|
but global to the service, so we won't be able to run ``backup_max_operations``
|
|
on each one of the processes, but on all the running processes from the same
|
|
backup service.
|
|
|
|
Backups and restore operations are both CPU and memory intensive, but thanks to
|
|
this option we can limit the concurrency and prevent DoS attacks or just
|
|
service disruptions caused by many concurrent requests that lead to Out of
|
|
Memory (OOM) kills.
|
|
|
|
The amount of memory (RAM) used during the operation depends on the configured
|
|
chunk size as well as the compression ratio achieved on the data during the
|
|
operation.
|
|
|
|
Example:
|
|
|
|
Let's have a look at how much memory would be needed if we use the default
|
|
backup chunk size (~1.86 GB) while doing a restore to an RBD volume from a
|
|
non Ceph backend (Swift, NFS etc).
|
|
|
|
In a restore operation the worst case scenario, from the memory point of
|
|
view, is when the compression ratio is close to 0% (the compressed data chunk
|
|
is almost the same size as the uncompressed data).
|
|
|
|
In this case the memory usage would be ~5.58 GB of data for each chunk:
|
|
~5.58 GB = read buffer + decompressed buffer + write buffer used by the
|
|
librbd library = ~1.86 GB + 1.86 GB + 1.86 GB
|
|
|
|
For 15 concurrent restore operations, the cinder-backup service will require
|
|
~83.7 GB of memory.
|
|
|
|
Similar calculations can be done for environment specific scenarios and this
|
|
config option can be set accordingly.
|