Merge "Follow-up to rolling upgrade docs"
commit aa1abbdcd6
@@ -40,6 +40,14 @@ Plan your upgrade
   database. Hence, in case of upgrade failure, restoring the database from
   a backup is the only choice.

+* Before starting your upgrade, it is best to ensure that all nodes have
+  reached, or are in, a stable ``provision_state``. Nodes in states with
+  long running processes such as deploying or cleaning, may fail, and may
+  require manual intervention to return them to the available hardware pool.
+  This is most likely in cases where a timeout has occurred or a service was
+  terminated abruptly. For a visual diagram detailing states and possible
+  state transitions, please see the
+  `state machine diagram <https://docs.openstack.org/ironic/latest/contributor/states.html>`_.

 Offline upgrades
 ================
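The pre-upgrade check added in this hunk can be sketched as a small filter over node records. This is only an illustration: the set of stable states below is an assumption taken from the state machine diagram and should be checked against the release in use, and ``nodes_not_stable`` and the sample records are hypothetical names, not part of ironic.

```python
# Assumption: these are the stable provision states shown in the state
# machine diagram; verify the list against the release you are running.
STABLE_STATES = frozenset(
    {"enroll", "manageable", "available", "active", "error", "rescue"}
)


def nodes_not_stable(nodes):
    """Return (name, provision_state) pairs for nodes outside STABLE_STATES.

    `nodes` is any iterable of mappings with "name" and "provision_state"
    keys, for example rows parsed from a node listing exported as JSON.
    """
    return [
        (node["name"], node["provision_state"])
        for node in nodes
        if node["provision_state"] not in STABLE_STATES
    ]


if __name__ == "__main__":
    sample = [
        {"name": "node-0", "provision_state": "active"},
        {"name": "node-1", "provision_state": "deploying"},
        {"name": "node-2", "provision_state": "cleaning"},
    ]
    for name, state in nodes_not_stable(sample):
        print(f"{name} is in state {state!r}; wait before upgrading")
```

An empty result suggests that no node is in the middle of a long running operation, so it is a reasonable moment to begin the upgrade.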
@@ -75,21 +83,34 @@ Once the above is done, do the following:
   limits the number of migrations executed in one run. You should complete
   all of the migrations as soon as possible after the upgrade.

-  .. warning:: You will not be able to start an upgrade to the next release
-               after this one, until this has been completed for the current
-               release. For example, as part of upgrading from Ocata to Pike,
-               you need to complete Pike's data migrations. If this is not done,
-               you will not be able to upgrade to Queens -- it will not be
-               possible to execute Queens' database schema updates.
+  .. warning::
+     You will not be able to start an upgrade to the release
+     after this one, until this has been completed for the current
+     release. For example, as part of upgrading from Ocata to Pike,
+     you need to complete Pike's data migrations. If this is not done,
+     you will not be able to upgrade to Queens -- it will not be
+     possible to execute Queens' database schema updates.


 Rolling upgrades
 ================

-Rolling upgrades are available starting with the Pike release; that is, when
-upgrading from Ocata. This means that it is possible to do an upgrade with
+To reduce downtime, the services can be upgraded in a rolling fashion, that is,
+by upgrading one or a few services at a time to minimize impact.
+
+Rolling upgrades are available starting with the Pike release. This feature
+makes it possible to upgrade between releases, such as Ocata to Pike, with
 minimal to no downtime of the Bare Metal API.

+Requirements
+------------
+
+To facilitate an upgrade in a rolling fashion, you need to have a
+highly-available deployment consisting of at least two ironic-api
+and two ironic-conductor services.
+Use of a load balancer to balance requests across the ironic-api
+services is recommended, as it allows for a minimal impact to end users.
+
 Concepts
 --------

@@ -110,19 +131,27 @@ that the older services are using. The newer services will backport RPC calls
 and objects to their appropriate versions from the pinned release. If the
 ``IncompatibleObjectVersion`` exception occurs, it is most likely due to an
 incorrect or unspecified ``[DEFAULT]/pin_release_version`` configuration value.
-For example, when it is not set to the older release version, no conversion
-will happen during the upgrade.
+For example, when ``[DEFAULT]/pin_release_version`` is not set to the older
+release version, no conversion will happen during the upgrade.

 Online data migrations
 ~~~~~~~~~~~~~~~~~~~~~~

-To make database schema migrations less painful to execute, all data migrations
-are banned from schema migration scripts. The schema migration scripts only
-update the database schema. Data migrations must be done at the end of the
-rolling upgrade process, after the schema migration and after the services
-have been upgraded to the latest release. The data migration is performed
-using the ``ironic-dbsync online_data_migrations`` command. It can be run in
-a background process so that it does not interrupt running services.
+To make database schema migrations less painful to execute, we have
+implemented process changes to facilitate upgrades.
+
+* All data migrations are banned from schema migration scripts.
+* Schema migration scripts only update the database schema.
+* Data migrations must be done at the end of the rolling upgrade process,
+  after the schema migration and after the services have been upgraded to
+  the latest release.
+
+All data migrations are performed using the
+``ironic-dbsync online_data_migrations`` command. It can be run as
+a background process so that it does not interrupt running services;
+however, it must be run to completion for a cold upgrade if the intent
+is to make use of new features immediately.
+
 (You would also execute the same command with services turned off if
 you are doing a cold upgrade).

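Running the data migrations to completion can be automated with a small wrapper around ``ironic-dbsync online_data_migrations``. The sketch below rests on assumptions that should be verified against the installed release: that ``--max-count`` caps the work done per run, that exit status 0 means nothing is left to migrate, and that exit status 1 means more work remains. The function name and the injectable ``run`` parameter are illustrative only.

```python
import subprocess


def run_online_data_migrations(max_count=50, max_runs=1000, run=subprocess.call):
    """Run data migrations in batches until the command reports completion.

    Assumptions (verify for your release): --max-count limits the work done
    in one run, status 0 means all migrations are complete, and status 1
    means some are still pending. Any other status is treated as a failure.
    """
    cmd = ["ironic-dbsync", "online_data_migrations", "--max-count", str(max_count)]
    for attempt in range(1, max_runs + 1):
        status = run(cmd)
        if status == 0:
            return attempt  # number of batches that were needed
        if status != 1:
            raise RuntimeError(f"online_data_migrations failed with status {status}")
    raise RuntimeError(f"migrations still pending after {max_runs} runs")
```

Passing a different ``run`` callable makes it possible to dry-run the loop without touching a real database; small batches keep each run short so that running services are not starved.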
@@ -132,8 +161,8 @@ Pike but did not do the data migrations, you will not be able to upgrade from
 Pike to Queens. (More precisely, you will not be able to apply Queens' schema
 migrations.)

-Graceful service shutdown
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Graceful conductor service shutdown
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 The ironic-conductor service is a Python process listening for messages on a
 message queue. When the operator sends the SIGTERM signal to the process, the
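The general pattern behind a graceful shutdown on ``SIGTERM`` can be illustrated with a toy worker loop. This is not ironic's implementation; it is a minimal sketch in which ``fetch_message`` and ``handle`` are hypothetical stand-ins for a messaging layer, showing only that a request already being processed is finished before the loop exits.

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """SIGTERM handler: only record the request; the worker loop does the rest."""
    stop_requested.set()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, request_stop)


def serve(fetch_message, handle):
    """Process messages until a stop is requested, never abandoning one midway.

    fetch_message() returns the next message, or None when the queue is idle.
    Both callables are hypothetical placeholders for a real messaging layer.
    """
    handled = 0
    while not stop_requested.is_set():
        message = fetch_message()
        if message is None:
            stop_requested.wait(0.1)  # idle: sleep briefly, wake early on stop
            continue
        handle(message)  # runs to completion even if SIGTERM arrives meanwhile
        handled += 1
    return handled
```

Because the handler only sets a flag, the message being handled when the signal arrives is completed, and anything still queued is left for another worker to pick up.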
@@ -147,6 +176,12 @@ older code, and start up a service using newer code with minimal impact.
 This was tested with RabbitMQ messaging backend and may vary with other
 backends.

+Nodes that are being acted upon by an ironic-conductor process, which are
+not in a stable state, may encounter failures. Node failures that occur
+during an upgrade are likely due to timeouts, resulting from delays
+involving messages being processed and acted upon by a conductor
+during long running, multi-step processes such as deployment or cleaning.
+
 API load balancer draining
 ~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -157,13 +192,9 @@ services that have not yet been upgraded.
 Rolling upgrade process
 -----------------------

-To reduce downtime, the services can be upgraded in a rolling fashion. It means
-upgrading one or a few services at a time. To minimise downtime you need to
-have HA ironic deployment (at least two ironic-api and two ironic-conductor
-services) so that when a service instance is being upgraded, the other
-instances are still running.
-
-**New features should not be used until after the upgrade has been completed.**
+.. warning::
+   New features and/or new API versions should not be used until after the upgrade
+   has been completed.

 Before maintenance window
 ~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -183,23 +214,25 @@ Before maintenance window
   database schema changes are done in a way that both the old and new (N and
   N+1) releases can perform operations against the same schema.

-.. note:: Ironic bases its RPC and object storage format versions on the
-          ``[DEFAULT]/pin_release_version`` configuration option. It is
-          advisable to automate the deployment of changes in configuration
-          files to make the process less error prone and repeatable.
+.. note::
+   Ironic bases its RPC and object storage format versions on the
+   ``[DEFAULT]/pin_release_version`` configuration option. It is
+   advisable to automate the deployment of changes in configuration
+   files to make the process less error prone and repeatable.

 During maintenance window
 ~~~~~~~~~~~~~~~~~~~~~~~~~

-#. ironic-conductor services should be upgraded first. Ensure that at least
-   one ironic-conductor service is running at all times. For every
+#. All ironic-conductor services should be upgraded first. Ensure that at
+   least one ironic-conductor service is running at all times. For every
    ironic-conductor, either one by one or a few at a time:

-   * shut down the service. Conductors are load-balanced by the message queue,
+   * shut down the service. Messages from the ironic-api services to the
+     conductors are load-balanced by the message queue and a hash-ring,
      so the only thing you need to worry about is to shut the service down
      gracefully (using ``SIGTERM`` signal) to make sure it will finish all the
-     requests being processed before shutting down
-   * upgrade the code and dependencies
+     requests being processed before shutting down.
+   * upgrade the installed version of ironic and dependencies
    * set the ``[DEFAULT]/pin_release_version`` configuration option value to
      the version you are upgrading from (that is, the old version). Based on
      this setting, the new ironic-conductor services will downgrade any
@@ -210,15 +243,15 @@ During maintenance window

 #. The next service to upgrade is ironic-api. Ensure that at least one
    ironic-api service is running at all times. You may want to start another
-   instance of the older ironic-api to handle the load while you are upgrading
-   the original ironic-api services. For every ironic-api service, either one
-   by one or a few at a time:
+   temporary instance of the older ironic-api to handle the load while you are
+   upgrading the original ironic-api services. For every ironic-api service,
+   either one by one or a few at a time:

    * in HA deployment you are typically running them behind a load balancer
      (for example HAProxy), so you need to take the service instance out of the
      balancer
    * shut it down
-   * upgrade the code and dependencies
+   * upgrade the installed version of ironic and dependencies
    * set the ``[DEFAULT]/pin_release_version`` configuration option value to
      the version you are upgrading from (that is, the old version). Based on
      this setting, the new ironic-api services will downgrade any RPC
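Setting ``[DEFAULT]/pin_release_version`` on each upgraded service can be scripted, which fits the earlier advice to automate configuration changes. The following is a rough sketch using the standard library ``configparser``; the function name is hypothetical, the value ``ocata`` in the usage note is an assumed example for an Ocata to Pike upgrade, and real deployments would more likely use their configuration management templates.

```python
import configparser


def set_pin_release_version(conf_path, old_release):
    """Write [DEFAULT]/pin_release_version = old_release into an INI-style file.

    Caveats of this sketch: configparser drops comments when it rewrites the
    file, and with strict=False a repeated option keeps only its last value,
    so this is unsuitable for files that rely on multi-valued options.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(conf_path)
    parser["DEFAULT"]["pin_release_version"] = old_release
    with open(conf_path, "w") as handle:
        parser.write(handle)
```

For example, ``set_pin_release_version("/etc/ironic/ironic.conf", "ocata")`` would pin a freshly upgraded service to the older release, assuming that path and value match the deployment. The pin has to be removed again once every service runs the new release.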
@@ -258,7 +291,7 @@ release.
   performed.

 * Upgrade ``python-ironicclient`` along with other services connecting
-  to the Bare Metal service as a client, such as nova-compute.
+  to the Bare Metal service as a client, such as ``nova-compute``.

 * Run the ``ironic-dbsync online_data_migrations`` command to make sure
   that data migrations are applied. The command lets you limit
@@ -266,11 +299,12 @@ release.
   limits the number of migrations executed in one run. You should complete
   all of the migrations as soon as possible after the upgrade.

-  Note that you will not be able to start an upgrade to the next release after
-  this one, until this has been completed for the current release. For example,
-  as part of upgrading from Ocata to Pike, you need to complete Pike's data
-  migrations. If this is not done, you will not be able to upgrade to Queens --
-  it will not be possible to execute Queens' database schema updates.
+  .. warning::
+     Note that you will not be able to start an upgrade to the next release after
+     this one, until this has been completed for the current release. For example,
+     as part of upgrading from Ocata to Pike, you need to complete Pike's data
+     migrations. If this is not done, you will not be able to upgrade to Queens --
+     it will not be possible to execute Queens' database schema updates.

 Upgrading from Ocata to Pike
 ============================