This change enables the use of Docker healthchecks for core OpenStack
services.
In addition, check-failures.sh has been updated to treat containers
with unhealthy status as failed.
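As a rough illustration of the check described above (not the actual
check-failures.sh logic), the sketch below marks a container as failed
when it has exited or when its healthcheck reports unhealthy. The
function name and the status strings are assumptions, modelled on the
STATUS column of `docker ps -a`.

```python
def failed_containers(statuses):
    """Return the sorted names of containers considered failed.

    `statuses` maps container name -> status string. The format is an
    assumption for illustration, e.g. "Up 2 minutes (unhealthy)".
    """
    failed = []
    for name, status in sorted(statuses.items()):
        lowered = status.lower()
        # Exited containers were already failures; unhealthy ones now count too.
        if lowered.startswith("exited") or "(unhealthy)" in lowered:
            failed.append(name)
    return failed
```

A container reporting "(healthy)" is not matched, since the check looks
for the full "(unhealthy)" marker.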
Implements: blueprint container-health-check
Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb
Keepalived and haproxy cooperate to provide control plane HA in
kolla-ansible deployments.
Care must be taken to avoid prolonged loss of availability
during reconfigurations and upgrades.
This patch aims to provide that care.
There is nothing special about keepalived upgrade compared to
reconfig, hence it is simplified to run the same code as for
deploy.
The broken safe-upgrade logic is replaced by common handler
code whose goal is to ensure that we take down the current master
only after the backups are ready.
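The ordering idea can be sketched as follows. This is only an
illustration in Python; the real handlers are Ansible code, and the
function and host names here are hypothetical.

```python
def restart_order(hosts, current_master):
    """Order hosts so that backups are handled first and the master last."""
    if current_master not in hosts:
        raise ValueError(f"{current_master!r} is not among the given hosts")
    # Backups keep their relative order; the master is appended at the end,
    # so it is only taken down once every backup has been handled.
    backups = [host for host in hosts if host != current_master]
    return backups + [current_master]
```

For example, restart_order(["ctl1", "ctl2", "ctl3"], "ctl2") yields
ctl1, ctl3, ctl2.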
This change introduces a switch to the kolla_docker module, named
ignore_missing, that allows missing containers to be ignored (as they
are logically stopped).
All tests are included.
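A minimal sketch of the ignore_missing semantics, assuming a simplified
model in which running containers are tracked as a set of names. This is
not the kolla_docker implementation; stop_container and ContainerNotFound
are hypothetical names.

```python
class ContainerNotFound(Exception):
    """Raised when a container to stop does not exist (hypothetical)."""


def stop_container(name, running, ignore_missing=False):
    """Stop `name` if it is in `running`; return True if anything changed."""
    if name not in running:
        if ignore_missing:
            # A missing container is treated as already stopped: no change.
            return False
        raise ContainerNotFound(name)
    running.remove(name)
    return True
```

With ignore_missing=True, stopping an absent container is a no-op rather
than an error, which is what a handler needs when a host never ran the
service.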
Change-Id: I22ddec5f7ee4a7d3d502649a158a7e005fe29c48
This patchset implements:
- network (lb-mgmt-net)
- security groups and rules (used by amphora and health manager)
- amphora flavor (used by amphora)
- nova keypair (used by amphora at the time of debugging)
Add an octavia_amp_listen_port variable, which is used by amphora.
Add amp_image_owner_id to octavia.conf.
Implements: blueprint implement-automatic-deploy-of-octavia
Co-Authored-By: zhangchun <zhangchun@yovole.com>
Depends-On: https://review.opendev.org/652030
Change-Id: I67009d046925cfc02c1e0073c80085c1471975f6
keystone-startup.sh uses fernet_token_expiry instead of
fernet_key_rotation_interval, which results in a restart loop of
keystone containers when they are restarted after 2-3 days.
Closes-Bug: #1895723
Change-Id: Ifff77af3d25d9dc659fff34f2ae3c6f2670df0f4
This patch introduces an optional backend encryption for the Ironic API
service. When used in conjunction with enabling TLS for service API
endpoints, network communication will be encrypted end to end, from
client through HAProxy to the Ironic service.
Change-Id: I9edf7545c174ca8839ceaef877bb09f49ef2b451
Partially-Implements: blueprint add-ssl-internal-network
If the common role is executed against a set of hosts that are not all
in the fluentd group, the run_once tasks that find customisations may be
skipped. This causes a later failure when accessing the registered
variables for those tasks.
This issue was raised on the mailing list:
http://lists.openstack.org/pipermail/openstack-discuss/2020-September/016932.html
This issue only affects the master branch, due to the addition of groups
for the common role in I6a4676bf6efeebc61383ec7a406db07c7a868b2a.
This change fixes the issue by always running the find tasks, if fluentd
is enabled.
Change-Id: I559c4b94d18c7f36d43e1d88629ed44668abf859
When the internal VIP is moved in the event of a failure of the active
controller, OpenStack services can become unresponsive as they try to
talk with MariaDB using connections from the SQLAlchemy pool.
It has been argued that OpenStack doesn't really need to use connection
pooling with MariaDB [1]. This commit reduces the use of connection
pooling via two configuration options:
- max_pool_size is set to 1 to allow only a single connection in the
pool (it is not possible to disable connection pooling entirely via
oslo.db, and max_pool_size = 0 means unlimited pool size)
- lower connection_recycle_time from the default of one hour to 10
seconds, which means the single connection in the pool will be
recreated regularly
These settings have shown better reactivity of the system in the event
of a failover.
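For illustration, the two options could be rendered into a [database]
section as in the sketch below. The helper name and the connection URL
are placeholders; only max_pool_size and connection_recycle_time and
their values come from this change.

```python
import configparser
import io


def render_database_section(connection_url):
    """Render a [database] section with the reduced pooling settings."""
    # Interpolation is disabled so '%' characters in a URL are kept as-is.
    config = configparser.ConfigParser(interpolation=None)
    config["database"] = {
        "connection": connection_url,      # placeholder value
        "max_pool_size": "1",              # single pooled connection
        "connection_recycle_time": "10",   # seconds, down from one hour
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

Calling it with a URL such as
"mysql+pymysql://user:secret@192.0.2.10:3306/nova" produces a section
that can be compared against the generated service configuration.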
[1] http://lists.openstack.org/pipermail/openstack-dev/2015-April/061808.html
Change-Id: Ib6a62d4428db9b95569314084090472870417f3d
Closes-Bug: #1896635
This allows for more config flexibility - e.g. running multiple
backends with a common frontend.
Note this is a building block for future work on letsencrypt
validator (which should offer backend and share frontend with
any service running off 80/443 - which would be only horizon
in the current default config), as well as any work towards
single port (that is single frontend) and multiple services
anchored at paths of it (which is the new recommended default).
Change-Id: Ie088fcf575e4b5e8775f1f89dd705a275725e26d
Partially-Implements: blueprint letsencrypt-https
This allows for more config flexibility - e.g. running multiple
backends with a common frontend.
It is not possible with the 'listen' approach (which enforces
frontend).
Additionally, it does not really make sense to support two ways
to do the exact same thing as the process is automated and
'listen' is really meant for humans not willing to write separate
sections.
Hence this deprecates the 'listen' variant.
At the moment both templates work exactly the same.
The real flexibility comes in the following patches.
Note this is a building block for future work on letsencrypt
validator (which should offer backend and share frontend with
any service running off 80/443 - which would be only horizon
in the current default config), as well as any work towards
single port (that is single frontend) and multiple services
anchored at paths of it (which is the new recommended default).
Change-Id: I2362aaa3e8069fe146d42947b8dddf49376174b5
Partially-Implements: blueprint letsencrypt-https
Currently there is no option to set container_proxy only for one service
(e.g. magnum). This change adds this option.
Change-Id: Ia938ee660ebe8ce84321f721b6292b0b58a06e20
This change adds support for encryption of communication between
OpenStack services and RabbitMQ. Server certificates are supported, but
currently client certificates are not.
The kolla-ansible certificates command has been updated to support
generating certificates for RabbitMQ for development and testing.
RabbitMQ TLS is enabled in the all-in-one source CI jobs, or when
the Zuul 'tls_enabled' variable is true.
Change-Id: I4f1d04150fb2b5af085b762890092f87ae6076b5
Implements: blueprint message-queue-ssl-support
Since change [1] merged, we have two mariadb images (mariadb and mariadb-server).
Let's use mariadb-server in kolla-ansible, so we can deprecate the mariadb image.
[1]: https://review.opendev.org/#/c/710217/
Change-Id: I4ae2ccaaba8fb516f469f4ce8628e8c61de03f0d
Replace the 'openstack aggregate create' command with the Ansible
os_nova_host_aggregate module and remove the TODO.
Change-Id: I727f9e4acc9e22f59735c65190ac38cc75e5f781
This reverts commit 316b0496b3dd7a9b33692b171391d9d17d535116, because
ironic-inspector is not ready to use WSGI. It would need to be split
into two separate containers, one running ironic-inspector-api-wsgi and
another running ironic-inspector-conductor.
Change-Id: I7e6c59dc8ad4fdee0cc6d96313fe66bc1d001bf7
The Prometheus OpenStack exporter was needlessly configured to use the
prometheus Docker volume and change permissions of /data, which does
not exist in the container image.
This must have been copy-pasted from existing Prometheus code.
Change-Id: I96017c17e68ca7a00a2d5ac41f2f43ef87694514
This patch introduces an optional backend encryption for the Ironic API
and Ironic Inspector service. When used in conjunction with enabling
TLS for service API endpoints, network communication will be encrypted
end to end, from client through HAProxy to the Ironic service.
Change-Id: I3e82c8ec112e53f907e89fea0c8c849072dcf957
Partially-Implements: blueprint add-ssl-internal-network
Depends-On: https://review.opendev.org/#/c/742776/
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the register.yml and bootstrap.yml
includes, all of the tasks in the included file use run_once: True.
The run_once flag improves performance at scale drastically, so
importing these tasks unconditionally will have a lower overhead than a
conditional include task. It therefore makes sense to switch to use
import_tasks there.
See [1] for benchmarks of run_once.
[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/run-once.md
Change-Id: Ic67631ca3ea3fb2081a6f8978e85b1522522d40d
Partially-Implements: blueprint performance-improvements
Including tasks has a performance penalty when compared with importing
tasks. The nova-cell role uses include_tasks twice when generating
certificates and keys for libvirt TLS. While a dynamic include makes
sense here for a non-default feature, we can use one include rather than
two with the same effect. Since this task runs against compute nodes the
overhead is significant.
See [1] for benchmarks of include_tasks and import_tasks.
[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md
Partially-Implements: blueprint performance-improvements
Change-Id: Ic687d2f7d4625aede386e576ebb174da72142756