5610 Commits

Author SHA1 Message Date
Zuul
5e638b757b Merge "Use Docker healthchecks for core services" 2020-10-06 08:26:21 +00:00
Michal Nasiadka
a220c81fb4 horizon: stop using deprecated django.py
[1]: https://review.opendev.org/#/c/561802/

Change-Id: Id335502ad464aa417162b2576ffae3818d30cba1
2020-10-05 12:46:49 +02:00
Michal Nasiadka
c52a89ae04 Use Docker healthchecks for core services
This change enables the use of Docker healthchecks for core OpenStack
services.
Also check-failures.sh has been updated to treat containers with
unhealthy status as failed.

Implements: blueprint container-health-check
Change-Id: I79c6b11511ce8af70f77e2f6a490b59b477fefbb
2020-10-05 08:35:47 +00:00
Radosław Piliszek
c2d0bf30ea Coordinate haproxy and keepalived restarts
Keepalived and haproxy cooperate to provide control plane HA in
kolla-ansible deployments.
Certain care should be exerted to avoid prolonged availability
loss during reconfigurations and upgrades.
This patch aims to provide this care.
There is nothing special about keepalived upgrade compared to
reconfig, hence it is simplified to run the same code as for
deploy.
The broken logic of safe upgrade is replaced by common handler
code which's goal is to ensure we down current master only after
we have backups ready.

This change introduces a switch to kolla_docker module that allows
to ignore missing containers (as they are logically stopped).
ignore_missing is the switch's name.
All tests are included.

Change-Id: I22ddec5f7ee4a7d3d502649a158a7e005fe29c48
2020-10-04 16:58:24 +02:00
Zuul
4c4ad2b87b Merge "Implement automatic deploy of octavia" 2020-10-02 15:04:46 +00:00
wu.chunyang
4a58f4238c Implement automatic deploy of octavia
this patchset has implemented:
  - network (lb-mgmt-net)
  - security groups and rules (used by amphora and health manager)
  - amphora flavor (used by amphora)
  - nova keypair (used by amphora at the time of debugging)

Add a octavia_amp_listen_port variable which used by amphora
Add amp_image_owner_id in octavia.conf

Implements: blueprint implement-automatic-deploy-of-octavia
Co-Authored-By: zhangchun <zhangchun@yovole.com>

Depends-On: https://review.opendev.org/652030

Change-Id: I67009d046925cfc02c1e0073c80085c1471975f6
2020-10-02 14:05:00 +02:00
Zuul
586357ca74 Merge "Change the default haproxy template to split variant" 2020-10-01 12:49:02 +00:00
Radosław Piliszek
8d2d37064e Control Neutron migrations
Since [1] and [2] merged, K-A has to control Neutron migrations
to migrate all required projects.

This patch additionally fixes the other observed issue.

[1] https://review.opendev.org/750075
[2] https://review.opendev.org/753543

Change-Id: I09e1b421e9066890b50bd82331a3050de252464f
Closes-Bug: #1894380
Depends-On: https://review.opendev.org/755346
2020-10-01 10:13:19 +02:00
Zuul
4441038e29 Merge "Make keep-alive timeout configurable" 2020-09-30 17:01:56 +00:00
Zuul
ba933f16e9 Merge "Support TLS encryption of RabbitMQ client-server traffic" 2020-09-29 11:31:03 +00:00
Michal Nasiadka
883b79a1a5 [baremetal]: Use $releasever in docker-ce repo
Update to CentOS 8 versions of packages in docker-ce repo (that are
now available)

Change-Id: I50d28ea31c3c29322974b91a72a2bd7999324ac7
2020-09-28 17:27:23 +00:00
Zuul
0dd44b7675 Merge "Reduce the use of SQLAlchemy connection pooling" 2020-09-28 17:14:55 +00:00
Radosław Piliszek
2fd72a39e9 Add support for ACME http-01 challenge
All docs are included.

Change-Id: Ie29ff7ca340812c8dc0dac493518c87cf7bf137b
Partially-Implements: blueprint letsencrypt-https
2020-09-26 20:29:20 +02:00
Zuul
29b2d4284a Merge "Fix keystone-startup.sh" 2020-09-25 13:44:13 +00:00
Zuul
07cbec194f Merge "Add support for encrypting Ironic API" 2020-09-25 11:47:49 +00:00
Michal Nasiadka
d78673e77f Fix keystone-startup.sh
keystone-startup.sh is using fernet_token_expiry instead of
fernet_key_rotation_interval - which effects in restart loop of keystone
containers - when restarted after 2-3 days.

Closes-Bug: #1895723

Change-Id: Ifff77af3d25d9dc659fff34f2ae3c6f2670df0f4
2020-09-25 10:19:44 +00:00
James Kirsch
7c2df87ded Add support for encrypting Ironic API
This patch introduces an optional backend encryption for the Ironic API
service. When used in conjunction with enabling TLS for service API
endpoints, network communcation will be encrypted end to end, from
client through HAProxy to the Ironic service.

Change-Id: I9edf7545c174ca8839ceaef877bb09f49ef2b451
Partially-Implements: blueprint add-ssl-internal-network
2020-09-24 10:09:13 -07:00
Zuul
43a0a1ca3d Merge "Allow setting container_proxy per service" 2020-09-24 10:05:50 +00:00
Zuul
01a47b927d Merge "Bump minimum Ansible version to 2.9" 2020-09-24 09:40:15 +00:00
Pierre Riteau
c5c6d995d3 Bump minimum Ansible version to 2.9
Change-Id: I5befc72a4894d625ca352b27df9d3aa84a2f5b2c
2020-09-23 17:48:01 +02:00
Mark Goddard
6882013309 Fix common role when using external mariadb
If the common role is executed against a set of hosts that are not all
in the fluentd group, the run_once tasks that find customisations may be
skipped. This causes a later failure when accessing the registered
variables for those tasks.

This issue was raised on the mailing list:
http://lists.openstack.org/pipermail/openstack-discuss/2020-September/016932.html

This issue only affects the master branch, due to addition of groups
for the common role in I6a4676bf6efeebc61383ec7a406db07c7a868b2a.

This change fixes the issue by always running the find tasks, if fluentd
is enabled.

Change-Id: I559c4b94d18c7f36d43e1d88629ed44668abf859
2020-09-22 17:32:37 +01:00
Pierre Riteau
c81772024c Reduce the use of SQLAlchemy connection pooling
When the internal VIP is moved in the event of a failure of the active
controller, OpenStack services can become unresponsive as they try to
talk with MariaDB using connections from the SQLAlchemy pool.

It has been argued that OpenStack doesn't really need to use connection
pooling with MariaDB [1]. This commit reduces the use of connection
pooling via two configuration options:

- max_pool_size is set to 1 to allow only a single connection in the
  pool (it is not possible to disable connection pooling entirely via
  oslo.db, and max_pool_size = 0 means unlimited pool size)
- lower connection_recycle_time from the default of one hour to 10
  seconds, which means the single connection in the pool will be
  recreated regularly

These settings have shown better reactivity of the system in the event
of a failover.

[1] http://lists.openstack.org/pipermail/openstack-dev/2015-April/061808.html

Change-Id: Ib6a62d4428db9b95569314084090472870417f3d
Closes-Bug: #1896635
2020-09-22 17:54:45 +02:00
Radosław Piliszek
3916c156be Add support for with_frontend and with_backend
This allows for more config flexibility - e.g. running multiple
backends with a common frontend.

Note this is a building block for future work on letsencrypt
validator (which should offer backend and share frontend with
any service running off 80/443 - which would be only horizon
in the current default config), as well as any work towards
single port (that is single frontend) and multiple services
anchored at paths of it (which is the new recommended default).

Change-Id: Ie088fcf575e4b5e8775f1f89dd705a275725e26d
Partially-Implements: blueprint letsencrypt-https
2020-09-22 17:26:42 +02:00
Radosław Piliszek
9451ac61a0 Change the default haproxy template to split variant
This allows for more config flexibility - e.g. running multiple
backends with a common frontend.
It is not possible with the 'listen' approach (which enforces
frontend).
Additionally, it does not really make sense to support two ways
to do the exact same thing as the process is automated and
'listen' is really meant for humans not willing to write separate
sections.
Hence this deprecates 'listen' variant.

At the moment both templates work exactly the same.
The real flexibility comes in following patches.

Note this is a building block for future work on letsencrypt
validator (which should offer backend and share frontend with
any service running off 80/443 - which would be only horizon
in the current default config), as well as any work towards
single port (that is single frontend) and multiple services
anchored at paths of it (which is the new recommended default).

Change-Id: I2362aaa3e8069fe146d42947b8dddf49376174b5
Partially-Implements: blueprint letsencrypt-https
2020-09-22 16:49:36 +02:00
Radosław Piliszek
a45ef7ccaa Fix default mode in haproxy_single_service_split
haproxy_single_service_listen (the default template) was already fine.

Closes-Bug: #1896591
TrivialFix

Change-Id: Id68fe19ea87565aa36fb74f2a2ca66cb951169f6
2020-09-22 11:58:38 +02:00
Michal Nasiadka
f257e79aff Allow setting container_proxy per service
Currently there is no option to set container_proxy only for one service
(e.g. magnum). This change adds this option.

Change-Id: Ia938ee660ebe8ce84321f721b6292b0b58a06e20
2020-09-22 10:54:40 +02:00
Radosław Piliszek
bce266201b Allow to skip and unset sysctl vars
via KOLLA_SKIP and KOLLA_UNSET

Change-Id: I7d9af21c2dd8c303066eb1ee4dff7a72bca24283
Related-Bug: #1837551
2020-09-21 13:13:58 +02:00
Radosław Piliszek
6be51fa67a Add support for changing sysctl.conf path
via kolla_sysctl_conf_path

Change-Id: I09b20fa008a7fecedcb599b4792f24215179b853
2020-09-21 11:47:05 +02:00
Zuul
cccfa8f378 Merge "Fix glance-tls-proxy logrotate and fluentd log permissions" 2020-09-21 09:04:53 +00:00
wu.chunyang
88de8feb7b replace internal with openstack_interface
replace harcode 'internal' with {{ openstack_interface }}

Change-Id: I885622967ffde2a7a1a08fedbde2eb0e4e330e22
2020-09-18 21:42:52 +08:00
Michal Nasiadka
aed9f84fe9 Fix glance-tls-proxy logrotate and fluentd log permissions
Change-Id: Iabc0115d3476a626df134cc70cb473bf6e72487e
Closes-Bug: #1890439
2020-09-18 08:51:36 +00:00
Zuul
91f5861769 Merge "Support neutron_sriov_physnet_mappings to support multiple devices" 2020-09-17 16:53:26 +00:00
Zuul
90e4795f50 Merge "Change mariadb image to mariadb-server" 2020-09-17 16:53:21 +00:00
Bharat Kunwar
c24a280bee Support neutron_sriov_physnet_mappings to support multiple devices
Change-Id: Ifcedcc72307732393a92a702a7567addc043b5b2
2020-09-17 13:26:30 +00:00
Mark Goddard
761ea9a333 Support TLS encryption of RabbitMQ client-server traffic
This change adds support for encryption of communication between
OpenStack services and RabbitMQ. Server certificates are supported, but
currently client certificates are not.

The kolla-ansible certificates command has been updated to support
generating certificates for RabbitMQ for development and testing.

RabbitMQ TLS is enabled in the all-in-one source CI jobs, or when
The Zuul 'tls_enabled' variable is true.

Change-Id: I4f1d04150fb2b5af085b762890092f87ae6076b5
Implements: blueprint message-queue-ssl-support
2020-09-17 12:05:44 +01:00
Zuul
127c3072b4 Merge "replace openstackclient with ansible module" 2020-09-17 11:03:28 +00:00
Zuul
fbef9b36d6 Merge "Performance: use a single config file for fluentd" 2020-09-17 11:03:26 +00:00
Zuul
bc388d5657 Merge "Performance: use a single config file for logrotate" 2020-09-17 10:55:01 +00:00
Michal Nasiadka
a7941e2498 Change mariadb image to mariadb-server
Since change [1] merged we have two mariadb images (mariadb and mariadb-server)
Let's use mariadb-server in kolla-ansible, so we can deprecate mariadb image.

[1]: https://review.opendev.org/#/c/710217/

Change-Id: I4ae2ccaaba8fb516f469f4ce8628e8c61de03f0d
2020-09-17 10:42:21 +00:00
wu.chunyang
0bb16b52f6 replace openstackclient with ansible module
replace 'openstack aggregate create' command with ansible
os_nova_host_aggregate module and remove TODO

Change-Id: I727f9e4acc9e22f59735c65190ac38cc75e5f781
2020-09-17 11:41:27 +08:00
Pierre Riteau
3d30624cc1 Revert "Add support for encrypting Ironic API"
This reverts commit 316b0496b3dd7a9b33692b171391d9d17d535116, because
ironic-inspector is not ready to use WSGI. It would need to be split
into two separate containers, one running ironic-inspector-api-wsgi and
another running ironic-inspector-conductor.

Change-Id: I7e6c59dc8ad4fdee0cc6d96313fe66bc1d001bf7
2020-09-10 15:26:06 +00:00
Zuul
ec34132b25 Merge "Synchronize REST_API_REQUIRED_SETTINGS with Horizon" 2020-09-09 09:17:35 +00:00
Zuul
f10b5336cc Merge "Set neutron-ovn-metadata-agent metadata_workers to 2" 2020-09-09 09:03:59 +00:00
Zuul
b9fd7d8e92 Merge "Add support for encrypting Ironic API" 2020-09-08 08:50:50 +00:00
Michal Nasiadka
dcc417dbec Set neutron-ovn-metadata-agent metadata_workers to 2
As per [1] and [2] - it solves a problem, where neutron-ovn-metadata-agent will
spawn high number of workers (defaults to half number of CPUs).

[1]: http://lists.openstack.org/pipermail/openstack-discuss/2020-September/016960.html
[2]: https://bugs.launchpad.net/neutron/+bug/1893656

Change-Id: Id69f9399fe76ff7c4e2e17b5ab5ec7df1a01c5c9
2020-09-07 10:22:57 +00:00
Pierre Riteau
295f8d1b43 Remove unused configuration for prometheus-openstack-exporter
The Prometheus OpenStack exporter was needlessly configured to use the
prometheus Docker volume and change permissions of /data, which does
not exist in the container image.

This must have been copy-pasted from existing Prometheus code.

Change-Id: I96017c17e68ca7a00a2d5ac41f2f43ef87694514
2020-09-01 14:15:52 +02:00
Zuul
3316daad83 Merge "Performance: use import_tasks for register and bootstrap" 2020-08-31 11:30:59 +00:00
James Kirsch
316b0496b3 Add support for encrypting Ironic API
This patch introduces an optional backend encryption for the Ironic API
and Ironic Inspector service. When used in conjunction with enabling
TLS for service API endpoints, network communcation will be encrypted
end to end, from client through HAProxy to the Ironic service.

Change-Id: I3e82c8ec112e53f907e89fea0c8c849072dcf957
Partially-Implements: blueprint add-ssl-internal-network
Depends-On: https://review.opendev.org/#/c/742776/
2020-08-29 15:25:49 +00:00
Mark Goddard
496904d650 Performance: use import_tasks for register and bootstrap
Including tasks has a performance penalty when compared with importing
tasks. If the include has a condition associated with it, then the
overhead of the include may be lower than the overhead of skipping all
imported tasks. In the case of the register.yml and bootstrap.yml
includes, all of the tasks in the included file use run_once: True.
The run_once flag improves performance at scale drastically, so
importing these tasks unconditionally will have a lower overhead than a
conditional include task.  It therefore makes sense to switch to use
import_tasks there.

See [1] for benchmarks of run_once.

[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/run-once.md

Change-Id: Ic67631ca3ea3fb2081a6f8978e85b1522522d40d
Partially-Implements: blueprint performance-improvements
2020-08-28 16:31:04 +00:00
Mark Goddard
3c02c966cb Performance: remove one include_tasks in nova-cell
Including tasks has a performance penalty when compared with importing
tasks. The nova-cell role uses include_tasks twice when generating
certificates and keys for libvirt TLS. While a dynamic include makes
sense here for a non-default feature, we can use one include rather than
two with the same effect. Since this task runs against compute nodes the
overhead is significant.

See [1] for benchmarks of include_tasks and import_tasks.

[1] https://github.com/stackhpc/ansible-scaling/blob/master/doc/include-and-import.md

Partially-Implements: blueprint performance-improvements

Change-Id: Ic687d2f7d4625aede386e576ebb174da72142756
2020-08-28 16:16:56 +00:00