620 Commits

Author SHA1 Message Date
Radosław Piliszek
099a33c87d [train] Finish configuring Zun to use Placement
This also enables Placement when Zun is enabled like Kolla Ansible
already does with Nova.

Change-Id: Id2a09f702e8503b49d2b9e73e06b2ce9f4d168a9
Closes-bug: #1840573
2019-10-20 19:33:56 +02:00
Radosław Piliszek
4d398f4b7f Fix placement being enabled always instead of with nova
Adds "| bool".

Backportable to Stein.
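
The effect of the missing filter can be illustrated with a minimal Python sketch. It assumes the flag reaches the condition as a string such as "no"; `to_bool` is a hypothetical stand-in that mimics the behaviour of Ansible's `bool` filter and is not Ansible's own code.

```python
def to_bool(value):
    """Hypothetical stand-in for Ansible's ``| bool`` filter."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        value = value.strip().lower()
    return value in ("yes", "on", "1", "true", 1)


enable_nova = "no"  # assumption: the flag arrives as a string

# Without conversion, any non-empty string is truthy, so the
# dependent service would always be enabled.
placement_without_filter = bool(enable_nova)

# With conversion, "no" is treated as false.
placement_with_filter = to_bool(enable_nova)
```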

Change-Id: Ifa2aa387be46beb6da1d3c5a5e0da1b561af8cee
Closes-bug: #1848937
2019-10-20 19:30:21 +02:00
Jan Vondra
e54edb55e4 Neutron: add support to use legacy iptables
neutron_legacy_iptables option sets the KOLLA_LEGACY_IPTABLES
environment variable in the neutron-l3-agent, neutron-linuxbridge-agent
and neutron_openvswitch_agent containers, where it should be consumed
by kolla_extended_start script resulting in setting iptables-legacy.

Depends-On: https://review.opendev.org/#/c/683679/
Change-Id: Iaa8b46a2227b61a729b8d54bbe4b20f389f251d1
2019-10-17 09:42:00 +00:00
Doug Szumski
78a828ef42 Support multiple nova cells
This patch adds initial support for deploying multiple Nova cells.

Splitting a nova-cell role out from the Nova role allows a more granular
approach to deploying and configuring Nova services.

A new enable_cells flag has been added that enables the support of
multiple cells via the introduction of a super conductor in addition to
cell-specific conductors. When this flag is not set (the default), nova
is configured in the same manner as before - with a single conductor.

The nova role now deploys the global services:

* nova-api
* nova-scheduler
* nova-super-conductor (if enable_cells is true)

The nova-cell role handles services specific to a cell:

* nova-compute
* nova-compute-ironic
* nova-conductor
* nova-libvirt
* nova-novncproxy
* nova-serialproxy
* nova-spicehtml5proxy
* nova-ssh
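
The split between the two roles described above can be sketched in Python. This is only an illustration of the lists in this message: the function name is hypothetical, and the sketch ignores the per-service enable conditions that a real deployment would apply.

```python
def services_by_role(enable_cells=False):
    """Return the services each role deploys, per the lists above."""
    global_services = ["nova-api", "nova-scheduler"]
    if enable_cells:
        # The super conductor is only deployed with multiple cells.
        global_services.append("nova-super-conductor")

    cell_services = [
        "nova-compute",
        "nova-compute-ironic",
        "nova-conductor",
        "nova-libvirt",
        "nova-novncproxy",
        "nova-serialproxy",
        "nova-spicehtml5proxy",
        "nova-ssh",
    ]
    return {"nova": global_services, "nova-cell": cell_services}


default_layout = services_by_role()
multi_cell_layout = services_by_role(enable_cells=True)
```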

This patch does not support using a single cell controller for managing
more than one cell. Support for sharing a cell controller will be added
in a future patch.

This patch should be backwards compatible and is tested by existing CI
jobs. A new CI job has been added that tests a multi-cell environment.

ceph-mon has been removed from the play hosts list as it is not
necessary - delegate_to does not require the host to be in the play.

Documentation will be added in a separate patch.

Partially Implements: blueprint support-nova-cells
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: I810aad7d49db3f5a7fd9a2f0f746fd912fe03917
2019-10-16 17:42:36 +00:00
Radosław Piliszek
bc053c09c1 Implement IPv6 support in the control plane
Introduce kolla_address filter.
Introduce put_address_in_context filter.

Add AF config to vars.

Address contexts:
- raw (default): <ADDR>
- memcache: inet6:[<ADDR>]
- url: [<ADDR>]
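
The three contexts can be sketched in Python as follows. The function reuses the filter's name for readability but is not the project's implementation; leaving IPv4 addresses and hostnames unchanged in every context is an assumption.

```python
import ipaddress


def put_address_in_context(address, context="raw"):
    """Wrap an IPv6 address as required by the given context."""
    if context not in ("raw", "memcache", "url"):
        raise ValueError("unknown context: {}".format(context))

    try:
        is_ipv6 = ipaddress.ip_address(address).version == 6
    except ValueError:
        is_ipv6 = False  # assumption: treat it as a hostname

    if not is_ipv6 or context == "raw":
        return address
    if context == "memcache":
        return "inet6:[{}]".format(address)
    return "[{}]".format(address)  # url context


api_url = "http://{}:5000".format(put_address_in_context("fd00::10", "url"))
```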

Other changes:

globals.yml - mention just IP in comment

prechecks/port_checks (api_intf) - kolla_address handles validation

3x interface conditional (swift configs: replication/storage)

2x interface variable definition with hostname
(haproxy listens; api intf)

1x interface variable definition with hostname with bifrost exclusion
(baremetal pre-install /etc/hosts; api intf)

neutron's ml2 'overlay_ip_version' set to 6 for IPv6 on tunnel network

basic multinode source CI job for IPv6

prechecks for rabbitmq and qdrouterd use proper NSS database now

MariaDB Galera Cluster WSREP SST mariabackup workaround
(socat and IPv6)

Ceph naming workaround in CI
TODO: probably needs documenting

RabbitMQ IPv6-only proto_dist

Ceph ms switch to IPv6 mode

Remove neutron-server ml2_type_vxlan/vxlan_group setting
as it is not used (let's avoid any confusion)
and could break setups without proper multicast routing
if it started working (also IPv4-only)

haproxy upgrade checks for slaves based on ipv6 addresses

TODO:

ovs-dpdk grabs ipv4 network address (w/ prefix len / submask)
not supported, invalid by default because neutron_external has no address
No idea whether ovs-dpdk works at all atm.

ml2 for xenapi
Xen is not well supported.
This would require working with XenAPI facts.

rp_filter setting
This would require meddling with ip6tables (there is no sysctl param).
By default nothing is dropped.
Unlikely we really need it.

ironic dnsmasq is configured IPv4-only
dnsmasq needs DHCPv6 options and testing in vivo.

KNOWN ISSUES (beyond us):

One cannot use IPv6 address to reference the image for docker like we
currently do, see: https://github.com/moby/moby/issues/39033
(docker_registry; docker API 400 - invalid reference format)
workaround: use hostname/FQDN

RabbitMQ may fail to bind to IPv6 if hostname resolves also to IPv4.
This is due to old RabbitMQ versions available in images.
IPv4 is preferred by default and may fail in the IPv6-only scenario.
This should be no problem in real life as IPv6-only is indeed IPv6-only.
Also, when new RabbitMQ (3.7.16/3.8+) makes it into images, this will
no longer be relevant as we supply all the necessary config.
See: https://github.com/rabbitmq/rabbitmq-server/pull/1982

For reliable runs, at least Ansible 2.8 is required (2.8.5 confirmed
to work well). Older Ansible versions are known to miss IPv6 addresses
in interface facts. This may affect redeploys, reconfigures and
upgrades which run after VIP address is assigned.
See: https://github.com/ansible/ansible/issues/63227

Bifrost Train does not support IPv6 deployments.
See: https://storyboard.openstack.org/#!/story/2006689

Change-Id: Ia34e6916ea4f99e9522cd2ddde03a0a4776f7e2c
Implements: blueprint ipv6-control-plane
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-10-16 10:24:35 +02:00
Zuul
6bdd7dba75 Merge "[designate] Add coordination backend for designate workers" 2019-10-03 13:15:52 +00:00
Zuul
fc3cf24536 Merge "Add 'db=0' to redis_connection_string" 2019-10-02 10:03:15 +00:00
Joseph M
9cae608392 [designate] Add coordination backend for designate workers
Add coordination backend configuration to designate.conf which is
required in multinode environments. Fixes warning from designate:

WARNING designate.coordination [-] No coordination backend configured,
assuming we are the only worker. Please configure a coordination backend

Change-Id: I23c4d2de7e3f9368795c423000a4f9a6c3a431e2
Closes-Bug: #1843842
Related-Bug: #1840070
2019-09-30 11:02:27 -04:00
Mark Goddard
27f4876eed Switch default cloudkitty storage backend to influxdb
Backport: stein

In the Stein release, cloudkitty switched the default storage backend
from sqlalchemy to influxdb. In kolla-ansible stein configuration, we
did not explicitly set the storage backend, and so we automatically
picked up this change. However, prior to
https://review.opendev.org/#/c/615928/ we did not have full support for
InfluxDB as a storage backend, and so this has broken the Rocky-Stein
upgrade (https://bugs.launchpad.net/kolla-ansible/+bug/1838641), which
fails with this during the DB sync:

ERROR cloudkitty InfluxDBClientError: get_list_retention_policies()
requires a database as a parameter or the client to be using a database

This change synchronises our default with cloudkitty's (influxdb), and
also provides an upgrade transition to create the influxdb database.

We also move the cloudkitty_storage_backend variable to
group_vars/all.yml, since it is used to determine whether to enable
influxdb.

Finally, the section name in cloudkitty.conf was incorrect - it was
storage_influx, but should be storage_influxdb.

Change-Id: I71f2ed11bd06f58e141d222e2709835b7ddb2c71
Closes-Bug: #1838641
2019-09-24 16:15:14 +00:00
Dincer Celik
5ff7bab46b [prometheus] Added support for extra options
This change introduces the way to pass extra options to prometheus.

Currently, prometheus runs with nearly default options, and when clouds
start getting bigger, you need to pass extra parameters to prometheus.

Change-Id: Ic773c0b73062cf3b2285343bafb25d5923911834
2019-09-23 11:25:04 +03:00
Zuul
b7bbbae981 Merge "Adding Prometheus blackbox exporter" 2019-09-20 17:25:04 +00:00
Zuul
91c68f5da8 Merge "Update "openstack_release" variable to static branch name" 2019-09-19 21:21:57 +00:00
Mark Goddard
15e35333dd Remove support for OracleLinux
We have agreed to remove support for Oracle Linux.

http://lists.openstack.org/pipermail/openstack-discuss/2019-June/006896.html

Change-Id: If11b4ff37af936a0cfd34443e8babb952307882b
2019-09-18 12:25:12 +01:00
Scott Solkhon
b22375ebfd Adding Prometheus blackbox exporter
This commit follows up the work in Kolla to deploy and configure the
Prometheus blackbox exporter.

An example blackbox-exporter module has been added (disabled by default)
called os_endpoint. This allows for the probing of endpoints over HTTP
and HTTPS. This can be used to monitor that OpenStack endpoints return a status
code of either 200 or 300, and the word 'versions' in the payload.
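
The success condition described above amounts to the following check, shown here as a minimal Python sketch. The function name is hypothetical, and the exporter itself is driven by its own module configuration rather than by code like this.

```python
def endpoint_looks_healthy(status_code, body):
    """Mirror the probe's criteria: accepted status code and keyword."""
    return status_code in (200, 300) and "versions" in body


# Example inputs (made up for illustration).
ok = endpoint_looks_healthy(300, '{"versions": {"values": []}}')
bad_status = endpoint_looks_healthy(503, '{"versions": {}}')
bad_body = endpoint_looks_healthy(200, "Service Unavailable")
```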

This change introduces a new variable `prometheus_blackbox_exporter_endpoints`.
Currently no defaults are specified because the configuration is heavily
dependent on the deployment.

Co-authored-by: Jack Heskett <Jack.Heskett@gresearch.co.uk>
Change-Id: I36ad4961078d90e2fd70c9a3368f5157d6fd89cd
2019-09-18 11:06:19 +01:00
chenxing
4eceb48d2d Update "openstack_release" variable to static branch name
Since we use the release name as the default tag to publish images
to Dockerhub, we should use this by default.

This change also removes support for the magic value "auto".

Change-Id: I5610cc7729e9311709147ba5532199a033dfd156
Closes-Bug: #1843518
2019-09-16 12:42:44 +00:00
Zuul
d659c4dd15 Merge "Sync enable flags in globals.yml" 2019-09-14 16:20:33 +00:00
Zuul
5dae45e26e Merge "Enable Swift Recon" 2019-09-12 14:06:15 +00:00
Zuul
b8de3da287 Merge "Add a explanatory note for "placement_api_port"" 2019-09-12 14:02:17 +00:00
Mark Goddard
fd1fcdc465 Sync enable flags in globals.yml
Change-Id: I593b06c447d156c7a981d1c617f4f9baa82884de
Closes-Bug: #1841175
2019-09-12 14:19:44 +01:00
Scott Solkhon
d463d3f7bf Enable Swift Recon
This commit adds the necessary configuration to the Swift account,
container and object configuration files to enable the Swift recon
cli.

In order to give the object server on each Swift host access to the
recon files, a Docker volume is mounted into each container which
generates them. The volume is then mounted read only into the object
server container. Note that multiple containers append to the same
file. This should not be a problem since Swift uses a lock when
appending.

Change-Id: I343d8f45a78ebc3c11ed0c68fe8bec24f9ea7929
Co-authored-by: Doug Szumski <doug@stackhpc.com>
2019-09-12 11:45:02 +01:00
Zuul
ff86c2f2e3 Merge "Implement TLS encryption for internal endpoints" 2019-09-12 09:20:54 +00:00
pangliye
df6b98d793 Delete influxdb admin port
From version 1.3, the web admin interface is no longer available
in InfluxDB.
https://docs.influxdata.com/influxdb/v1.3/administration/differences/#web-admin-ui-removal

Change-Id: I1dce61a9c40a407882cfcd520ca491b4dee734ae
2019-09-11 09:27:08 +08:00
Zuul
21f22a6da9 Merge "Fix misspell word" 2019-09-09 14:56:38 +00:00
Q.hongtao
dd6a9d7d9f Fix misspell word
Change-Id: I124cba4bfe85e76f732ae618619594004a5c911f
2019-09-06 16:11:17 +08:00
Marcin Juszkiewicz
a5808ad8ba Modernize the way of configuring Docker daemon
Instead of changing Docker daemon command line let's change config
for Docker instead. In /etc/docker/daemon.json file as it should be.

Custom Docker options can be set with 'docker_custom_config' variable.

Old 'docker_custom_option' is still present but should be avoided.
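
As a rough Python sketch of the idea, custom settings can be layered over a set of defaults and written out as JSON. The default values and the registry address below are invented for illustration, and the shallow merge is an assumption about how the two sources are combined.

```python
import json


def render_daemon_json(defaults, docker_custom_config):
    """Merge custom options over defaults and serialise them."""
    merged = dict(defaults)
    merged.update(docker_custom_config)  # custom keys win
    return json.dumps(merged, indent=4, sort_keys=True)


# Illustrative values only.
defaults = {"log-opts": {"max-file": "5", "max-size": "50m"}}
docker_custom_config = {"insecure-registries": ["registry.example.com:4000"]}

rendered = render_daemon_json(defaults, docker_custom_config)
```

The resulting text is what would be placed in /etc/docker/daemon.json.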

Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Change-Id: I1215e04ec15b01c0b43bac8c0e81293f6724f278
2019-09-05 08:19:26 +00:00
Manuel Rodriguez
1662a77b55 Add support to enable l3 port-forwarding plugin
Allows enabling neutron port forwarding plugin
and l3 extension to forward ports from floating
IP to a fixed neutron port.

Change-Id: Ic25c96a0ddcf4f69acbfb7a58acafec82c3b0aed
Implements: blueprint enable-l3-port-forwarding
2019-09-02 16:28:51 -04:00
Will Szumski
94d824dd0e Use secure websocket for nova serial console proxy when TLS enabled
This resolves an issue where the web browser would complain that it
was trying to connect to insecure websocket when using HTTPS with
horizon.

Change-Id: Ib75cc2bc1b3811bc31badd5fda3db3ed0c59b119
Closes-Bug: #1841914
2019-08-29 11:02:28 +01:00
Zuul
42aef5a50f Merge "Support configuration of trusted CA certificate file" 2019-08-28 07:48:51 +00:00
Zuul
e8f17f5b7a Merge "Set default timeout to 60 seconds for docker stop" 2019-08-27 12:42:43 +00:00
Krzysztof Klimonda
b0ecd8b67c Implement TLS encryption for internal endpoints
This review is the first one in a series of patches and it introduces an
optional encryption for internal openstack endpoints, implementing part
of the add-ssl-internal-network spec.

Change-Id: I6589751626486279bf24725f22e71da8cd7f0a43
2019-08-22 16:39:21 -07:00
Mark Goddard
331d373b99 Don't assume etcd group exists in baremetal role
The baremetal role does not currently assume too much about the
inventory, and in kayobe the seed is deployed using a very minimal
inventory.

Icf3f01516185afb7b9f642407b06a0204c36ecbe added a reference to the etcd
group in the baremetal role, which causes kayobe seed deployment to fail
with the following error:

    AnsibleUndefinedVariable: 'dict object' has no attribute 'etcd'

This change defaults the group lookup to an empty list.
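
The failure and the fix can be reproduced with a plain dictionary in Python. The inventory contents and the helper name are hypothetical; the point is only that a lookup with a default tolerates a missing group.

```python
# A minimal inventory with no "etcd" group (illustrative).
groups = {"all": ["seed"], "baremetal": ["seed"]}


def hosts_in_group(groups, name):
    """Return the group's hosts, or an empty list if it is undefined."""
    return groups.get(name, [])


try:
    groups["etcd"]  # direct lookup fails for a missing group
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

etcd_hosts = hosts_in_group(groups, "etcd")
```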

Change-Id: Ib3252143a97652c5cf70b56cbfd7c7ce69f93a55
Closes-Bug: #1841073
2019-08-22 16:30:56 +01:00
Mark Goddard
33efcb814c Set default timeout to 60 seconds for docker stop
The previous default timeout was 10 seconds, which does not always
allow services enough time to shut down safely.

Change-Id: I54eff91567108a7e5d99f067829ae4a6900cd859
2019-08-19 11:54:14 +01:00
Zuul
5394cf187d Merge "Allow to configure docker for Zun" 2019-08-17 11:58:45 +00:00
Zuul
58cca6801c Merge "Allow cinder coordination backend to be configured" 2019-08-16 16:06:01 +00:00
Radosław Piliszek
44f88d16ac Allow to configure docker for Zun
Change-Id: Icf3f01516185afb7b9f642407b06a0204c36ecbe
Closes-Bug: #1840315
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-16 15:35:11 +02:00
Scott Solkhon
09e02ef8f1 Support configuration of trusted CA certificate file
This commit adds the functionality for an operator to specify
their own trusted CA certificate file for interacting with the
Keystone API.

Implements: blueprint support-trusted-ca-certificate-file
Change-Id: I84f9897cc8e107658701fb309ec318c0f805883b
2019-08-16 12:47:42 +00:00
Zuul
aa135e37f7 Merge "Standardize the configuration of "oslo_messaging" section" 2019-08-15 20:04:56 +00:00
Zuul
bf372c2502 Merge "Add Masakari Ansible role" 2019-08-15 16:36:44 +00:00
Rafael Weingärtner
22a6223b1b Standardize the configuration of "oslo_messaging" section
After all of the discussions we had on
"https://review.opendev.org/#/c/670626/2", I studied all projects that
have an "oslo_messaging" section. Afterwards, I applied the same method
that is already used in "oslo_messaging" section in Nova, Cinder, and
others. This guarantees that we have a consistent method to
enable/disable notifications across projects based on components (e.g.
Ceilometer) being enabled or disabled. Here follows the list of
components, and the respective changes I did.

* Aodh:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Congress:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Cinder:
It was already properly configured.

* Octavia:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Heat:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Ceilometer:
Ceilometer publishes some messages in the rabbitMQ. However, the
default driver is "messagingv2", and not ''(empty) as defined in Oslo;
these configurations are defined in ceilometer/publisher/messaging.py.
Therefore, we do not need to do anything for the
"oslo_messaging_notifications" section in Ceilometer

* Tacker:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Neutron:
It was already properly configured.

* Nova
It was already properly configured. However, we found another issue
with its configuration. Kolla-ansible does not configure nova
notifications as it should. If 'searchlight' is not installed (enabled)
the 'notification_format' should be 'unversioned'. The default is
'both'; so nova will send a notification to the queue
versioned_notifications; but that queue has no consumer when
'searchlight' is disabled. In our case, the queue got 511k messages.
The huge amount of "stuck" messages made the Rabbitmq cluster
unstable.

https://bugzilla.redhat.com/show_bug.cgi?id=1478274
https://bugs.launchpad.net/ceilometer/+bug/1665449

* Nova_hyperv:
I added the same configurations as in Nova project.

* Vitrage
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Searchlight
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Ironic
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Glance
It was already properly configured.

* Trove
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Blazar
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Sahara
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Watcher
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Barbican
I created a mechanism similar to what we have in Cinder, Nova,
and others. I also added a configuration to 'keystone_notifications'
section. Barbican needs its own queue to capture events from Keystone.
Otherwise, it has an impact on Ceilometer and other systems that are
connected to the "notifications" default queue.

* Keystone
Keystone is the system that triggered this work with the discussions
that followed on https://review.opendev.org/#/c/670626/2. After a long
discussion, we agreed to apply the same approach that we have in Nova,
Cinder and other systems in Keystone. That is what we did. Moreover, we
introduce a new topic "barbican_notifications" when barbican is
enabled. We also removed the "variable" enable_cadf_notifications, as
it is obsolete, and the default in Keystone is CADF.

* Mistral:
The driver was hardcoded to "noop", which does not seem to be good
practice. Instead, I applied the same standard of using the driver
and pushing to the "notifications" queue if Ceilometer is enabled.

* Cyborg:
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Murano
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Senlin
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Manila
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Zun
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Designate
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Magnum
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
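
The common pattern described in the list above can be summarised with a short Python sketch. It is an interpretation of this message rather than the templates themselves: the function names are hypothetical, and choosing "noop" when no topic is configured is an assumption.

```python
def notification_settings(enable_ceilometer, extra_topics=()):
    """Derive the notification driver and topics from enabled components."""
    topics = []
    if enable_ceilometer:
        topics.append("notifications")
    topics.extend(extra_topics)
    # Assumption: with no consumer, notifications are switched off.
    driver = "messagingv2" if topics else "noop"
    return {"driver": driver, "topics": ",".join(topics)}


def nova_notification_format(enable_searchlight):
    """Avoid filling versioned_notifications when nothing consumes it."""
    return "both" if enable_searchlight else "unversioned"


# Keystone with both Ceilometer and Barbican enabled.
keystone = notification_settings(True, ["barbican_notifications"])
# A service with no notification consumer enabled.
quiet = notification_settings(False)
```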

Closes-Bug: #1838985

Change-Id: I88bdb004814f37c81c9a9c4e5e491fac69f6f202
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
2019-08-15 13:18:16 -03:00
Kien Nguyen
577bb50a04 Add Masakari Ansible role
Masakari provides Instances High Availability Service for
OpenStack clouds by automatically recovering failed Instances.

Depends-On: https://review.openstack.org/#/c/615469/
Change-Id: I0b3457232ee86576022cff64eb2e227ff9bbf0aa
Implements: blueprint ansible-masakari
Co-Authored-By: Gaëtan Trellu <gaetan.trellu@incloudus.com>
2019-08-15 09:58:53 -04:00
Radosław Piliszek
03b4c706fa Allow cinder coordination backend to be configured
This allows the operator to prevent the enabling of redis and/or
etcd from implicitly configuring the cinder coordinator.

Note this change is backwards-compatible.

Change-Id: Ie10be55968e43e3b9cc347b1b58771c1f7b1b910
Related-Bug: #1840070
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-15 07:48:28 +00:00
Zuul
17029c7e71 Merge "Configure Telegraf to monitor Docker containers" 2019-08-14 14:00:17 +00:00
Zuul
495be668d8 Merge "Add support for Swift S3 API" 2019-08-14 12:28:19 +00:00
Scott Solkhon
d72b27f2d1 Add support for Swift S3 API
This feature is disabled by default, and can be enabled by setting
'enable_swift_s3api' to 'true' in globals.yml.

Two middlewares are required for Swift S3 - s3api and s3token. Additionally, we
need to configure the authtoken middleware to delay auth decisions to give
s3token a chance to authorise requests using EC2 credentials.

Change-Id: Ib8e8e3a1c2ab383100f3c60ec58066e588d3b4db
2019-08-14 09:55:35 +00:00
Keith Plant
b95ff2d1db Configure Telegraf to monitor Docker containers
Added configuration to ansible/roles/telegraf/templates/telegraf.conf.j2 to
allow telegraf to grab telemetry data from docker directly.

Added option to etc/kolla/globals.yml to switch on/off the configuration to
ingest data from the docker daemon into telegraf.

Change-Id: Icbebc415d643a237fa128840d5f5a9c91d22c12d
Signed-off-by: Keith Plant <kplantjr@gmail.com>
2019-08-13 08:17:00 -04:00
Marcin Juszkiewicz
bf7ed6be04 Set 'distro_python_version' variable
We use that variable in Kolla in many places. There are places in
'kolla-ansible' where we also need it.

Change-Id: Iea78c4a7cb0fd1405ea7299cdcf0841f63820c8c
2019-08-12 13:23:42 +00:00
Zuul
9a652b29e5 Merge "Support mon and osd to be named with hostname" 2019-08-06 13:59:08 +00:00
wangwei
cd519db139 Support mon and osd to be named with hostname
In the current deployment of ceph, osd nodes and mons are both named
by IP address, while other daemons use the hostname.

This commit adds support for naming mon and osd nodes by hostname,
and does not change the default of naming them by IP address.

Change-Id: I22bef72dcd8fc8bcd391ae30e4643520250fd556
2019-08-05 08:54:01 +00:00
Zuul
6ef646856f Merge "Remove unnecessary option from group_vars/all.yml" 2019-08-03 18:09:03 +00:00
chenxing
a1ab06d244 Remove unnecessary option from group_vars/all.yml
We often specify the project name after "{{ node_config_directory }}",
for example,
``{{ node_config_directory }}/cinder-api/:{{ container_config_directory }}/:ro``.
As the "{{ project }}" option is not configured, this line was
rendered as:
``/etc/kolla//cinder-api/:...``
The resulting double slash is harmless, but confusing.
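
A small Python sketch shows where the double slash comes from and why it is harmless. The helper name and the exact shape of the old default are assumptions made for illustration.

```python
import posixpath


def volume_source(node_config_directory, service):
    """Build the host-side path of a config volume."""
    return "{}/{}/".format(node_config_directory, service)


project = ""  # assumption: the option is never set

# Old default: an empty project leaves a trailing slash behind.
old = volume_source("/etc/kolla/" + project, "cinder-api")
# New default: the unused option is dropped.
new = volume_source("/etc/kolla", "cinder-api")

# Both strings resolve to the same directory once normalised.
same_directory = posixpath.normpath(old) == posixpath.normpath(new)
```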

Change-Id: I82e6a91b2c541e38cf8e97896842149b31244688
Closes-Bug: #1838259
2019-08-02 09:53:45 +08:00