10283 Commits

Author SHA1 Message Date
Zuul
aa135e37f7 Merge "Standardize the configuration of "oslo_messaging" section" 2019-08-15 20:04:56 +00:00
Zuul
fac646406f Merge "Testing Masakari role in gate" 2019-08-15 17:26:56 +00:00
Zuul
bf372c2502 Merge "Add Masakari Ansible role" 2019-08-15 16:36:44 +00:00
Rafael Weingärtner
22a6223b1b Standardize the configuration of "oslo_messaging" section
After all of the discussions we had on
"https://review.opendev.org/#/c/670626/2", I studied all projects that
have an "oslo_messaging" section. Afterwards, I applied the same method
that is already used in "oslo_messaging" section in Nova, Cinder, and
others. This guarantees that we have a consistent method to
enable/disable notifications across projects based on components (e.g.
Ceilometer) being enabled or disabled. Here follows the list of
components and the respective changes I made.

* Aodh:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Congress:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Cinder:
It was already properly configured.

* Octavia:
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Heat:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Ceilometer:
Ceilometer publishes some messages to RabbitMQ. However, the
default driver is "messagingv2", and not '' (empty) as defined in Oslo;
these configurations are defined in ceilometer/publisher/messaging.py.
Therefore, we do not need to do anything for the
"oslo_messaging_notifications" section in Ceilometer.

* Tacker:
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Neutron:
It was already properly configured.

* Nova
It was already properly configured. However, we found another issue
with its configuration. Kolla-ansible does not configure nova
notifications as it should. If 'searchlight' is not installed (enabled)
the 'notification_format' should be 'unversioned'. The default is
'both'; so nova will send a notification to the queue
versioned_notifications; but that queue has no consumer when
'searchlight' is disabled. In our case, the queue got 511k messages.
The huge number of "stuck" messages made the RabbitMQ cluster
unstable.

https://bugzilla.redhat.com/show_bug.cgi?id=1478274
https://bugs.launchpad.net/ceilometer/+bug/1665449

* Nova_hyperv:
I added the same configurations as in Nova project.

* Vitrage
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Searchlight
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Ironic
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Glance
It was already properly configured.

* Trove
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Blazar
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Sahara
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Watcher
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Barbican
I created a mechanism similar to what we have in Cinder, Nova,
and others. I also added a configuration to 'keystone_notifications'
section. Barbican needs its own queue to capture events from Keystone.
Otherwise, it has an impact on Ceilometer and other systems that are
connected to the "notifications" default queue.

* Keystone
Keystone is the system that triggered this work with the discussions
that followed on https://review.opendev.org/#/c/670626/2. After a long
discussion, we agreed to apply the same approach that we have in Nova,
Cinder and other systems in Keystone. That is what we did. Moreover, we
introduce a new topic "barbican_notifications" when barbican is
enabled. We also removed the "variable" enable_cadf_notifications, as
it is obsolete, and the default in Keystone is CADF.

* Mistral:
The driver was hardcoded to "noop". However, that does not seem to be
good practice. Instead, I applied the same standard of using the driver
and pushing to the "notifications" queue if Ceilometer is enabled.

* Cyborg:
I created a mechanism similar to what we have in AODH, Cinder, Nova,
and others.

* Murano
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Senlin
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Manila
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Zun
The section is declared, but it is not used. Therefore, it will
be removed in an upcoming PR.

* Designate
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components

* Magnum
It was already using a similar scheme; I just modified it a little bit
to be the same as we have in all other components
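
The rule applied across the components above can be sketched in Python.
This is only an illustration under stated assumptions, not the Jinja2
templating that kolla-ansible uses: the function names, the set of
enabled services, and the list of consumers are hypothetical. The driver
names "messagingv2" and "noop", the "notifications" topic, and the
'unversioned' and 'both' values are taken from the message above.

```python
# Hypothetical sketch: names below are illustrative, not kolla-ansible variables.

# Assumption: only Ceilometer is listed as a consumer here; the commit
# names it as one example of a component that consumes notifications.
NOTIFICATION_CONSUMERS = ("ceilometer",)


def notification_settings(enabled_services):
    """Return an illustrative [oslo_messaging_notifications] mapping."""
    if any(name in enabled_services for name in NOTIFICATION_CONSUMERS):
        return {"driver": "messagingv2", "topics": "notifications"}
    # Nothing would read the queue, so do not emit notifications at all.
    return {"driver": "noop"}


def nova_notification_format(enabled_services):
    """Pick Nova's notification_format so no queue is left without a consumer."""
    # Assumption: keep Nova's default ('both') when Searchlight is enabled.
    return "both" if "searchlight" in enabled_services else "unversioned"


if __name__ == "__main__":
    print(notification_settings({"nova", "ceilometer"}))
    print(nova_notification_format({"nova"}))
```

Keeping the decision in one place per component is what makes the
behaviour consistent: a queue is only fed when something is enabled to
drain it.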

Closes-Bug: #1838985

Change-Id: I88bdb004814f37c81c9a9c4e5e491fac69f6f202
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
2019-08-15 13:18:16 -03:00
Kien Nguyen
577bb50a04 Add Masakari Ansible role
Masakari provides Instances High Availability Service for
OpenStack clouds by automatically recovering failed Instances.

Depends-On: https://review.openstack.org/#/c/615469/
Change-Id: I0b3457232ee86576022cff64eb2e227ff9bbf0aa
Implements: blueprint ansible-masakari
Co-Authored-By: Gaëtan Trellu <gaetan.trellu@incloudus.com>
2019-08-15 09:58:53 -04:00
Zuul
6db0892fc7 Merge "Fix idempotency of fluentd customisations" 2019-08-15 08:40:34 +00:00
Zuul
dda1885151 Merge "Enable iscsid on cinder-backup hosts" 2019-08-15 03:34:37 +00:00
Zuul
f27a19680b Merge "Add missing when condition for swift config files" 2019-08-14 20:07:12 +00:00
Zuul
9401aab752 Merge "CI: Sanity check that nodepool.private_ipv4 is assigned" 2019-08-14 17:21:29 +00:00
Scott Solkhon
8acbb32b95 Add missing when condition for swift config files
Change-Id: If5bba855a6e34c971fdb1ceb6f10dba62e54b811
2019-08-14 16:52:42 +00:00
Kien Nguyen
fbac54c5f5 Testing Masakari role in gate
Add Masakari testing into the Gate.

Change-Id: I52df33f963e7d2ae4059887df3d24d9e6642134e
Depends-On: https://review.opendev.org/#/c/615469/
Depends-On: https://review.opendev.org/#/c/615715
Implements: blueprint ansible-masakari
Co-Authored-By: Gaëtan Trellu <gaetan.trellu@incloudus.com>
2019-08-14 12:32:51 -04:00
Scott Solkhon
dcaa5f0b3d Fix idempotency of fluentd customisations
Stop the fluentd config from overwriting custom config with the same filename.

Closes-Bug: #1840166
Change-Id: I42c5446381033015f590901b2120950d602f847f
2019-08-14 15:53:49 +00:00
Zuul
b599f78dd7 Merge "Add missing Octavia policy file to Horizon" 2019-08-14 15:27:38 +00:00
Zuul
17029c7e71 Merge "Configure Telegraf to monitor Docker containers" 2019-08-14 14:00:17 +00:00
Zuul
495be668d8 Merge "Add support for Swift S3 API" 2019-08-14 12:28:19 +00:00
Scott Solkhon
b3d07a4b52 Add missing Octavia policy file to Horizon
This commit adds the missing policy file for Octavia
in Horizon, thus enabling the panel where appropriate.

Change-Id: I60f1a52de71519f2d8bd84baa8aba5700fa75b1c
2019-08-14 12:00:59 +00:00
Scott Solkhon
d72b27f2d1 Add support for Swift S3 API
This feature is disabled by default, and can be enabled by setting
'enable_swift_s3api' to 'true' in globals.yml.

Two middlewares are required for Swift S3 - s3api and s3token. Additionally, we
need to configure the authtoken middleware to delay auth decisions to give
s3token a chance to authorise requests using EC2 credentials.

Change-Id: Ib8e8e3a1c2ab383100f3c60ec58066e588d3b4db
2019-08-14 09:55:35 +00:00
Zuul
64d587b819 Merge "Fix swift log level configuration" 2019-08-13 17:03:47 +00:00
Scott Solkhon
dea87cde97 Fix swift log level configuration
Change-Id: I7f980640e75a9328a14a3e14e9c55358955f3182
2019-08-13 12:28:38 +00:00
Keith Plant
b95ff2d1db Configure Telegraf to monitor Docker containers
Added configuration to ansible/roles/telegraf/templates/telegraf.conf.j2 to
allow telegraf to grab telemetry data from docker directly.

Added option to etc/kolla/globals.yml to switch on/off the configuration to
ingest data from the docker daemon into telegraf.

Change-Id: Icbebc415d643a237fa128840d5f5a9c91d22c12d
Signed-off-by: Keith Plant <kplantjr@gmail.com>
2019-08-13 08:17:00 -04:00
Zuul
5c70e0a615 Merge "Set 'distro_python_version' variable" 2019-08-13 04:00:55 +00:00
Zuul
571c89712d Merge "CI: Collect docker and systemd configs" 2019-08-12 17:19:36 +00:00
Marcin Juszkiewicz
bf7ed6be04 Set 'distro_python_version' variable
We use that variable in Kolla in many places. There are places in
'kolla-ansible' where we also need it.

Change-Id: Iea78c4a7cb0fd1405ea7299cdcf0841f63820c8c
2019-08-12 13:23:42 +00:00
Zuul
b16bb0d787 Merge "Do not require EPEL repo on RHEL-based target hosts" 2019-08-10 00:33:53 +00:00
Zuul
4468250b95 Merge "Remove support for Docker legacy packages" 2019-08-09 15:27:09 +00:00
Zuul
3a37131f1d Merge "Fix FWaaS service provider (v2, Stein issue)" 2019-08-09 12:05:56 +00:00
Radosław Piliszek
85a5fb55c4 Fix FWaaS service provider (v2, Stein issue)
Because we merged both [1] and [2] in master,
FWaaS got broken.
This patch unbreaks it and must be backported
to Stein, because the backport of [2] is waiting to be merged
while [1] is already backported.

[1] https://review.opendev.org/661704
[2] https://review.opendev.org/668406

Change-Id: I74427ce9b937c42393d86574614603bd788606af
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-08 20:26:57 +02:00
Doug Szumski
339ea2bdeb Support namespacing RabbitMQ logs
The RabbitMQ role supports namespacing the service via the
project_name. For example, if you change the project_name, the
container name and config directory will be renamed accordingly. However
the log folder is currently fixed, even though the service tries to
write to one named after the project_name. This change fixes that.

Whilst you might generally use vhosts, running multiple RabbitMQ
services on a single node is useful at the very least for testing,
or for running 'outward RabbitMQ' on the same node.

This change is part of the work to support Cells v2.

Partially Implements: blueprint support-nova-cells
Change-Id: Ied2c24c01571327ea532ba0aaf2fc5e89de8e1fb
2019-08-08 16:46:32 +00:00
Zuul
ee5e99fcf5 Merge "Stop using MountFlags=shared in Docker configuration" 2019-08-08 10:57:03 +00:00
Michal Nasiadka
ad9e8786a3 Add support for sha256 in ceph key distribution
- add support for sha256 in bslurp module
- change sha1 to sha256 in ceph-mon ansible role
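
As a rough illustration of the difference between the two digests, here
is a generic sketch using Python's standard hashlib. It is not the
bslurp module's code, which is not shown in this log; the function name
file_digest is hypothetical.

```python
import hashlib


def file_digest(data, algorithm="sha256"):
    """Return the hex digest of `data` (bytes) for the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    key_material = b"example keyring contents"  # placeholder data
    print("sha1:  ", file_digest(key_material, "sha1"))
    print("sha256:", file_digest(key_material, "sha256"))
```

A sha1 hex digest is 40 characters long and a sha256 hex digest is 64,
so both sides of a transfer have to agree on the algorithm before
comparing checksums.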

Depends-On: https://review.opendev.org/655623
Change-Id: I25e28d150f2a8d4a7f87bb119d9fb1c46cfe926f
Closes-Bug: #1826327
2019-08-07 11:57:46 +00:00
Marcin Juszkiewicz
35941738d5 Stop using MountFlags=shared in Docker configuration
According to Docker upstream release notes [1] MountFlags should be
empty.

1. https://docs.docker.com/engine/release-notes/#18091

"Important notes about this release

In Docker versions prior to 18.09, containerd was managed by the Docker
engine daemon. In Docker Engine 18.09, containerd is managed by systemd.
Since containerd is managed by systemd, any custom configuration to the
docker.service systemd configuration which changes mount settings (for
example, MountFlags=slave) breaks interactions between the Docker Engine
daemon and containerd, and you will not be able to start containers.

Run the following command to get the current value of the MountFlags
property for the docker.service:

sudo systemctl show --property=MountFlags docker.service
MountFlags=

Update your configuration if this command prints a non-empty value for
MountFlags, and restart the docker service."
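
The check quoted above could be automated along these lines. This is a
minimal sketch that only parses text in the "MountFlags=<value>" form
shown in the release notes; the function names are hypothetical, and it
does not call systemctl itself.

```python
def mount_flags_value(systemctl_output):
    """Extract the MountFlags value from `systemctl show` output."""
    for line in systemctl_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "MountFlags":
            return value.strip()
    return None  # property not present in the output


def needs_update(systemctl_output):
    """True when MountFlags is set to a non-empty value."""
    return bool(mount_flags_value(systemctl_output))


if __name__ == "__main__":
    print(needs_update("MountFlags=\n"))        # empty value: nothing to change
    print(needs_update("MountFlags=shared\n"))  # non-empty: configuration needs updating
```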

Closes-bug: #1833835

Change-Id: I4f4cbb09df752d00073a606463c62f0a6ca6c067
2019-08-07 13:50:46 +02:00
Mark Goddard
ec07524054 Enable iscsid on cinder-backup hosts
Without this we may see the following error in cinder-backup when using
the LVM backend:

    Could not login to any iSCSI portal

Enabling the iscsid container on hosts in the cinder-backup group fixes
this.

Closes-Bug: #1838624

Change-Id: If373c002b0744ce9dbdffed50a02bab55dd0acb9
Co-Authored-By: dmitry-a-grachev <dmitry.a.grachev@gmail.com>
2019-08-07 09:05:43 +01:00
Mark Goddard
eac1e479b7 CI: Sanity check that nodepool.private_ipv4 is assigned
During the MariaDB testing we saw a number of cases where this IP
address was not assigned to one or more hosts, which caused various
issues later on.

Change-Id: I61b54483e4553b926e9ddc0a8848b2daa6bc49f1
2019-08-06 19:03:05 +01:00
Mark Goddard
f63e36780b Remove support for Docker legacy packages
Docker is now always installed using the community edition (CE)
packages.

Change-Id: I8c3fe44fd9d2da99b5bb1c0ec3472d7e1b5fb295
2019-08-06 18:34:19 +01:00
Zuul
3731da0b79 Merge "Add mon address to ceph release version check" 2019-08-06 17:04:13 +00:00
Zuul
9a652b29e5 Merge "Support mon and osd to be named with hostname" 2019-08-06 13:59:08 +00:00
Zuul
418e9cccc7 Merge "ceph: fixes to deployment and upgrade" 2019-08-06 13:59:06 +00:00
Zuul
ca1de25fbf Merge "Add Kafka input to telegraf config" 2019-08-05 10:58:05 +00:00
Zuul
5760cc226b Merge "Fix checking mongodb replication status" 2019-08-05 09:02:05 +00:00
Zuul
8f70bc22d6 Merge "Add extra volumes support for services that were not previously supported" 2019-08-05 09:02:04 +00:00
wangwei
cd519db139 Support mon and osd to be named with hostname
In the current ceph deployment, osd and mon nodes are named by IP
address, while other daemons use the hostname.

This commit adds support for naming mon and osd nodes by hostname,
and does not change the default IP-based naming.

Change-Id: I22bef72dcd8fc8bcd391ae30e4643520250fd556
2019-08-05 08:54:01 +00:00
Zuul
daba362f43 Merge "Handle more return codes from nova-status upgrade check" 2019-08-05 08:42:10 +00:00
Zuul
8615adefbc Merge "[gnocchi] Don't recursively modify file perms on start" 2019-08-05 08:42:08 +00:00
Zuul
46f0b691dc Merge "CI: Fix multinode job glance issues" 2019-08-05 08:13:58 +00:00
pangliye
93e868360d Add Kafka input to telegraf config
Change-Id: I9a8d3dc5f311d4ea4e5d9b03d522632abc66a7ac
2019-08-05 07:26:46 +00:00
Radosław Piliszek
67cedb7ad5 Do not require EPEL repo on RHEL-based target hosts
This change makes kolla-ansible more compatible with
RHEL which does not provide epel-release package.

EPEL was required to install simplejson from rpm,
which was an Ansible requirement when the Python
version in use was below 2.5 ([1]). This has been obsolete for
quite some time, so it is a good idea to get rid of it.

This change also updates the docs to read more clearly.

[1] https://docs.ansible.com/ansible/2.3/intro_installation.html

Change-Id: I825431d41fbceb824baff27130d64dabe4475d33
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-05 09:01:49 +02:00
Radosław Piliszek
826f6850d0 ceph: fixes to deployment and upgrade
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.

2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.

3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).

4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide the real issue.

5) Retry ceph admin keyring update until cluster works
Reordering the deployment caused an issue where the ceph cluster was
not fully operational before actions were taken on it.

6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.

7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.
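
The idea behind item 7 can be sketched generically in Python. The commit
message does not say how the timeout is applied in CI, so this is an
assumption-laden illustration: run_informational is a hypothetical
helper, and a short Python one-liner stands in for a ceph informational
command.

```python
import subprocess
import sys


def run_informational(cmd, timeout_s=5.0):
    """Run a command; return its stdout, or None if it exceeds timeout_s."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before the exception is raised.
        return None
    return result.stdout


if __name__ == "__main__":
    # Stand-in for an informational command that may hang.
    print(run_informational([sys.executable, "-c", "print('ok')"]))
```

Returning None instead of raising lets log collection continue when a
single command hangs.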

[1] https://review.opendev.org/669315

Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-05 06:26:25 +00:00
Zuul
6ef646856f Merge "Remove unnecessary option from group_vars/all.yml" 2019-08-03 18:09:03 +00:00
Zuul
b59791ca92 Merge "Fix handling of docker restart policy" 2019-08-03 16:27:46 +00:00
chenxing
a1ab06d244 Remove unnecessary option from group_vars/all.yml
We often specify the project name after "{{ node_config_directory }}",
for example,
``{{ node_config_directory }}/cinder-api/:{{ container_config_directory }}/:ro``.
As the "{{ project }}" option is not configured, this line was
generated as:
``/etc/kolla//cinder-api/:...``
The resulting double slash is harmless, but confusing.
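
How the double slash arises can be shown with a small Python sketch. The
helper names are hypothetical, and the "/etc/kolla/{project}" form is an
assumption inferred from the generated path quoted above.

```python
import posixpath


def render_node_config_directory(project=""):
    # Assumption: mirrors a definition of the form "/etc/kolla/{{ project }}".
    return f"/etc/kolla/{project}"


def config_mount_source(node_config_directory, service):
    """Build the host side of a config volume for one service."""
    return f"{node_config_directory}/{service}/"


if __name__ == "__main__":
    # An unset project leaves a trailing slash, so the join doubles it.
    print(config_mount_source(render_node_config_directory(), "cinder-api"))
    # Without the project component, the path is clean.
    print(config_mount_source("/etc/kolla", "cinder-api"))
```

Both spellings normalise to the same path under POSIX rules, which is
why the extra slash causes no failure and is only a readability problem.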

Change-Id: I82e6a91b2c541e38cf8e97896842149b31244688
Closes-Bug: #1838259
2019-08-02 09:53:45 +08:00