10158 Commits

Author SHA1 Message Date
Zuul
8daad1abcf Merge "Wait for all compute services before cell discovery" 2019-07-05 10:31:29 +00:00
Zuul
dfa1a3844d Merge "Add upgrade-bifrost command" 2019-07-05 09:17:16 +00:00
Zuul
70b7cddd2b Merge "Add parameters to configure number of processes and threads of horizon" 2019-07-05 09:17:14 +00:00
Zuul
af8ae0aa41 Merge "Simplify handler conditionals" 2019-07-04 21:34:14 +00:00
Christian Berendt
dc3489df18 Add parameters to configure number of processes and threads of horizon
Change-Id: Ib5490d504a5b7c9a37dda7babf1257aa661c11de
2019-07-04 17:23:50 +02:00
Mark Goddard
c38dd76711 Wait for all compute services before cell discovery
There is a race condition during nova deploy since we wait for at least
one compute service to register itself before performing cells v2 host
discovery. It's quite possible that other compute nodes will not yet
have registered and will therefore not be discovered. This leaves them
not mapped into a cell, and results in the following error if the
scheduler picks one when booting an instance:

Host 'xyz' is not mapped to any cell

The problem has been exacerbated by merging a fix [1][2] for a nova race
condition, which disabled the dynamic periodic discovery mechanism in
the nova scheduler.

This change fixes the issue by waiting for all expected compute services
to register themselves before performing host discovery. This includes
both virtualised compute services and bare metal compute services.
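
The wait described above can be sketched as a simple polling loop. This is only an illustration and not the task added by this change: `list_registered_hosts` is a hypothetical callable (it could, for example, wrap the output of `openstack compute service list --service nova-compute`), and the retry count and delay are assumed values.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_compute_services(
    expected_hosts: Iterable[str],
    list_registered_hosts: Callable[[], Set[str]],
    retries: int = 20,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until every expected host has a registered compute service.

    Raises TimeoutError if some hosts are still missing after all retries.
    """
    expected = set(expected_hosts)
    missing = expected
    for attempt in range(retries):
        # Hypothetical lookup of the hosts that have registered so far.
        missing = expected - list_registered_hosts()
        if not missing:
            return
        if attempt < retries - 1:
            sleep(delay)
    raise TimeoutError(f"compute services not registered: {sorted(missing)}")
```

Host discovery (`nova-manage cell_v2 discover_hosts`) would only run once this returns, so no expected host is left unmapped.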

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1832987
[2] https://review.opendev.org/665554

Change-Id: I2915e2610e5c0b8d67412e7ec77f7575b8fe9921
Closes-Bug: #1835002
2019-07-04 13:03:12 +00:00
Zuul
26f2aecfa1 Merge "Don't rotate keystone fernet keys during deploy" 2019-07-04 10:18:28 +00:00
Zuul
56c3603586 Merge "CI: Keep stderr in ansible logs" 2019-07-04 07:45:54 +00:00
Zuul
2ad7b50010 Merge "Cloudkitty InfluxDB Storage backend via Kolla-ansible" 2019-07-04 03:45:40 +00:00
Zuul
b5babe0f39 Merge "CI: set the same gate queue for kolla and kolla-ansible" 2019-07-03 21:21:52 +00:00
Zuul
f95360d588 Merge "Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org" 2019-07-03 19:28:15 +00:00
Zuul
6aba50e66a Merge "CI: Use template-overrides.j2 from kolla" 2019-07-03 19:22:21 +00:00
Radosław Piliszek
9c815fa503 CI: set the same gate queue for kolla and kolla-ansible
This ensures that a Depends-On between the two projects does not stop
Zuul from picking up the change for gating, which happened because there
are no notifications between queues.
Previously, W+1-ing a change which depended on a non-merged change from
the other project caused it to remain in the same state.

Change-Id: Ib2d88471ac5730c00b5a9721066d1fb3f2998c9c
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 18:04:12 +00:00
gujin
f41531851f Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org
1. Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org[1]
2. Blacklist sphinx 2.1.0[2]

[1]: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html
[2]: https://github.com/sphinx-doc/sphinx/issues/6440

Change-Id: Ie5f9ae1bd5c45617c6b7fde0e490d471e172c24e
2019-07-03 15:30:44 +00:00
Radosław Piliszek
fd0607dc47 Fix deploy guide build (missing kolla project reference)
Change-Id: I9e3650e83c72081ef2679fe01842bb9be6a4eb7c
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 09:02:38 +02:00
Radosław Piliszek
b9aa8b38f4 CI: Keep stderr in ansible logs
Otherwise ara had only the stderr part and the logs had only the
stdout part, which made ordered analysis harder.

Additionally add -vvv for the bootstrap-servers run.

Change-Id: Ia42ac9b90a17245e9df277c40bda24308ebcd11d
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-02 20:44:33 +02:00
Rafael Weingärtner
97cb30cdd8 Cloudkitty InfluxDB Storage backend via Kolla-ansible
This proposal will add support to Kolla-Ansible for Cloudkitty
InfluxDB storage system deployment. The feature of InfluxDB as the
storage backend for Cloudkitty was created with the following commit:
https://github.com/openstack/cloudkitty/commit/c4758e78b49386145309a44623502f8095a2c7ee

Problem Description
===================

With the addition of support for InfluxDB in Cloudkitty, which is
achieving general availability via Stein release, we need a method to
easily configure/support this storage backend system via Kolla-ansible.

Kolla-ansible is already able to deploy and configure an InfluxDB
system. Therefore, this proposal will use the InfluxDB deployment
configured via Kolla-ansible to connect to CloudKitty and use it as a
storage backend.

If we do not provide a method for users (operators) to manage the
Cloudkitty storage backend via Kolla-ansible, the user has to execute
these changes/configurations manually (or via some other set of
automated scripts), which creates a distributed set of configuration
files and "configuration" scripts that have different versioning schemas
and life cycles.

Proposed Change
===============

Architecture
------------

We propose a flag that users can use to make Kolla-ansible configure
CloudKitty to use InfluxDB as the storage backend system. When
enabling this flag, Kolla-ansible will also enable the deployment of
the InfluxDB via Kolla-ansible automatically.

CloudKitty will be configured according to [1] and [2]. We will also
externalize the "retention_policy", "use_ssl", and "insecure" options to
allow operators fine-grained configuration. All of these
options will only be used when configured; therefore, when they
are not set, the default value/behavior defined in Cloudkitty will be
used. Moreover, when "use_ssl" is set to "true", the user will
be able to set "cafile" to a custom trusted CA file. Again, if these
variables are not set, the default ones in Cloudkitty will be used.
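
The "only used when configured" behaviour can be sketched as follows. This is a hedged illustration, not the template shipped by this change: the function name is hypothetical, and the option names are the ones quoted above.

```python
from typing import Dict, Optional


def influxdb_storage_options(
    retention_policy: Optional[str] = None,
    use_ssl: Optional[bool] = None,
    insecure: Optional[bool] = None,
    cafile: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the options the operator has set.

    Anything left as None is omitted, so Cloudkitty falls back to its
    own default for that option.
    """
    options: Dict[str, str] = {}
    if retention_policy is not None:
        options["retention_policy"] = retention_policy
    if use_ssl is not None:
        options["use_ssl"] = str(use_ssl).lower()
    if insecure is not None:
        options["insecure"] = str(insecure).lower()
    # A custom CA file is only meaningful when SSL is enabled.
    if use_ssl and cafile is not None:
        options["cafile"] = cafile
    return options
```

With no arguments the result is empty, which corresponds to a configuration file that leaves every one of these settings to Cloudkitty.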

Implementation
--------------
We need to introduce a new variable called
`cloudkitty_storage_backend`. Valid options are `sqlalchemy` or
`influxdb`. The default value in Kolla-ansible is `sqlalchemy` for
backward compatibility. Then, the first step is to change the
definition for the following variable:
`/ansible/group_vars/all.yml:enable_influxdb: "{{ enable_monasca |
bool }}"`

We also need to enable InfluxDB when CloudKitty is configured to use
it as the storage backend. Afterwards, we need to create tasks in
CloudKitty configurations to create the InfluxDB schema and configure
the configuration files accordingly.
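
The intended logic for the InfluxDB flag can be sketched in a few lines. This is an assumption-laden illustration rather than the actual `all.yml` expression: `enable_cloudkitty` is assumed to be the flag that turns CloudKitty on, and the function name is hypothetical.

```python
VALID_STORAGE_BACKENDS = ("sqlalchemy", "influxdb")


def influxdb_required(
    enable_monasca: bool,
    enable_cloudkitty: bool,
    cloudkitty_storage_backend: str = "sqlalchemy",
) -> bool:
    """Decide whether InfluxDB should be deployed.

    InfluxDB stays enabled for Monasca, as before, and is additionally
    enabled when CloudKitty is deployed with the influxdb backend.
    """
    if cloudkitty_storage_backend not in VALID_STORAGE_BACKENDS:
        raise ValueError(
            f"cloudkitty_storage_backend must be one of {VALID_STORAGE_BACKENDS}"
        )
    return enable_monasca or (
        enable_cloudkitty and cloudkitty_storage_backend == "influxdb"
    )
```

Because the default backend remains `sqlalchemy`, existing deployments that do not set the new variable see no change in whether InfluxDB is deployed.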

Alternatives
------------
The alternative would be to execute the configurations manually or
handle it via a different set of scripts and configuration files,
which can become cumbersome with time.

Security Impact
---------------
None identified by the author of this spec

Notifications Impact
--------------------
Operators that are already deploying CloudKitty with InfluxDB as
storage backend would need to convert their configurations to
Kolla-ansible (if they wish to adopt Kolla-ansible to execute these
tasks).

Also, deployments (OpenStack environments) that were created with
Cloudkitty using storage v1 will need to migrate all of their data to
v2 before enabling InfluxDB as the storage system.

Other End User Impact
---------------------
None.

Performance Impact
------------------
None.

Other Deployer Impact
---------------------
New configuration options will be available for CloudKitty.
* cloudkitty_storage_backend
* cloudkitty_influxdb_retention_policy
* cloudkitty_influxdb_use_ssl
* cloudkitty_influxdb_cafile
* cloudkitty_influxdb_insecure_connections
* cloudkitty_influxdb_name

Developer Impact
----------------
None

Implementation
==============

Assignee
--------
* `Rafael Weingärtner <rafaelweingartne>`

Work Items
----------
 * Extend InfluxDB "enable/disable" variable
 * Add new tasks to configure Cloudkitty according to the new
 variables presented above
 * Write documentation and release notes

Dependencies
============
None

Documentation Impact
====================
New documentation for the feature.

References
==========
[1] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/storage.html#influxdb-v2`
[2] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/collector.html#metric-collection`

Change-Id: I65670cb827f8ca5f8529e1786ece635fe44475b0
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
2019-07-02 11:14:05 -03:00
Mark Goddard
9cac1137d0 Add upgrade-bifrost command
This performs the same as a deploy-bifrost, but first stops the
bifrost services and container if they are running.

This can help where a docker stop may lead to an ungraceful shutdown,
possibly due to running multiple services in one container.

Change-Id: I131ab3c0e850a1d7f5c814ab65385e3a03dfcc74
Implements: blueprint bifrost-upgrade
Closes-Bug: #1834332
2019-07-02 14:30:14 +01:00
Zuul
8b1e637905 Merge "Specify endpoint when creating monasca user" 2019-07-02 09:01:46 +00:00
Radosław Piliszek
20ab480ca5 CI: Use template-overrides.j2 from kolla
Some kolla-ansible jobs failed due to using external mirrors
instead of local ones.
This was due to not using the template override provided by kolla.
This patch fixes that.

Depends-On: https://review.opendev.org/668226
Change-Id: I27f714fdf05e521aa8ce25c5683a452ceb35eeb8
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 17:00:53 +00:00
Radosław Piliszek
a0bdc3669a Add note to CI config regarding registry during upgrade
Change-Id: Ifc898015b9b523ef4c50fc969e464f05762f2151
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 18:45:30 +02:00
Zuul
470108a1c0 Merge "Revert "CI - remove unnecessary logic when building images for upgrade"" 2019-07-01 15:48:47 +00:00
Mark Goddard
acac12798c Revert "CI - remove unnecessary logic when building images for upgrade"
This reverts commit 8ce5ffd0c21c221d88bacca5fec03ca042dfed85.

Change-Id: I81ce7c007ff267ebbbb721bcdb7eebc0dd575bf8
2019-07-01 11:12:58 +00:00
Will Szumski
9074da56a7 Specify endpoint when creating monasca user
otherwise I'm seeing:

TASK [monasca : Creating the monasca agent user] ****************************************************************************************************************************
fatal: [monitor1]: FAILED! => {"changed": false, "module_stderr": "Shared connection to 172.16.3.24 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n  File \"/tmp/ansible_I0RmxQ/ansible_module_kolla_toolbox.py\", line 163, in <module>\r\n    main()\r\n  File \"/tmp/ansible_I0RmxQ/ansible_module_kolla_toolbox.py\", line 141, in main\r\n    output = client.exec_start(job)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/decorators.py\", line 19, in wrapped\r\n    return f(self, resource_id, *args, **kwargs)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/api/exec_api.py\", line 165, in exec_start\r\n    return self._read_from_socket(res, stream, tty)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/api/client.py\", line 377, in _read_from_socket\r\n    return six.binary_type().join(gen)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 75, in frames_iter\r\n    n = next_frame_size(socket)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 62, in next_frame_size\r\n    data = read_exactly(socket, 8)\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 47, in read_exactly\r\n    next_data = read(socket, n - len(data))\r\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python2.7/site-packages/docker/utils/socket.py\", line 31, in read\r\n    return socket.recv(n)\r\nsocket.timeout: timed out\r\n", "msg": "MODULE FAILURE", "rc": 1}

when the monitoring nodes aren't on the public API network.

Change-Id: I7a93f69da0e02c9264da0b081d2e60626f899e3a
2019-06-28 18:36:24 +01:00
Mark Goddard
de00bf491d Simplify handler conditionals
Currently, we have a lot of logic for checking if a handler should run,
depending on whether config files have changed and whether the
container configuration has changed. As rm_work pointed out during
the recent haproxy refactor, these conditionals are typically
unnecessary - we can rely on Ansible's handler notification system
to only trigger handlers when they need to run. This removes a lot
of error-prone code.

This patch removes conditional handler logic for all services. It is
important to ensure that we no longer notify handlers unnecessarily,
because without these checks in place any notification will trigger a
restart of the containers.
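
The notification behaviour relied on here can be illustrated with a toy model. This is not Ansible's implementation, and the class and method names are hypothetical: a handler is queued only when a notifying task reports a change, and each queued handler runs once when handlers are flushed.

```python
from typing import Callable, Dict, Set


class HandlerQueue:
    """Toy model of Ansible-style handler notification (not Ansible's code)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], None]] = {}
        self._notified: Set[str] = set()

    def register(self, name: str, action: Callable[[], None]) -> None:
        self._handlers[name] = action

    def task_result(self, changed: bool, notify: str) -> None:
        # A handler is queued only when the notifying task reports a change.
        if changed:
            self._notified.add(notify)

    def flush(self) -> None:
        # Each notified handler runs once, in the order handlers were registered.
        for name, action in self._handlers.items():
            if name in self._notified:
                action()
        self._notified.clear()
```

In this model two changed tasks notifying the same handler cause a single restart, and an unchanged task causes none, which is why a separate "has anything changed" conditional inside the handler adds nothing.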

Implements: blueprint simplify-handlers

Change-Id: I4f1aa03e9a9faaf8aecd556dfeafdb834042e4cd
2019-06-27 15:57:19 +00:00
Zuul
54856a873f Merge "Disable and remove OracleLinux CI jobs" 2019-06-27 14:14:38 +00:00
Zuul
85b9dabcd4 Merge "Add support for neutron custom dnsmasq.conf" 2019-06-27 13:59:42 +00:00
Zuul
651b983bdb Merge "Restart all nova services after upgrade" 2019-06-27 13:39:12 +00:00
Zuul
e8f210a2d4 Merge "Format internal Fluentd logs" 2019-06-27 12:38:14 +00:00
Zuul
01bc357d0b Merge "Don't drop unmatched Kolla service logs" 2019-06-27 12:25:11 +00:00
Zuul
067e40ad32 Merge "Increase log coverage for Monasca" 2019-06-27 12:22:20 +00:00
Zuul
e7c19b7413 Merge "Enable InfluxDB TSI by default" 2019-06-27 11:44:51 +00:00
Zuul
e5ad12c429 Merge "doc: Start using openstackdoctheme's extlink extension" 2019-06-27 11:33:48 +00:00
Christian Berendt
a3f1ded357 Add support for neutron custom dnsmasq.conf
Change-Id: Ia7041be384ac07d0a790c2c5c68b1b31ff0e567a
2019-06-27 12:20:12 +02:00
Mark Goddard
e6d2b92200 Restart all nova services after upgrade
During an upgrade, nova pins the version of RPC calls to the minimum
seen across all services. This ensures that old services do not receive
data they cannot handle. After the upgrade is complete, all nova
services are supposed to be reloaded via SIGHUP so that they re-check
the RPC versions of services and use the latest version, which
should now be supported by all running services.

Due to a bug [1] in oslo.service, sending services SIGHUP is currently
broken. We replaced the HUP with a restart for the nova_compute
container for bug 1821362, but not other nova services. It seems we need
to restart all nova services to allow the RPC version pin to be removed.

Testing in a Queens to Rocky upgrade, we find the following in the logs:

Automatically selected compute RPC version 5.0 from minimum service
version 30

However, the service version in Rocky is 35.

There is a second issue in that it takes some time for the upgraded
services to update the nova services database table with their new
version. We need to wait until all nova-compute services have done this
before the restart is performed, otherwise the RPC version cap will
remain in place. There is currently no interface in nova available for
checking these versions [2], so as a workaround we use a configurable
delay with a default duration of 30 seconds. Testing showed it takes
about 10 seconds for the version to be updated, so this gives us some
headroom.

This change restarts all nova services after an upgrade, after a 30
second delay.
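
Why both the delay and the restart are needed can be shown with a toy model. This is not nova's code: the class is hypothetical, and it assumes, as described above, that a service picks its version cap from the minimum recorded service version when it starts and does not re-evaluate it until it is restarted.

```python
from typing import Dict


class NovaService:
    """Toy model: the version cap is computed once, when the service starts."""

    def __init__(self, name: str, records: Dict[str, int]) -> None:
        self.name = name
        # Shared stand-in for the nova services database table.
        self._records = records
        self.version_cap = min(records.values())

    def restart(self) -> None:
        # Only a restart causes the minimum service version to be re-read.
        self.version_cap = min(self._records.values())
```

If `records` still holds version 30 for one compute service when a service is restarted, the cap stays at 30. Once every record has been updated to 35, a restart lifts the cap, which is the reason for waiting before restarting.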

[1] https://bugs.launchpad.net/oslo.service/+bug/1715374
[2] https://bugs.launchpad.net/nova/+bug/1833542

Change-Id: Ia6fc9011ee6f5461f40a1307b72709d769814a79
Closes-Bug: #1833069
Related-Bug: #1833542
2019-06-27 09:36:20 +00:00
Mark Goddard
09e29d0db9 Don't rotate keystone fernet keys during deploy
When running deploy or reconfigure for Keystone,
ansible/roles/keystone/tasks/deploy.yml calls init_fernet.yml,
which runs /usr/bin/fernet-rotate.sh, which calls keystone-manage
fernet_rotate.

This means that a token can become invalid if the operator runs
deploy or reconfigure too often.

This change splits out fernet-push.sh from the fernet-rotate.sh
script, then calls fernet-push.sh after the fernet bootstrap
performed in deploy.
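
The effect of rotating too often can be sketched with a simplified model of a fernet key repository. This is an illustration of the general rotation scheme and not keystone's implementation: key 0 is the staged key, the highest index is the primary key, and `max_active_keys=3` is an assumed setting.

```python
from typing import List


def rotate(keys: List[int], max_active_keys: int = 3) -> List[int]:
    """Return the sorted key indices present after one rotation.

    The staged key (index 0) is promoted to a new highest index, a fresh
    staged key 0 is created, and the oldest non-staged keys are purged
    until at most max_active_keys remain.
    """
    if max_active_keys < 1:
        raise ValueError("max_active_keys must be at least 1")
    older = sorted(k for k in keys if k != 0)
    new_keys = [0] + older + [max(keys) + 1]
    while len(new_keys) > max_active_keys:
        # Index 1 is the oldest key that is not the staged key.
        new_keys.pop(1)
    return new_keys
```

Starting from `[0, 1]`, two rotations give `[0, 2, 3]`: key 1 is gone, so a token encrypted with it can no longer be validated. If every deploy or reconfigure rotates, a few runs in quick succession are enough to reach that state.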

Change-Id: I824857ddfb1dd026f93994a4ac8db8f80e64072e
Closes-Bug: #1833729
2019-06-27 08:41:27 +00:00
Radosław Piliszek
cc058f4586 Make nova external ceph key extraction tasks non-changing
They are used only to obtain keys for the next task.

Change-Id: I2fac22af4710b70e4df8e3a272bcfb6cc8b8532e
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-06-26 14:21:20 +02:00
Zuul
f9ca973869 Merge "Do not run Zuul kolla-ansible-base jobs when modifying globals.yml" 2019-06-26 09:27:09 +00:00
Zuul
100a20769f Merge "Add gnocchi extra volumes" 2019-06-25 18:38:37 +00:00
Zuul
6db1db57df Merge "Add missing docker_registry_insecure in globals.yml" 2019-06-24 15:29:54 +00:00
Zuul
693a30275f Merge "Add CI job for ironic" 2019-06-24 15:15:41 +00:00
Zuul
b32ddaa901 Merge "link kolla_logs volume to docker_runtime_directory if docker_runtime_directory variable exists" 2019-06-24 13:35:45 +00:00
Zuul
2ef50535fe Merge "Use 'openstack_region_name' in cloudkitty collectors and fetchers" 2019-06-24 13:08:30 +00:00
Zuul
7cfab57cb9 Merge "Method to override the default ceilometer meters.yaml via Kolla-ansible" 2019-06-24 13:08:28 +00:00
Zuul
417fa831bc Merge "Use 'openstack_service_workers' as the nb of Cloudkitty workers" 2019-06-24 13:08:26 +00:00
Zuul
a956c53181 Merge "Remove `hnas_iscsi` from the supported storage backends list of Cinder" 2019-06-24 13:08:24 +00:00
Zuul
03976399f0 Merge "Avoid parallel discover_hosts (nova-related race condition)" 2019-06-24 13:08:23 +00:00
Zuul
0cbebc4786 Merge "Fix the redis_connection_string for osprofiler and make it generic" 2019-06-24 13:08:21 +00:00
Zuul
4fb6c2d90f Merge "Add some notes for users Migrating to Kolla Monasca" 2019-06-24 13:08:20 +00:00
Zuul
8622d6fcc6 Merge "CI - remove unnecessary logic when building images for upgrade" 2019-06-24 13:08:18 +00:00