10192 Commits

Author SHA1 Message Date
Mark Goddard
7b9397566a Fix ironic inspector iPXE boot with UEFI
The ironic inspector iPXE configuration includes the following kernel
argument:

initrd=agent.ramdisk

However, the ramdisk is actually called ironic-agent.initramfs, so the
argument should be:

initrd=ironic-agent.initramfs

In BIOS boot mode this does not cause a problem, but on compute nodes
with UEFI enabled, booting seems to be stricter about this, and the
node fails to boot.
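
The mismatch described above can be checked mechanically. The following is
a minimal sketch, not the actual fix: it assumes the kernel arguments are
available as a single whitespace-separated string, and the helper names and
the sample argument strings are hypothetical.

```python
from typing import Optional


def initrd_argument(kernel_args: str) -> Optional[str]:
    """Return the value of the initrd= kernel argument, or None if absent."""
    for arg in kernel_args.split():
        if arg.startswith("initrd="):
            return arg[len("initrd="):]
    return None


def initrd_matches(kernel_args: str, ramdisk_name: str) -> bool:
    """True when initrd= names the ramdisk file that is actually served."""
    return initrd_argument(kernel_args) == ramdisk_name


# Hypothetical argument strings, for illustration only.
old_args = "initrd=agent.ramdisk"
new_args = "initrd=ironic-agent.initramfs"
print(initrd_matches(old_args, "ironic-agent.initramfs"))
print(initrd_matches(new_args, "ironic-agent.initramfs"))
```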

Change-Id: Ic84f3b79fdd3cd1730ca2fb79c11c7a4e4d824de
Closes-Bug: #1836375
2019-07-12 15:09:56 +01:00
Zuul
ab3377d492 Merge "Language tweaks in multi-region docs for clarity" 2019-07-12 12:02:44 +00:00
Raimund Hook
fd07e3d911 Language tweaks in multi-region docs for clarity
Tweaked some of the language in doc/source/user/multi-regions.rst for
clarity purposes.

TrivialFix

Change-Id: Icdd8da6886d0e39da5da80c37d14d2688431ba8f
2019-07-12 12:45:10 +01:00
Mark Goddard
d5e5e885d1 During deploy, always sync DB
A common class of problems goes like this:

* kolla-ansible deploy
* Hit a problem, often in ansible/roles/*/tasks/bootstrap.yml
* Re-run kolla-ansible deploy
* Service fails to start

This happens because the DB is created during the first run, but for some
reason we fail before performing the DB sync. This means that on the second run
we don't include ansible/roles/*/tasks/bootstrap_service.yml because the DB
already exists, and therefore still don't perform the DB sync. However this
time, the command may complete without apparent error.

We should be less restrictive about when we perform the DB sync, and do it
whenever it is necessary. There is an argument for not doing the sync during a
'reconfigure' command, although we will not change that here.

With this change, the DB sync is always performed during the 'deploy' and
'reconfigure' commands, and only during those commands.
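
The difference between the old and new behaviour can be sketched as two
small predicates. This is an illustrative model only, written in Python
rather than as Ansible tasks; the function names are hypothetical and the
real logic lives in the role task files.

```python
# Commands that always perform the DB sync after this change.
SYNC_COMMANDS = {"deploy", "reconfigure"}


def old_should_sync(db_already_exists: bool) -> bool:
    """Old behaviour: sync only on the run that creates the database."""
    return not db_already_exists


def new_should_sync(command: str) -> bool:
    """New behaviour: sync on every 'deploy' or 'reconfigure' run."""
    return command in SYNC_COMMANDS


# Second 'deploy' run after a first run that created the DB but failed
# before the sync: the old rule skips the sync, the new rule performs it.
print(old_should_sync(db_already_exists=True))
print(new_should_sync("deploy"))
```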

Change-Id: I82d30f3fcf325a3fdff3c59f19a1f88055b566cc
Closes-Bug: #1823766
Closes-Bug: #1797814
2019-07-12 08:56:54 +00:00
Zuul
103e0e43e7 Merge "init-runonce: make public1 network optional" 2019-07-11 09:30:53 +00:00
Zuul
768852f8d5 Merge "Fix the incorrect backup_driver configuration" 2019-07-10 16:50:25 +00:00
Zuul
fc42791e1f Merge "Update designate-guide cli command for dns_domain" 2019-07-10 16:48:34 +00:00
Mark Goddard
3026fd9129 init-runonce: make public1 network optional
Skip creation by setting ENABLE_EXT_NET to 0.

Since adding errexit we are failing in kayobe CI, because we have a
conflicting flat network on physnet1.

Change-Id: I88429f30eb81a286f4b8104d5e7a176eefaad667
2019-07-10 17:48:28 +01:00
Michal Nasiadka
4e3054b5da Add 'allow *' to getting ceph mds keyring
* Sometimes getting/creating ceph mds keyring fails, similar to https://tracker.ceph.com/issues/16255

Change-Id: I47587cbeb8be0e782c13ba7f40367409e2daa8a8
2019-07-10 13:09:38 +02:00
Raimund Hook
ec3fe167af Update designate-guide cli command for dns_domain
Updated the docs to refer to the openstack client, rather than the (old)
neutron client.

TrivialFix

Change-Id: I82011175f7206f52570a0f7d1c6863ad8fa08fd0
2019-07-10 10:57:35 +01:00
chenxing
8b55268d44 Fix the incorrect backup_driver configuration
The "backup_driver" option should be configured to
cinder.backup.drivers.ceph.CephBackupDriver instead of
cinder.backup.drivers.ceph.
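
To illustrate the difference between the two values, the sketch below
splits a dotted path into its module part and its final component. It
assumes that the option is expected to name a class inside the module; the
helper name is hypothetical and is not part of cinder.

```python
from typing import Tuple


def split_driver_path(path: str) -> Tuple[str, str]:
    """Split a dotted path into (module, last component)."""
    module, _, name = path.rpartition(".")
    return module, name


# The corrected value ends in the class name; the old value ends in the
# module name 'ceph'.
print(split_driver_path("cinder.backup.drivers.ceph.CephBackupDriver"))
print(split_driver_path("cinder.backup.drivers.ceph"))
```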

Change-Id: I22457023c6ad76b508bcbe05e37517c18f1ffc81
Closes-Bug: #1832878
2019-07-10 16:06:35 +08:00
Radosław Piliszek
53ea3fe4af Trivial fix: log stderr of init-runonce as well
Missed by me in a recent merge.

TrivialFix
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>

Change-Id: I83b1e84a43f014ce20be8677868be3f66017e3c2
2019-07-09 15:38:47 +02:00
Zuul
8ec3ffc64b Merge "Fix nova deploy with Ansible<2.8" 2019-07-09 09:33:28 +00:00
Zuul
887938bbcb Merge "Exit on failure in init-runonce" 2019-07-09 07:33:46 +00:00
Zuul
48223fe83c Merge "Deprecate Ceph deployment" 2019-07-08 22:22:57 +00:00
Mark Goddard
5be093ac5a Fix nova deploy with Ansible<2.8
Due to a bug in ansible, kolla-ansible deploy currently fails in nova
with the following error when used with ansible earlier than 2.8:

TASK [nova : Waiting for nova-compute services to register themselves]
*********
task path:
/home/zuul/src/opendev.org/openstack/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml:30
fatal: [primary]: FAILED! => {
    "failed": true,
    "msg": "The field 'vars' has an invalid value, which
        includes an undefined variable. The error was:
        'nova_compute_services' is undefined\n\nThe error
        appears to have been in
        '/home/zuul/src/opendev.org/openstack/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml':
        line 30, column 3, but may\nbe elsewhere in the file
        depending on the exact syntax problem.\n\nThe
        offending line appears to be:\n\n\n- name: Waiting
        for nova-compute services to register themselves\n ^
            here\n"
}

Example:
http://logs.openstack.org/00/669700/1/check/kolla-ansible-centos-source/81b65b9/primary/logs/ansible/deploy

This was caused by
https://review.opendev.org/#/q/I2915e2610e5c0b8d67412e7ec77f7575b8fe9921,
which hits upon an ansible bug described here:
https://github.com/markgoddard/ansible-experiments/tree/master/05-referencing-registered-var-do-until.

We can work around this by not using an intermediary variable.

Change-Id: I58f8fd0a6e82cb614e02fef6e5b271af1d1ce9af
Closes-Bug: #1835817
2019-07-08 19:58:51 +00:00
Zuul
6d6aa27f50 Merge "Add Python 3 Train unit tests" 2019-07-08 17:24:43 +00:00
Zuul
772568e888 Merge "CI: add periodic-stable-jobs Zuul project template" 2019-07-08 09:46:41 +00:00
Zuul
14a51cb31d Merge "CI: Test ironic also when nova role is modified" 2019-07-08 09:23:17 +00:00
Zuul
65783c90dd Merge "CI: Pull images before upgrade" 2019-07-08 09:21:57 +00:00
Zuul
4fc523c3f4 Merge "Fixes for MariaDB bootstrap and recovery" 2019-07-08 09:21:55 +00:00
Zuul
ec78645928 Merge "Bump minimum Ansible version to 2.5" 2019-07-08 09:21:53 +00:00
Zuul
db55408620 Merge "Fix conditionals in CI playbook" 2019-07-07 10:52:01 +00:00
Corey Bryant
09b5738168 Add Python 3 Train unit tests
This is a mechanically generated patch to ensure unit testing is in place
for all of the Tested Runtimes for Train.

See the Train python3-updates goal document for details:
https://governance.openstack.org/tc/goals/train/python3-updates.html

Change-Id: Ic5f9c5c666e08bc34127d97f9540033536c5b08f
Story: #2005924
Task: #34216
2019-07-05 11:44:23 -04:00
Zuul
fb964ce41b Merge "CI - remove unused setup scripts" 2019-07-05 15:42:42 +00:00
Zuul
8daad1abcf Merge "Wait for all compute services before cell discovery" 2019-07-05 10:31:29 +00:00
Mark Goddard
86f373a198 Fixes for MariaDB bootstrap and recovery
* Fix wsrep sequence number detection. Log message format is
  'WSREP: Recovered position: <UUID>:<seqno>' but we were picking out
  the UUID rather than the sequence number. This is as good as random.

* Add become: true to log file reading and removal since
  I4a5ebcedaccb9261dbc958ec67e8077d7980e496 added become: true to the
  'docker cp' command which creates it.

* Don't run handlers during recovery. If the config files change we
  would end up restarting the cluster twice.

* Wait for wsrep recovery container completion (don't detach). This
  avoids a potential race between wsrep recovery and the subsequent
  'stop_container'.

* Finally, we now wait for the bootstrap host to report that it is in
  an OPERATIONAL state. Without this we can see errors where the
  MariaDB cluster is not ready when used by other services.
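
The first fix above can be sketched as follows. This is a minimal Python
illustration of picking the sequence number rather than the UUID out of the
quoted log message format; the function names and the sample log line are
hypothetical, and choosing the host with the highest sequence number is an
assumption about how the value is used.

```python
import re
from typing import Dict, Optional

# Matches 'WSREP: Recovered position: <UUID>:<seqno>' and captures both parts.
RECOVERED = re.compile(
    r"WSREP: Recovered position: (?P<uuid>[0-9a-fA-F-]+):(?P<seqno>-?\d+)"
)


def recovered_seqno(log_text: str) -> Optional[int]:
    """Return the seqno from the last 'Recovered position' line, if any."""
    matches = list(RECOVERED.finditer(log_text))
    if not matches:
        return None
    return int(matches[-1].group("seqno"))


def pick_bootstrap_host(seqnos: Dict[str, int]) -> str:
    """Assumption: the host with the highest seqno bootstraps the cluster."""
    return max(seqnos, key=seqnos.get)


# Hypothetical log line, for illustration only.
line = ("2019-07-05 09:00:00 0 [Note] WSREP: Recovered position: "
        "6c2f0b1e-9f0a-11e9-b1a2-0242ac110002:42")
print(recovered_seqno(line))
```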

Change-Id: Iaf7862be1affab390f811fc485fd0eb6879fd583
Closes-Bug: #1834467
2019-07-05 09:20:34 +00:00
Zuul
dfa1a3844d Merge "Add upgrade-bifrost command" 2019-07-05 09:17:16 +00:00
Zuul
70b7cddd2b Merge "Add parameters to configure number of processes and threads of horizon" 2019-07-05 09:17:14 +00:00
Zuul
af8ae0aa41 Merge "Simplify handler conditionals" 2019-07-04 21:34:14 +00:00
Mark Goddard
f11d3c694a CI: Pull images before upgrade
This is the documented procedure.

Change-Id: I09ca99e92b112621d66b564a88b13658632242f5
2019-07-04 18:11:16 +00:00
Mark Goddard
e6d0e610c5 Deprecate Ceph deployment
There are now several good tools for deploying Ceph, including Ceph
Ansible and ceph-deploy. Maintaining our own Ceph deployment is a
significant maintenance burden, and we should focus on our core mission
to deploy OpenStack. Given that this is currently a significant part of
kolla-ansible, we will need a long deprecation period and a migration
path to another tool.

Change-Id: Ic603c85c04d8794580a19f9efaa7a8589565f4f6
Partially-Implements: blueprint remove-ceph
2019-07-04 19:05:54 +01:00
Christian Berendt
dc3489df18 Add parameters to configure number of processes and threads of horizon
Change-Id: Ib5490d504a5b7c9a37dda7babf1257aa661c11de
2019-07-04 17:23:50 +02:00
Mark Goddard
c38dd76711 Wait for all compute services before cell discovery
There is a race condition during nova deploy since we wait for at least
one compute service to register itself before performing cells v2 host
discovery.  It's quite possible that other compute nodes will not yet
have registered and will therefore not be discovered. This leaves them
not mapped into a cell, and results in the following error if the
scheduler picks one when booting an instance:

Host 'xyz' is not mapped to any cell

The problem has been exacerbated by merging a fix [1][2] for a nova race
condition, which disabled the dynamic periodic discovery mechanism in
the nova scheduler.

This change fixes the issue by waiting for all expected compute services
to register themselves before performing host discovery. This includes
both virtualised compute services and bare metal compute services.
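
The waiting logic described above can be sketched as a retry loop that
compares the expected hosts with those that have registered. This is an
illustrative Python model, not the Ansible implementation; the function
name, the `list_registered` callable, and the retry parameters are all
hypothetical.

```python
import time
from typing import Callable, Iterable, Set


def wait_for_compute_services(
    expected_hosts: Iterable[str],
    list_registered: Callable[[], Iterable[str]],
    retries: int = 20,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll until every expected host is registered.

    Returns the set of hosts still missing after the last attempt
    (empty on success).
    """
    expected = set(expected_hosts)
    missing = set(expected)
    for attempt in range(retries):
        missing = expected - set(list_registered())
        if not missing:
            return set()
        if attempt < retries - 1:
            sleep(delay)
    return missing
```

Host discovery would then run only when the returned set is empty, so that
no expected compute host is left unmapped to a cell.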

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1832987
[2] https://review.opendev.org/665554

Change-Id: I2915e2610e5c0b8d67412e7ec77f7575b8fe9921
Closes-Bug: #1835002
2019-07-04 13:03:12 +00:00
Radosław Piliszek
55d813004d CI: Test ironic also when nova role is modified
Change-Id: I9773a7c4f7a5d31a83c10562057ce772439b9693
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-04 12:46:40 +00:00
Zuul
26f2aecfa1 Merge "Don't rotate keystone fernet keys during deploy" 2019-07-04 10:18:28 +00:00
Zuul
56c3603586 Merge "CI: Keep stderr in ansible logs" 2019-07-04 07:45:54 +00:00
Zuul
2ad7b50010 Merge "Cloudkitty InfluxDB Storage backend via Kolla-ansible" 2019-07-04 03:45:40 +00:00
Zuul
b5babe0f39 Merge "CI: set the same gate queue for kolla and kolla-ansible" 2019-07-03 21:21:52 +00:00
Zuul
f95360d588 Merge "Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org" 2019-07-03 19:28:15 +00:00
Zuul
6aba50e66a Merge "CI: Use template-overrides.j2 from kolla" 2019-07-03 19:22:21 +00:00
Radosław Piliszek
9c815fa503 CI: set the same gate queue for kolla and kolla-ansible
This ensures that a Depends-On does not prevent Zuul from picking up
the change for gating because there are no notifications between queues.
Previously, W+1-ing a change which depended on a non-merged change from
the other project caused it to remain in the same state.

Change-Id: Ib2d88471ac5730c00b5a9721066d1fb3f2998c9c
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 18:04:12 +00:00
gujin
f41531851f Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org
1. Update the UPPER_CONSTRAINTS_FILE to releases.openstack.org[1]
2. Blacklist sphinx 2.1.0[2]

[1]: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html
[2]: https://github.com/sphinx-doc/sphinx/issues/6440

Change-Id: Ie5f9ae1bd5c45617c6b7fde0e490d471e172c24e
2019-07-03 15:30:44 +00:00
Radosław Piliszek
fd0607dc47 Fix deploy guide build (missing kolla project reference)
Change-Id: I9e3650e83c72081ef2679fe01842bb9be6a4eb7c
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 09:02:38 +02:00
Radosław Piliszek
b9aa8b38f4 CI: Keep stderr in ansible logs
Otherwise ara had only the stderr part and the logs had only the
stdout part, which made ordered analysis harder.

Additionally add -vvv for the bootstrap-servers run.

Change-Id: Ia42ac9b90a17245e9df277c40bda24308ebcd11d
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-02 20:44:33 +02:00
Rafael Weingärtner
97cb30cdd8 Cloudkitty InfluxDB Storage backend via Kolla-ansible
This proposal will add support to Kolla-Ansible for Cloudkitty
InfluxDB storage system deployment. The feature of InfluxDB as the
storage backend for Cloudkitty was created with the following commit:
https://github.com/openstack/cloudkitty/commit/c4758e78b49386145309a44623502f8095a2c7ee

Problem Description
===================

With the addition of support for InfluxDB in Cloudkitty, which reaches
general availability in the Stein release, we need a method to easily
configure and support this storage backend system via Kolla-ansible.

Kolla-ansible is already able to deploy and configure an InfluxDB
system. Therefore, this proposal will use the InfluxDB deployment
configured via Kolla-ansible to connect to CloudKitty and use it as a
storage backend.

If we do not provide a method for users (operators) to manage the
Cloudkitty storage backend via Kolla-ansible, the user has to execute
these changes/configurations manually (or via some other set of
automated scripts), which creates a distributed set of configuration
files and configuration scripts that have different versioning schemes
and life cycles.

Proposed Change
===============

Architecture
------------

We propose a flag that users can use to make Kolla-ansible configure
CloudKitty to use InfluxDB as the storage backend system. When
enabling this flag, Kolla-ansible will also enable the deployment of
the InfluxDB via Kolla-ansible automatically.

CloudKitty will be configured according to [1] and [2]. We will also
externalize the "retention_policy", "use_ssl", and "insecure" options
to give operators fine-grained control. All of these
configurations will only be used when configured; therefore, when they
are not set, the default value/behavior defined in Cloudkitty will be
used. Moreover, when we configure "use_ssl" to "true", the user will
be able to set "cafile" to a custom trusted CA file. Again, if these
variables are not set, the default ones in Cloudkitty will be used.
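
The "only used when configured" rule can be sketched as follows. This is a
minimal Python illustration under stated assumptions, not the Kolla-ansible
templates: the function name is hypothetical, and unset options are modelled
as None so that they are simply omitted and Cloudkitty's own defaults apply.

```python
from typing import Dict, Optional


def influxdb_storage_options(
    retention_policy: Optional[str] = None,
    use_ssl: Optional[bool] = None,
    insecure: Optional[bool] = None,
    cafile: Optional[str] = None,
) -> Dict[str, object]:
    """Return only the options the operator has explicitly set."""
    candidates = {
        "retention_policy": retention_policy,
        "use_ssl": use_ssl,
        "insecure": insecure,
    }
    options = {k: v for k, v in candidates.items() if v is not None}
    # 'cafile' is only emitted when SSL is enabled and a file was given.
    if use_ssl and cafile is not None:
        options["cafile"] = cafile
    return options


print(influxdb_storage_options())
print(influxdb_storage_options(use_ssl=True, cafile="/etc/ssl/ca.pem"))
```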

Implementation
--------------
We need to introduce a new variable called
`cloudkitty_storage_backend`. Valid options are `sqlalchemy` or
`influxdb`. The default value in Kolla-ansible is `sqlalchemy` for
backward compatibility. Then, the first step is to change the
definition for the following variable:
`/ansible/group_vars/all.yml:enable_influxdb: "{{ enable_monasca |
bool }}"`

We also need to enable InfluxDB when CloudKitty is configured to use
it as the storage backend. Afterwards, we need to create tasks in
CloudKitty configurations to create the InfluxDB schema and configure
the configuration files accordingly.
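
The intended relationship between these variables can be sketched in
Python. This is a hedged illustration of the logic described above, not the
actual Jinja2 expression in all.yml; the function name and the
`enable_cloudkitty` flag are assumptions introduced for the example.

```python
VALID_BACKENDS = ("sqlalchemy", "influxdb")


def influxdb_enabled(
    enable_monasca: bool,
    enable_cloudkitty: bool,
    cloudkitty_storage_backend: str = "sqlalchemy",
) -> bool:
    """InfluxDB is deployed for Monasca, or for CloudKitty when it uses it."""
    if cloudkitty_storage_backend not in VALID_BACKENDS:
        raise ValueError(
            "cloudkitty_storage_backend must be one of %s" % (VALID_BACKENDS,)
        )
    cloudkitty_needs_it = (
        enable_cloudkitty and cloudkitty_storage_backend == "influxdb"
    )
    return bool(enable_monasca or cloudkitty_needs_it)


# With the default backend, enabling CloudKitty alone does not pull in
# InfluxDB, which preserves the existing behaviour.
print(influxdb_enabled(False, True))
print(influxdb_enabled(False, True, "influxdb"))
```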

Alternatives
------------
The alternative would be to execute the configurations manually or
handle it via a different set of scripts and configurations files,
which can become cumbersome with time.

Security Impact
---------------
None identified by the author of this spec

Notifications Impact
--------------------
Operators that are already deploying CloudKitty with InfluxDB as
storage backend would need to convert their configurations to
Kolla-ansible (if they wish to adopt Kolla-ansible to execute these
tasks).

Also, deployments (OpenStack environments) that were created with
Cloudkitty using storage v1 will need to migrate all of their data to
v2 before enabling InfluxDB as the storage system.

Other End User Impact
---------------------
None.

Performance Impact
------------------
None.

Other Deployer Impact
---------------------
New configuration options will be available for CloudKitty.
* cloudkitty_storage_backend
* cloudkitty_influxdb_retention_policy
* cloudkitty_influxdb_use_ssl
* cloudkitty_influxdb_cafile
* cloudkitty_influxdb_insecure_connections
* cloudkitty_influxdb_name

Developer Impact
----------------
None

Implementation
==============

Assignee
--------
* `Rafael Weingärtner <rafaelweingartne>`

Work Items
----------
 * Extend InfluxDB "enable/disable" variable
 * Add new tasks to configure Cloudkitty according to the new
   variables presented above
 * Write documentation and release notes

Dependencies
============
None

Documentation Impact
====================
New documentation for the feature.

References
==========
[1] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/storage.html#influxdb-v2`
[2] `https://docs.openstack.org/cloudkitty/latest/admin/configuration/collector.html#metric-collection`

Change-Id: I65670cb827f8ca5f8529e1786ece635fe44475b0
Signed-off-by: Rafael Weingärtner <rafael@apache.org>
2019-07-02 11:14:05 -03:00
Mark Goddard
9cac1137d0 Add upgrade-bifrost command
This performs the same steps as deploy-bifrost, but first stops the
bifrost services and container if they are running.

This can help where a docker stop may lead to an ungraceful shutdown,
possibly due to running multiple services in one container.

Change-Id: I131ab3c0e850a1d7f5c814ab65385e3a03dfcc74
Implements: blueprint bifrost-upgrade
Closes-Bug: #1834332
2019-07-02 14:30:14 +01:00
Zuul
8b1e637905 Merge "Specify endpoint when creating monasca user" 2019-07-02 09:01:46 +00:00
Radosław Piliszek
20ab480ca5 CI: Use template-overrides.j2 from kolla
Some kolla-ansible jobs failed due to using external mirrors
instead of local ones.
This was due to not using the template override provided by kolla.
This patch fixes that.

Depends-On: https://review.opendev.org/668226
Change-Id: I27f714fdf05e521aa8ce25c5683a452ceb35eeb8
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 17:00:53 +00:00
Radosław Piliszek
a0bdc3669a Add note to CI config regarding registry during upgrade
Change-Id: Ifc898015b9b523ef4c50fc969e464f05762f2151
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-01 18:45:30 +02:00