180 Commits

Author SHA1 Message Date
Michal Nasiadka
1009931162 Change local_action to delegate_to: localhost
As part of the effort to implement Ansible code linting in CI
(using ansible-lint) - we need to implement recommendations from
ansible-lint output [1].

One of them is to stop using local_action in favor of delegate_to -
to increase readability and match the style of typical ansible
tasks.

[1]: https://review.opendev.org/694779/

Partially implements: blueprint ansible-lint

Change-Id: I46c259ddad5a6aaf9c7301e6c44cd8a1d5c457d3
2019-11-22 15:04:44 +00:00
Michal Nasiadka
2585788982 Use versioned python binary with fetch ceph keyrings
Depends-On: https://review.opendev.org/688636/

Change-Id: I9918ff6a91acde2a7d184e44b8a1014462596e39
2019-10-18 12:00:22 +02:00
Radosław Piliszek
bc053c09c1 Implement IPv6 support in the control plane
Introduce kolla_address filter.
Introduce put_address_in_context filter.

Add AF config to vars.

Address contexts:
- raw (default): <ADDR>
- memcache: inet6:[<ADDR>]
- url: [<ADDR>]
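
A minimal sketch of how the address contexts listed above could be applied. This is not the actual filter code from this change. The function body, the error handling, and the choice to leave non-IPv6 values untouched are assumptions made for illustration.

```python
import ipaddress


def put_address_in_context(address: str, context: str = "raw") -> str:
    """Wrap an address for the given context (illustrative sketch only)."""
    if context not in ("raw", "memcache", "url"):
        raise ValueError(f"unknown context: {context}")
    try:
        is_ipv6 = ipaddress.ip_address(address).version == 6
    except ValueError:
        # Assumption: anything that is not an IP literal (e.g. a hostname)
        # is passed through unchanged.
        is_ipv6 = False
    if context == "raw" or not is_ipv6:
        return address
    if context == "memcache":
        return f"inet6:[{address}]"
    # Remaining context is "url".
    return f"[{address}]"


if __name__ == "__main__":
    for ctx in ("raw", "memcache", "url"):
        print(ctx, put_address_in_context("fd00::10", ctx))
```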

Other changes:

globals.yml - mention just IP in comment

prechecks/port_checks (api_intf) - kolla_address handles validation

3x interface conditional (swift configs: replication/storage)

2x interface variable definition with hostname
(haproxy listens; api intf)

1x interface variable definition with hostname with bifrost exclusion
(baremetal pre-install /etc/hosts; api intf)

neutron's ml2 'overlay_ip_version' set to 6 for IPv6 on tunnel network

basic multinode source CI job for IPv6

prechecks for rabbitmq and qdrouterd use proper NSS database now

MariaDB Galera Cluster WSREP SST mariabackup workaround
(socat and IPv6)

Ceph naming workaround in CI
TODO: probably needs documenting

RabbitMQ IPv6-only proto_dist

Ceph ms switch to IPv6 mode

Remove neutron-server ml2_type_vxlan/vxlan_group setting
as it is not used (let's avoid any confusion)
and could break setups without proper multicast routing
if it started working (also IPv4-only)

haproxy upgrade checks for slaves based on ipv6 addresses

TODO:

ovs-dpdk grabs ipv4 network address (w/ prefix len / submask)
not supported, invalid by default because neutron_external has no address
No idea whether ovs-dpdk works at all atm.

ml2 for xenapi
Xen is not supported too well.
This would require working with XenAPI facts.

rp_filter setting
This would require meddling with ip6tables (there is no sysctl param).
By default nothing is dropped.
Unlikely we really need it.

ironic dnsmasq is configured IPv4-only
dnsmasq needs DHCPv6 options and testing in vivo.

KNOWN ISSUES (beyond us):

One cannot use IPv6 address to reference the image for docker like we
currently do, see: https://github.com/moby/moby/issues/39033
(docker_registry; docker API 400 - invalid reference format)
workaround: use hostname/FQDN

RabbitMQ may fail to bind to IPv6 if hostname resolves also to IPv4.
This is due to old RabbitMQ versions available in images.
IPv4 is preferred by default and may fail in the IPv6-only scenario.
This should be no problem in real life as IPv6-only is indeed IPv6-only.
Also, when new RabbitMQ (3.7.16/3.8+) makes it into images, this will
no longer be relevant as we supply all the necessary config.
See: https://github.com/rabbitmq/rabbitmq-server/pull/1982

For reliable runs, at least Ansible 2.8 is required (2.8.5 confirmed
to work well). Older Ansible versions are known to miss IPv6 addresses
in interface facts. This may affect redeploys, reconfigures and
upgrades which run after VIP address is assigned.
See: https://github.com/ansible/ansible/issues/63227

Bifrost Train does not support IPv6 deployments.
See: https://storyboard.openstack.org/#!/story/2006689

Change-Id: Ia34e6916ea4f99e9522cd2ddde03a0a4776f7e2c
Implements: blueprint ipv6-control-plane
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-10-16 10:24:35 +02:00
Mark Goddard
3522d235bd Refactor service, endpoint and user registration
Use upstream Ansible modules for registration of services, endpoints,
users, projects, roles, and role grants.

Change-Id: I7c9138d422cc91c177fd8992347176bb54156b5a
2019-09-17 10:13:56 -07:00
Zuul
42aef5a50f Merge "Support configuration of trusted CA certificate file" 2019-08-28 07:48:51 +00:00
Michal Nasiadka
361f61d4a9 Add --force to ceph mgr dashboard enablement
Sometimes mgr dashboard enablement fails with the following message:
"Error ENOENT: all mgr daemons do not support module 'dashboard',
pass --force to force enablement"

Change-Id: Ie7052dbdccb855e02da849dbc207b5d1778e2c82
2019-08-21 14:31:45 +00:00
Scott Solkhon
09e02ef8f1 Support configuration of trusted CA certificate file
This commit adds the functionality for an operator to specify
their own trusted CA certificate file for interacting with the
Keystone API.

Implements: blueprint support-trusted-ca-certificate-file
Change-Id: I84f9897cc8e107658701fb309ec318c0f805883b
2019-08-16 12:47:42 +00:00
Michal Nasiadka
ad9e8786a3 Add support for sha256 in ceph key distribution
- add support for sha256 in bslurp module
- change sha1 to sha256 in ceph-mon ansible role
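
A minimal sketch of checksum computation with a selectable algorithm, using Python's standard hashlib. The helper name `checksum` and its parameters are hypothetical and are not taken from the bslurp module.

```python
import hashlib


def checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    # hashlib.new raises ValueError for an unsupported algorithm name.
    return hashlib.new(algorithm, data).hexdigest()


if __name__ == "__main__":
    keyring = b"example keyring contents"  # placeholder data, not a real keyring
    print("sha1:  ", checksum(keyring, "sha1"))
    print("sha256:", checksum(keyring))
```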

Depends-On: https://review.opendev.org/655623
Change-Id: I25e28d150f2a8d4a7f87bb119d9fb1c46cfe926f
Closes-Bug: #1826327
2019-08-07 11:57:46 +00:00
Zuul
3731da0b79 Merge "Add mon address to ceph release version check" 2019-08-06 17:04:13 +00:00
Zuul
9a652b29e5 Merge "Support mon and osd to be named with hostname" 2019-08-06 13:59:08 +00:00
Zuul
418e9cccc7 Merge "ceph: fixes to deployment and upgrade" 2019-08-06 13:59:06 +00:00
wangwei
cd519db139 Support mon and osd to be named with hostname
In the current deployment of ceph, osd nodes and mons are both named
by IP address, while other daemons use the hostname.

This commit adds support for naming mon and osd nodes using hostname,
and does not change the default IP-based naming.

Change-Id: I22bef72dcd8fc8bcd391ae30e4643520250fd556
2019-08-05 08:54:01 +00:00
Radosław Piliszek
826f6850d0 ceph: fixes to deployment and upgrade
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.

2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.

3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).

4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide the real issue.

5) Retry ceph admin keyring update until cluster works
Reordering deployment caused an issue with the ceph cluster not being
fully operational before taking actions on it.

6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.

7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.
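
A minimal sketch of the idea behind item 7, bounding an informational command with a short timeout so a hung command cannot stall log collection. How the change itself applies the timeout is not shown here. The helper name `run_informational` and the use of subprocess are assumptions.

```python
import subprocess
import sys


def run_informational(cmd, timeout=5):
    """Run cmd and return its stdout, or None if it exceeds the timeout."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before the exception is raised.
        return None
    return result.stdout


if __name__ == "__main__":
    # A stand-in command; a real caller would pass a ceph status command.
    print(run_informational([sys.executable, "-c", "print('ok')"]))
```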

[1] https://review.opendev.org/669315

Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-05 06:26:25 +00:00
Radosław Piliszek
6a737b1968 Fix handling of docker restart policy
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.

This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.
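
A minimal sketch of the validation described above. The policy names are Docker's documented restart policies. The function name and error message are hypothetical and not taken from the kolla_docker module.

```python
# Docker's restart policies; note there is no policy named "never".
VALID_RESTART_POLICIES = ("no", "on-failure", "always", "unless-stopped")


def validate_restart_policy(policy: str) -> str:
    """Return policy unchanged if it is valid, otherwise raise ValueError."""
    if policy not in VALID_RESTART_POLICIES:
        raise ValueError(
            f"invalid restart policy {policy!r}; "
            f"expected one of {', '.join(VALID_RESTART_POLICIES)}"
        )
    return policy


if __name__ == "__main__":
    print(validate_restart_policy("no"))
    try:
        validate_restart_policy("never")
    except ValueError as exc:
        print(exc)
```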

I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.

[1] https://review.opendev.org/667363

Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-18 13:39:06 +00:00
Michal Nasiadka
4e3054b5da Add 'allow *' to getting ceph mds keyring
* Sometimes getting/creating ceph mds keyring fails, similar to https://tracker.ceph.com/issues/16255

Change-Id: I47587cbeb8be0e782c13ba7f40367409e2daa8a8
2019-07-10 13:09:38 +02:00
Michal Nasiadka
b2c17d6051 Add mon address to ceph release version check
Change-Id: Ia4801d09ff1165c44561fd286fc57e903da2c602
2019-07-05 07:04:26 +00:00
Mark Goddard
e6d0e610c5 Deprecate Ceph deployment
There are now several good tools for deploying Ceph, including Ceph
Ansible and ceph-deploy. Maintaining our own Ceph deployment is a
significant maintenance burden, and we should focus on our core mission
to deploy OpenStack. Given that this is a significant part of kolla
ansible currently we will need a long deprecation period and a migration
path to another tool.

Change-Id: Ic603c85c04d8794580a19f9efaa7a8589565f4f6
Partially-Implements: blueprint remove-ceph
2019-07-04 19:05:54 +01:00
Radosław Piliszek
0ea991e4d2 Make Ceph upgrade check Ceph release to avoid EPERM
Since we have different upgrade paths, we must use the actually
installed Ceph release name when doing require-osd-release

Closes-Bug: #1832989

Change-Id: I6aaa4b4ac0fb739f7ad885c13f55b6db969996a2
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-06-18 11:35:43 +02:00
Mark Goddard
b123bf6621 Use become for all docker tasks
Many tasks that use Docker have become specified already, but
not all. This change ensures all tasks that use the following
modules have become:

* kolla_docker
* kolla_ceph_keyring
* kolla_toolbox
* kolla_container_facts

It also adds become for 'command' tasks that use docker CLI.

Change-Id: I4a5ebcedaccb9261dbc958ec67e8077d7980e496
2019-06-06 19:04:58 +01:00
Mark Goddard
a4bb8567da Fix up config file permissions on the host
Several config file permissions are incorrect on the host. In general,
files should be 0660, and directories and executables 0770.
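
A minimal sketch of the permission rule stated above, expressed as a function. The name `expected_mode` and its boolean parameters are hypothetical. The change itself sets modes in Ansible tasks.

```python
FILE_MODE = 0o660         # regular config files
DIR_OR_EXEC_MODE = 0o770  # directories and executables


def expected_mode(is_dir: bool, is_executable: bool = False) -> int:
    """Return the host permission mode for a config path under this rule."""
    if is_dir or is_executable:
        return DIR_OR_EXEC_MODE
    return FILE_MODE


if __name__ == "__main__":
    print(oct(expected_mode(is_dir=False)))
    print(oct(expected_mode(is_dir=True)))
```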

Change-Id: Id276ac1864f280554e98b937f2845bb424d521de
Closes-Bug: #1821579
2019-04-02 17:23:31 +01:00
Jim Rollenhagen
2e4e60503a Use keystone_*_url var in all configs
We're duplicating code to build the keystone URLs in nearly every
config, where we've already done it in group_vars. Replace the
redundancy with a variable that does the same thing.

Change-Id: I207d77870e2535c1cdcbc5eaf704f0448ac85a7a
2019-03-06 15:08:26 -05:00
wu.chunyang
cdfc0442e9 add debug option to ceph mon or osd start command
When ceph_mon and ceph_osd fail to start, adding the debug option
prints more information. Currently, when the ceph_mon and ceph_osd
containers fail to start, docker logs ceph_mon prints nothing.

Closes-Bug: #1815707

Change-Id: I3c5086019808a9738714f5279ec74cbb9b7a8587
2019-02-14 11:28:53 +00:00
wu.chunyang
d35f9a4b70 repair ceph_nfs container start failed
When ceph_nfs is enabled, the deployment fails because there is no
ganesha config file and the 'ganesha.nfs' command needs root privilege
to run. I will modify the ceph_nfs Dockerfile; please review. Thanks.

https://review.openstack.org/#/c/630510/

Change-Id: I347107bc33733061ad043bffe38ecc1d16770afc
Closes-Bug: #1811581
2019-01-17 23:43:03 +08:00
Eduardo Gonzalez
1a682fab28 Support stop specific containers
With this change, an operator can stop a
service container without stopping all services on a host.
This change is the starting point for
fast-forward upgrade support.
In later changes, new flags will be introduced to disable
stopping dataplane services during upgrades.

Change-Id: Ifde7a39d7d8596ef0d7405ecf1ac1d49a459d9ef
Implements: blueprint support-stop-containers
2018-11-26 08:07:01 +00:00
Jeffrey Zhang
6db3f9f342 Disable ceph osd crush update on start in default
The bug comes from a ceph change[0], which is included since ceph osd
v11.0.0. The `osd crush update on start` logic was moved from
`ceph-osd-prestart.sh` to the ceph-osd startup process, so ceph-osd
creates buckets by node hostname automatically, whereas kolla
creates buckets by node IP.

To reduce confusion and the impact on ceph upgrades, disabling
`osd crush update on start` is the better choice.

[0] a28b71e3c9

Change-Id: Ibbeac9505c9957319126267dbe6bd7a2cac11f0c
Closes-Bug: #1801662
2018-11-05 15:11:05 +08:00
Adam Harwell
f1c8136556 Refactor haproxy config (split by service) V2.0
Having all services in one giant haproxy file makes altering
configuration for a service both painful and dangerous. Each service
should be configured with a simple set of variables and rendered with a
single unified template.

Available are two new templates:

* haproxy_single_service_listen.cfg.j2: close to the original style, but
only one service per file
* haproxy_single_service_split.cfg.j2: using the newer haproxy syntax
for separated frontend and backend

For now the default will be the single listen block, for ease of
transition.

Change-Id: I6e237438fbc0aa3c89a3c8bd706a53b74e71904b
2018-09-26 03:30:38 -07:00
Zuul
6fca49ab51 Merge "Fix bluestore disk naming format in kolla-ansible" 2018-09-03 06:00:30 +00:00
wangwei
4e5e28fff5 Fix bluestore disk naming format in kolla-ansible
The current bluestore disk label naming is inconsistent with the
filestore. The filestore naming format is that the disk prefixes
belonging to the same osd are the same and the suffixes are
different.

This patch makes the bluestore disk naming follow the same format.

Change-Id: I71dda29fc4a6765300ce7bb173d2c448c24f6eca
2018-08-31 09:55:09 +09:00
Zuul
cfee876895 Merge "[prometheus] Enable ceph mgr exporter" 2018-08-30 07:09:48 +00:00
Xinliang Liu
943e41d2cb Add ResellerAdmin role for ceph-rgw
ResellerAdmin role is used to give users object storage administration role
in their projects.

It is required to pass object storage quotas tests[1] of DefCore (OpenStack
Powered) certification test suite.

[1] tempest.api.object_storage.test_account_quotas*
Related-Bug: #1700729

Change-Id: Id976827aa7da271e54b77476f175f06bd1a00cc8
2018-08-08 14:10:10 +08:00
Zuul
e71df7dbae Merge "Enable rgw_swift_enforce_content_length" 2018-08-02 11:44:03 +00:00
Xinliang Liu
815c6b7589 Enable rgw_swift_enforce_content_length
Currently the test_list_containers tempest test[1] fails
because the Accept-Ranges header does not exist. See ceph bug[2].

rgw_swift_enforce_content_length ensures Content-Length and
Accept-Ranges in dynamically generated account & container listings.

[1] tempest.api.object_storage.test_account_services.AccountTest.test_list_containers
[2] http://tracker.ceph.com/issues/21554
Related-Bug: #1783456

Change-Id: I9b5fcc361f0bc0e521302d2df1974aabf6f4a7e7
2018-08-02 16:56:30 +08:00
Xinliang Liu
d37d050e60 Allow object versioning for ceph-rgw
Object versioning test[1] is required for RefStack test suite.
Swift has enabled it by default[2].
It is also needed for ceph-rgw.

[1]
tempest.api.object_storage.test_object_version.ContainerTest.test_versioned_container
[2] https://review.openstack.org/#/c/517281/

Related-Bug: #1729583
Change-Id: If89636f77d87bab75e8e7bcf16cc784e83184bc6
2018-07-30 16:45:40 +08:00
wu.chunyang
da9ff22461 Use include_tasks instead of include
The last patch replaced include with include_tasks, but missed one
occurrence here.

Change-Id: Ibfe2918eb5504bb5355489ab093200feb1d221d7
2018-07-27 22:58:21 +08:00
Zuul
3e45b2cbec Merge "Use include_tasks instead of include" 2018-07-27 08:16:08 +00:00
Mark Goddard
07b64dedc1 Fix ceph role with ansible < 2.4
The include_tasks action was added in ansible 2.4.

Change-Id: Ieac4a39a95c6aa55754c9dde5e94fb293c103caa
Related-Bug: #1783456
2018-07-25 20:57:23 +01:00
Jeffrey Zhang
b51eeed89e Use include_tasks instead of include
include has been marked as deprecated since ansible 2.4[0]

[0] https://docs.ansible.com/ansible/2.4/include_module.html#deprecated

Co-Authored-By: confi-surya <singh.surya64mnnit@gmail.com>
Change-Id: Ic9d71e1865d1c728890625aeddf424a5734c0a8a
2018-07-25 23:57:22 +08:00
tone.zhang
2ce46e4767 Improve ceph-rgw compatibility with Swift API in Kolla-ansible
By default ceph-rgw is not completely compatible with the Swift API,
because of the restriction on the Swift INFO API.[0]

The patch improves ceph-rgw compatibility with the Swift API. It is
controlled by the option "ceph_rgw_compatibility" in
ansible/group_vars/all.yml.

After changing the option, run the "reconfigure" command to apply it.

Closes-Bug: #1783456

[0] https://github.com/ceph/ceph/pull/17967

Change-Id: Ibf3eb52280e197965caef08a44ae226c4f884cb5
Signed-off-by: tone.zhang <tone.zhang@arm.com>
2018-07-25 18:09:23 +08:00
Jorge Niedbalski
9d2770db11 [prometheus] Enable ceph mgr exporter
This patch enables the ceph mgr prometheus exporter.

If enable_prometheus_ceph_mgr_exporter is set to true,
the ceph mgr prometheus plugin is enabled on the hosts that are part
of the ceph-mgr group, then the exporter is added into the prometheus-server
configuration file.

Change-Id: Ia2f879401e585e6043f69cc5e3ab1a1f72f7f033
2018-07-23 05:39:52 +00:00
Jeffrey Zhang
3397668d10 Migrate ceph keyring creation to kolla_ceph_keyring module
This way, keyring caps can be updated.

Change-Id: Idf7f222645b5073e2c72d59eecf3d47b3f1dc6ba
2018-07-02 09:49:48 +08:00
Zuul
949f1c2c09 Merge "Allow Kolla Ceph to deploy bluestore OSDs in Kolla-ansible" 2018-06-26 08:34:29 +00:00
Zuul
ab5fd56bb0 Merge "Enable ceph dashboard by default" 2018-06-22 06:19:08 +00:00
Tone Zhang
3591d0fa9f Allow Kolla Ceph to deploy bluestore OSDs in Kolla-ansible
Support Kolla Ceph to deploy bluestore OSDs with Kolla-ansible.

Please refer to [1] for bluestore OSD configuration

The patch includes:
1. Set Ceph OSD store type in group_vars/all.yml. The default value
is "bluestore" in Rocky.

2. Make Kolla Ceph deploy bluestore OSDs with Kolla-ansible

3. Update gate test configuration for Ceph bluestore OSD test

[1]: specs/kolla-ceph-bluestore.rst

Partially-Implements: blueprint kolla-ceph-bluestore
Depends-On: I00eaa600a5e9ad4c1ebca2eeb523bca3d7a25128
Change-Id: I14f20a00654dff32c36d078ebb9005d91a3e60b2
Signed-off-by: Tone Zhang <tone.zhang@arm.com>
2018-06-19 11:13:38 +00:00
chenxing
fd6c9f3882 Enable ceph dashboard by default
Co-Authored-By: rhcayadav <rhcayadav@gmail.com>

Change-Id: I3c2c56decbb9de86101f45592ba8135c49c49405
Closes-Bug: #1754424
2018-06-15 10:25:41 +05:30
Zuul
edd22f8dba Merge "Fix usage of openstack_ceph_rgw_auth" 2018-06-11 14:13:42 +00:00
Ha Manh Dong
30be04ea91 Specify 'become' for all tasks that use kolla_docker module
Add become to all tasks that use the module "kolla_docker"

Change-Id: I4309c4011687b88ec31d739fd8f834fe2326ff10
Partial-Implements: blueprint ansible-specific-task-become
2018-06-08 12:39:24 +00:00
Jorge Niedbalski
640dd55e06 Fix usage of openstack_ceph_rgw_auth
Patch [0] left two variables for authentication: one is
openstack_swift_auth and the other is the (nonexistent)
openstack_ceph_rgw_auth for the ceph_rgw start_keystone task.

This patch leaves only openstack_ceph_rgw_auth.

Closes-Bug: #1769463

[0] 84ade4e149

Change-Id: I1cc522d91f8258f4ca23afc10a0a2a2b35c1ff68
Signed-off-by: Jorge Niedbalski <jorge.niedbalski@linaro.org>
2018-06-06 23:29:31 +00:00
chenxing
ab79c3ee28 Fix the ceph warning after upgrade to luminous
Change-Id: Ia94c10ca8292d803bc20650fb1d496002455338f
Closes-Bug: #1771968
2018-06-06 10:36:20 +08:00
wangwei
5da1cb0b5e Fix the permissions of mgr and mds keyring
Change-Id: I6d1e6d7dc21eaf6051c89b467cd6d886d8e3c469
2018-05-15 10:13:24 +09:00
Jeffrey Zhang
c567055176 Fix ansible warning
- rename action and serial to kolla_ansible and kolla_serial
- use become instead of "sudo <command>" in shell
- Remove quotes for failed_when and changed_when in rabbitmq tasks

Change-Id: I78cb60168aaa40bb6439198283546b7faf33917c
Implements: blueprint migrate-to-ansible-2-2-0
2018-05-11 02:54:02 +00:00