871 Commits

Author SHA1 Message Date
Meg Heisler
774e0cb654 RGW: Fix multinode deploy for ceph rgw
Change the deployment script for rgw to not use the docker
bridge for public and cluster network overrides. Instead,
calculate network values in the same way as the other ceph multinode
deployment steps.

Change-Id: I2bacd1af1cc331d76a5d61f3b589ca6ef80b1b2e
2018-11-08 11:39:23 -06:00
Zuul
7274c5f95f Merge "Revert "Fix rally deployment config to rally 1.2.0"" 2018-11-07 22:26:22 +00:00
Zuul
47d49bcfd4 Merge "prometheus ceph.rules changes" 2018-11-07 20:51:42 +00:00
Pete Birley
b7e77dfea0 Revert "Fix rally deployment config to rally 1.2.0"
This reverts commit 5c2859c3e9026e464bf0c35b591aaae810ff2a1c.

This commit breaks the ability to declare users to use with rally/helm test - and needs to be refactored to match the commit message's intent.

Change-Id: I2bc66ef40694c277058b4324b8a3528f4f25d1d1
2018-11-07 19:31:49 +00:00
Zuul
b28aed8331 Merge "Fix rally deployment config to rally 1.2.0" 2018-11-07 14:12:32 +00:00
Zuul
fca344900f Merge "Enable the mgr balancer module by default." 2018-11-02 22:36:13 +00:00
Steve Wilkerson
69196031cd Nagios: Ensure processes are reaped
This moves Nagios processes to run as children of either
the pause container or the host's init system (for k8s <1.10)
to prevent defunct process sprawl

Change-Id: I6a93d446577674b0b012f9567d5e6a5794ebc44b
2018-11-02 08:12:24 -05:00
Matthew Heler
a79562a28b Enable the mgr balancer module by default.
The balancer module will distribute PGs more evenly across OSDs.
While CRUSH does a good job at this, it is not perfect and hot spots
(where an OSD has more PGs than its peers) can occur.

Change-Id: Ic45a6bf745bdd09a3f5782e9e8bda89c3d3da2aa
2018-11-01 15:52:51 +00:00
inspurericzhang
f1c2bf976f [Trivial Fix] modify spelling error of "resource"
Although these are only spelling mistakes, they affect readability.

Change-Id: I75a1f66002ec46fe206f31fec02fbd47f9cee443
2018-11-01 09:52:04 +08:00
kranthi guttikonda
fac358a575 prometheus ceph.rules changes
With the new Ceph Luminous release, the existing ceph.rules are obsolete.

Added a new rule for ceph-mgr count

Changed ceph_monitor_quorum_count to ceph_mon_quorum_count

Updated ceph_cluster_usage_high as ceph_cluster_used_bytes,
ceph_cluster_capacity_bytes aren't valid

Updated ceph_placement_group_degrade_pct_high as
ceph_degraded_pgs, ceph_total_pgs aren't valid

Updated ceph_osd_down_pct_high as ceph_osds_down,
ceph_osds_up aren't available, ceph_osd_up is
available but ceph_osd_down isn't. Need to
calculate the down based on count(ceph_osd_up==0)
and total osd using count(ceph_osd_metadata)

Removed ceph_monitor_clock_skew_high as the metric
ceph_monitor_clock_skew_seconds isn't valid anymore

Added new alarms ceph_osd_down, ceph_osd_out

Implements: prometheus ceph.rules changes with new valid metrics
Closes-Bug: #1800548
Change-Id: Id68e64472af12e8dadffa61373c18bbb82df96a3
Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@b-yond.com>
2018-10-31 10:23:11 -04:00
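The OSD down percentage described in fac358a575 (down OSDs counted as count(ceph_osd_up==0), total OSDs as count(ceph_osd_metadata)) can be sketched in a few lines. This is only an illustration of the arithmetic, not the rule from the chart: the function name and the sample dictionaries are hypothetical stand-ins for scraped metric samples.

```python
from typing import Dict


def osd_down_percent(osd_up: Dict[str, int], osd_metadata: Dict[str, int]) -> float:
    """Return the percentage of OSDs that are down.

    Mirrors the expression in the commit message:
    count(ceph_osd_up == 0) / count(ceph_osd_metadata) * 100
    """
    total = len(osd_metadata)  # count(ceph_osd_metadata)
    if total == 0:
        return 0.0  # no OSDs known, so nothing can be down
    down = sum(1 for value in osd_up.values() if value == 0)  # count(ceph_osd_up == 0)
    return 100.0 * down / total


# Hypothetical samples: one series per OSD, value 1 = up, 0 = down.
samples_up = {"osd.0": 1, "osd.1": 0, "osd.2": 1, "osd.3": 1}
samples_meta = {name: 1 for name in samples_up}

print(osd_down_percent(samples_up, samples_meta))  # 25.0
```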
Matthew Heler
3e7ba37290 Ensure latest Ceph packages during deployment
Change-Id: Ia5bc0802577e2b72a1de078085f5fe7e60f63604
2018-10-31 02:16:50 -05:00
Tin Lam
5730631ba6 Clean-up script
This patch set cleans up the script to be consistent with other OSH
installation scripts.

Change-Id: I212cd0cf0e818f1fc924b9b690d18f5d107b850b
Signed-off-by: Tin Lam <tin@irrational.io>
2018-10-30 16:22:45 +00:00
Zuul
31a9bb6ad4 Merge "[gate] Use Kubernetes 1.10.9" 2018-10-30 08:05:08 +00:00
Steve Wilkerson
45da8c2b69 Ceph: Update log directory host mount path
This updates the ceph-mon and ceph-osd charts to use the release
name for the hostpath defined for mounting the /var/log/ceph
directories to. This gives us a mechanism for creating unique log
directories for multiple releases of the same chart without the
need for specifying an override for each deployment of that chart

Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8
2018-10-29 13:05:46 -05:00
Chris Wedgwood
b10ebbb63a [gate] Use Kubernetes 1.10.9
Change-Id: I5bb951f455fa6d7d344a264336a2a9b985fd85f4
2018-10-29 15:10:35 +00:00
Matthew Heler
6ef48d3706 Further performance tuning changes for Ceph
- Throttle down snap trimming so as to lessen its performance impact
(Setting just osd_snap_trim_priority isn't effective enough to throttle
down the impact)
osd_snap_trim_sleep: 0.1 (default 0)
osd_pg_max_concurrent_snap_trims: 1 (default 2)

- Align filestore_merge_threshold with upstream Ceph values
(A negative number disables this function, no change in behavior)
filestore_merge_threshold: -10 (formerly -50, default 10)

- Increase RGW pool thread size for more concurrent connections
rgw_thread_pool_size: 512 (default 100)

- Disable in-memory logs for the ms subsystem.
debug_ms: 0/0 (default 0/5)

- Formatting cleanups

Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467
2018-10-26 15:10:50 +00:00
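As a rough illustration of the tuning values listed in 6ef48d3706, the sketch below renders them as ceph.conf-style text. The option names and values come from the commit message; placing them all under a single [global] section and the helper name render_ceph_conf are assumptions made for this example, and the chart itself may lay them out differently.

```python
import configparser
import io

# Values taken from the commit message above.
# Grouping them under [global] is an assumption of this sketch.
OVERRIDES = {
    "osd_snap_trim_sleep": "0.1",             # default 0
    "osd_pg_max_concurrent_snap_trims": "1",  # default 2
    "filestore_merge_threshold": "-10",       # formerly -50, default 10
    "rgw_thread_pool_size": "512",            # default 100
    "debug_ms": "0/0",                        # default 0/5
}


def render_ceph_conf(overrides):
    """Render the overrides as INI text with a single [global] section."""
    parser = configparser.ConfigParser()
    parser["global"] = overrides
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(render_ceph_conf(OVERRIDES))
```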
Zuul
62f49e7c74 Merge "Define OSH_PATH by default" 2018-10-26 11:35:11 +00:00
Zuul
3e62b48036 Merge "Node-Exporter: allows to set collectors enable/disable" 2018-10-26 08:01:40 +00:00
Zuul
2e239086e4 Merge "Restrict libvirt Ceph access scope to what is needed only." 2018-10-26 06:36:49 +00:00
Zuul
1ca39def84 Merge "ceph: make log directory configurable" 2018-10-26 02:41:15 +00:00
Jawon Choo
b4dfb27f0c Node-Exporter: allows to set collectors enable/disable
This PS allows collectors to be enabled or disabled using values.
_node-exporter.sh.tpl builds the collector list from values.yaml.

Change-Id: Iba2cf4d8304f2405db394fbb6fee58119eab13fc
2018-10-26 01:15:15 +00:00
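The idea in b4dfb27f0c, turning enable/disable values into a collector list, can be sketched as follows. The `--collector.<name>` and `--no-collector.<name>` forms are node_exporter's command-line convention; the shape of the values mapping and the function name are hypothetical and are not taken from the chart's values.yaml or its template.

```python
from typing import Dict, List


def collector_flags(collectors: Dict[str, bool]) -> List[str]:
    """Build node_exporter flags from a name -> enabled mapping.

    Enabled collectors become --collector.<name>,
    disabled ones become --no-collector.<name>.
    """
    flags = []
    for name in sorted(collectors):  # sorted for a stable, reproducible order
        prefix = "--collector." if collectors[name] else "--no-collector."
        flags.append(prefix + name)
    return flags


# Hypothetical values: which collectors to switch on or off.
values = {"textfile": True, "wifi": False, "arp": False}

print(" ".join(collector_flags(values)))
# --no-collector.arp --collector.textfile --no-collector.wifi
```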
Jean-Charles Lopez
566a489bbe Restrict libvirt Ceph access scope to what is needed only.
Change-Id: I78bffe6764e9cbb16b2a615be766c910ba5d4e48
2018-10-26 01:15:12 +00:00
Jean-Philippe Evrard
52f41c0af0 Define OSH_PATH by default
OSH_PATH is not defined by default outside OpenStack's CI.

This is a problem if a user wants to run scripts manually on their
machine for local testing.

This fixes it by defining OSH_PATH by default in the scripts
that use OSH, relative to the current folder.

For user experience, the script returns to the same path after
running.

Change-Id: I915e7d3c945f2002a2008b2b033a2b7725320b17
2018-10-26 01:15:08 +00:00
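The behaviour described in 52f41c0af0 (fall back to a default OSH_PATH relative to the current folder, then return to the starting directory) can be sketched like this. The actual change lives in shell scripts; this is a Python rendering of the same pattern, and the default path "../openstack-helm" and the helper name are assumptions made for illustration.

```python
import contextlib
import os


@contextlib.contextmanager
def in_osh_path(default="../openstack-helm"):
    """Work inside OSH_PATH, then return to the starting directory.

    If OSH_PATH is not set in the environment, fall back to `default`,
    resolved relative to the current folder (hypothetical default).
    """
    start = os.getcwd()
    target = os.path.abspath(os.environ.get("OSH_PATH", default))
    os.chdir(target)
    try:
        yield target
    finally:
        os.chdir(start)  # always come back, even if the body raised
```

A caller would wrap its work in `with in_osh_path() as path:` and find itself back in the original directory afterwards.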
Zuul
4835aa637a Merge "MariaDB: Galera cluster refactor" 2018-10-25 16:32:56 +00:00
Chinasubbareddy M
a1b8f394b2 ceph: make log directory configurable
This makes the log directory configurable, so that another mon or
osd running on the same host can point to a different directory.

Change-Id: I2db6dffd45599386f8082db8f893c799d139aba3
2018-10-25 14:34:14 +00:00
Zuul
1ec8981aa4 Merge "MariaDB: Move to use mariabackup instead of xtrabackup-v2" 2018-10-25 13:44:16 +00:00
Matthew Heler
f8ac6c3f21 ceph co-location journal and permission fixes
Support co-located journals with Ceph helm chart
Ensure proper ownership set on OSD/Journal disks

Change-Id: Ic954d75c8bd7532991dc9b3184ad6d74b97855d1
2018-10-25 08:21:31 +00:00
Pete Birley
f6e84fe15f MariaDB: Galera cluster refactor
This PS updates the MariaDB chart to better support clustering,
using a configmap to track cluster state.

Change-Id: Ifd9c3d63353a9b587384b6f13c0863ecc4fbd956
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-25 06:21:01 +00:00
Pete Birley
8bc03bf88c MariaDB: Move to use mariabackup instead of xtrabackup-v2
This PS moves to use mariabackup instead of xtrabackup-v2, for info
see:
 * https://mariadb.com/kb/en/library/upgrading-from-mariadb-102-to-mariadb-103/#mariadb-backup-and-percona-xtrabackup
 * https://mariadb.com/kb/en/library/mariabackup-overview/#about-mariabackup

Additionally the readiness script is updated to match the order of
validation tests described in the mariadb/galera documentation.

Change-Id: I031c63d6305f1514ffdd53d77d621bc7edc0e68c
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-25 05:43:59 +00:00
Steve Wilkerson
8d6cfd72d0 Nagios: Remove Nagios log monitors
This removes the checks for Nagios to query Elasticsearch for
logged events. The current plugin in the image is resulting in
unstable behavior, and should be removed until this plugin has been
improved

Change-Id: If1bdd954956f063ac1eebbb94d1128df8b8d2695
2018-10-25 05:21:22 +00:00
Tin Lam
653b84a2e1 Fix k8s-auth job
This patch set addresses a cross-repo conflict with the enablement of
network policy in gate script override.

Change-Id: I284d6b04940424a87e5b239ccc9d30ae01075f38
Signed-off-by: Tin Lam <tin@irrational.io>
2018-10-24 20:49:17 -05:00
Pete Birley
1144ccbbb2 Ceph: Update MGR check to allow use on hosts with fqdns defined
This PS updates the mgr check to allow use on hosts with fqdns
defined.

Change-Id: If1cb740e8093fbcafce846234c96db931409b436
Signed-off-by: Pete Birley <pete@port.direct>
2018-10-24 00:57:12 +00:00
Zuul
eabf53253f Merge "ceph-mgr: make prometheus module port configurable" 2018-10-23 23:19:09 +00:00
Chinasubbareddy M
e23e372120 ceph-mgr: make prometheus module port configurable
This provides an example of making the prometheus module port configurable.

Change-Id: I66844bb8ee59a58f7bfd3e3002a183779810e881
2018-10-23 15:40:43 -05:00
Zuul
5c446bb2d3 Merge "Use supplied HELM variable for dep up in Makefile" 2018-10-23 20:36:22 +00:00
Zuul
f49461acc4 Merge "cronjob-checkPGs failure fix" 2018-10-23 20:21:46 +00:00
Zuul
860a897aee Merge "[gate] allow pip caching" 2018-10-23 18:30:20 +00:00
Zuul
4c4e947e17 Merge "Ceph: A script to check object replication across the hosts" 2018-10-23 18:25:43 +00:00
Zuul
1e3693f1a3 Merge "[gate] Put nfs-provisioner in it's own namespace (docker-registry)" 2018-10-23 18:22:01 +00:00
Zuul
bad8427b21 Merge "[gate] Put nfs-provisioner in it's own namespace" 2018-10-23 18:22:00 +00:00
Zuul
11ec46bdce Merge "Prometheus kubelet.rules change" 2018-10-23 17:57:26 +00:00
Bryan Strassner
dacb01c82a Use supplied HELM variable for dep up in Makefile
Updates the helm dep up command to use the $(HELM) variable instead of
the locally installed helm for the host machine. This brings this line of
code into alignment with the other uses of helm in the same Makefile.

Change-Id: I91bfdceedd3bac0ac49daf5b9410c05e0e840168
2018-10-23 11:26:16 -05:00
Zuul
a0d58decff Merge "[Calico] Allow resource configuration using chart (overrides)" 2018-10-22 22:49:08 +00:00
Zuul
19e7e0fb61 Merge "Use the correct socket file for the Ceph mon check." 2018-10-22 20:03:47 +00:00
Chris Wedgwood
02f400e442 [Calico] Allow resource configuration using chart (overrides)
Allow Calico resources such as NetworkPolicy, GlobalNetworkPolicy,
WorkloadEndpoint, etc to be specified using values.

To avoid the complexities of list management with helm we use a
dictionary that contains a relative priority and set of objects
(called rules).

For example:

network:
  policy:

    someName:
      priority: 0
      rules:
       - apiVersion: projectcalico.org/v3
... some useful resource object ...
       - apiVersion: projectcalico.org/v3
... some other useful resource object ...

    someOtherName:
      priority: 1
      rules:
       - apiVersion: projectcalico.org/v3
... rules that come later ...

    lastSetOfRules:
      priority: 9
      rules:
       - apiVersion: projectcalico.org/v3
... rules that come last ... maybe hostendpoints ...

By having named groups of rules, each with its own priority, you can
update, delete and amend individual sets of rules provided you
set the appropriate "priority" value.

Change-Id: Id441350bcc8b95a91ef4d1b89d1bc3c417f50b13
2018-10-22 18:49:18 +00:00
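The priority dictionary described in 02f400e442 can be flattened into one ordered list of resources roughly as follows. This is a sketch of the idea rather than the chart's template logic: that lower priorities come first is inferred from the example comments in the commit message, breaking ties by group name is an assumption, and the string placeholders stand in for real resource objects.

```python
from typing import Any, Dict, List


def ordered_rules(policy: Dict[str, Dict[str, Any]]) -> List[Any]:
    """Flatten named rule groups into one list, lowest priority first.

    Groups with equal priority are ordered by name (an assumption of this sketch).
    """
    groups = sorted(
        policy.items(),
        key=lambda item: (item[1].get("priority", 0), item[0]),
    )
    rules: List[Any] = []
    for _name, group in groups:
        rules.extend(group.get("rules", []))
    return rules


# Placeholder strings stand in for projectcalico.org/v3 resource objects.
policy = {
    "lastSetOfRules": {"priority": 9, "rules": ["hostendpoint-c"]},
    "someName": {"priority": 0, "rules": ["policy-a1", "policy-a2"]},
    "someOtherName": {"priority": 1, "rules": ["policy-b"]},
}

print(ordered_rules(policy))
# ['policy-a1', 'policy-a2', 'policy-b', 'hostendpoint-c']
```

Because each group is addressed by name, overriding one group does not require restating the others, which is the list-management problem the commit message sets out to avoid.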
Jean-Philippe Evrard
e7f21a6bd0 Remove dependency to OSH repo
This removes, yet again, the dependency on the OSH repo.
With each repository independent, we can later introduce abstract
jobs that will be re-usable but with a clean dependency map: only
bring jobs from one single location, openstack-helm-infra.

Change-Id: I72844a944cfea5380de25dbd7cf7231c8d39f4ec
2018-10-22 10:50:02 +02:00
Matthew Heler
154fcd894f Use the correct socket file for the Ceph mon check.
Change-Id: If8c40c3c0501b78db88d3a7f33bf3838c0e60199
Closes-Bug: 1796313
2018-10-22 04:56:13 +00:00
Chris Wedgwood
d4ac063163 [gate] allow pip caching
The pip cache is useful for repeat operations and doesn't seem to have
any real downsides.

Change-Id: Iadb21a118f8d725911a9baa6a9264b8644012af9
2018-10-22 00:11:25 +00:00
Chris Wedgwood
c08c78f1d1 [gate] Put nfs-provisioner in it's own namespace (docker-registry)
Use the 'docker-nfs' namespace to back the docker registry.  This
means we can delete the registry namespace without causing IO lockups.

Change-Id: I1706dd96653598dcfbb81904fde8c0bf92294b06
2018-10-21 23:42:20 +00:00
Chris Wedgwood
8f5aaa3fd0 [gate] Put nfs-provisioner in it's own namespace
Having storage (backend) components in their own namespace means we
can delete the namespaces containing OpenStack without causing the
system hangs that occur when storage is removed whilst in use.

Change-Id: Ie489709b08929f25cf0e626a8541620a06506b8b
2018-10-21 23:37:56 +00:00