The ceph_health check in Nagios incorrectly sets both the warning and
error levels to 0. A ceph_health_status value of 0 indicates the
cluster is healthy, 1 indicates a warning, and 2 indicates an error
state. This updates the Nagios check for ceph_health to reflect these
values.
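The mapping described above can be sketched as follows. This is only an illustration, not the check shipped in the chart: the helper name is hypothetical, the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and treating unexpected values as UNKNOWN is an assumption.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_state_for_ceph_health(value):
    """Map a ceph_health_status sample to a Nagios exit code.

    Hypothetical helper for illustration only.
    """
    mapping = {0: OK, 1: WARNING, 2: CRITICAL}
    # Anything outside 0..2 is reported as UNKNOWN; this fallback is an
    # assumption and is not taken from the chart.
    return mapping.get(value, UNKNOWN)
```

With both levels set to 0, a healthy cluster (value 0) would already trip the check, which is the behavior this change corrects.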
Change-Id: Iffe80f1c34f6edee6370dd7e707e5f55f83f1ec1
This moves Nagios to run as a child process of either the pause
container or the host's init system (for k8s <1.10) to prevent
defunct process sprawl.
Change-Id: I6a93d446577674b0b012f9567d5e6a5794ebc44b
The balancer module will distribute PGs more evenly across OSDs.
While CRUSH does a good job at this, it is not perfect and hot spots
(where an OSD has more PGs than its peers) can occur.
Change-Id: Ic45a6bf745bdd09a3f5782e9e8bda89c3d3da2aa
This patch set cleans up the script to be consistent with other OSH
installation scripts.
Change-Id: I212cd0cf0e818f1fc924b9b690d18f5d107b850b
Signed-off-by: Tin Lam <tin@irrational.io>
This updates the ceph-mon and ceph-osd charts to use the release
name in the hostpath used for mounting the /var/log/ceph
directories. This gives us a mechanism for creating unique log
directories for multiple releases of the same chart without
needing to specify an override for each deployment of that chart.
Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8
- Throttle down snap trimming to lessen its performance impact
  (setting just osd_snap_trim_priority isn't effective enough to throttle
  down the impact)
    osd_snap_trim_sleep: 0.1 (default 0)
    osd_pg_max_concurrent_snap_trims: 1 (default 2)
- Align filestore_merge_threshold with upstream Ceph values
  (a negative number disables this function, no change in behavior)
    filestore_merge_threshold: -10 (formerly -50, default 10)
- Increase RGW pool thread size for more concurrent connections
    rgw_thread_pool_size: 512 (default 100)
- Disable in-memory logs for the ms subsystem
    debug_ms: 0/0 (default 0/5)
- Formatting cleanups
Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467
This PS allows collectors to be enabled or disabled using values.
_node-exporter.sh.tpl builds the collector list from values.yaml.
Change-Id: Iba2cf4d8304f2405db394fbb6fee58119eab13fc
OSH_PATH is not defined by default outside OpenStack's CI.
This is a problem if a user wants to run the scripts manually on
their machine for local testing.
This fixes it by defining OSH_PATH by default, relative to the
current folder, in the scripts that use OSH.
For a better user experience, the script returns to the same path
after running.
Change-Id: I915e7d3c945f2002a2008b2b033a2b7725320b17
This makes the log directory configurable, so that another mon or
osd running on the same host can point to a different directory.
Change-Id: I2db6dffd45599386f8082db8f893c799d139aba3
This PS updates the MariaDB chart to better support clustering,
using a configmap to track cluster state.
Change-Id: Ifd9c3d63353a9b587384b6f13c0863ecc4fbd956
Signed-off-by: Pete Birley <pete@port.direct>
This removes the checks for Nagios to query Elasticsearch for
logged events. The current plugin in the image results in
unstable behavior, and should be removed until the plugin has been
improved.
Change-Id: If1bdd954956f063ac1eebbb94d1128df8b8d2695
This patch set addresses a cross-repo conflict with the enablement of
network policy in the gate script override.
Change-Id: I284d6b04940424a87e5b239ccc9d30ae01075f38
Signed-off-by: Tin Lam <tin@irrational.io>
This PS updates the mgr check to allow use on hosts with FQDNs
defined.
Change-Id: If1cb740e8093fbcafce846234c96db931409b436
Signed-off-by: Pete Birley <pete@port.direct>
Updates the helm dep up command to use the $(HELM) variable instead of
the helm installed locally on the host machine. This brings this line of
code into alignment with the other uses of helm in the same Makefile.
Change-Id: I91bfdceedd3bac0ac49daf5b9410c05e0e840168
Allow Calico resources such as NetworkPolicy, GlobalNetworkPolicy,
WorkloadEndpoint, etc. to be specified using values.
To avoid the complexities of list management with Helm, we use a
dictionary that contains a relative priority and a set of objects
(called rules).
For example:
    network:
      policy:
        someName:
          priority: 0
          rules:
            - apiVersion: projectcalico.org/v3
              ... some useful resource object ...
            - apiVersion: projectcalico.org/v3
              ... some other useful resource object ...
        someOtherName:
          priority: 1
          rules:
            - apiVersion: projectcalico.org/v3
              ... rules that come later ...
        lastSetOfRules:
          priority: 9
          rules:
            - apiVersion: projectcalico.org/v3
              ... rules that come last ... maybe hostendpoints ...
By having named groups of rules, each with its own priority, you can
update, delete and amend individual sets of rules, provided you
set the appropriate "priority" value.
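As a rough sketch of how such a dictionary can be turned into an ordered list, the snippet below sorts the named groups by ascending priority and flattens their rules. The function name is hypothetical, the tie-break on group name is an assumption, and the chart's actual template logic may differ.

```python
def ordered_rules(policy):
    """Flatten named rule groups into one list, lowest priority first.

    `policy` mirrors the network.policy dictionary from values.
    Hypothetical helper for illustration only.
    """
    # Sort by priority, then by group name so ties are deterministic
    # (the tie-break is an assumption, not taken from the chart).
    groups = sorted(
        policy.items(), key=lambda item: (item[1]["priority"], item[0])
    )
    return [rule for _name, group in groups for rule in group.get("rules", [])]


example = {
    "lastSetOfRules": {"priority": 9, "rules": ["last"]},
    "someName": {"priority": 0, "rules": ["first", "second"]},
    "someOtherName": {"priority": 1, "rules": ["later"]},
}
flattened = ordered_rules(example)
```

Because groups are addressed by name, overriding a single group in values leaves the other groups and their relative order untouched.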
Change-Id: Id441350bcc8b95a91ef4d1b89d1bc3c417f50b13
This removes, once again, the dependency on the OSH repo.
With each repository independent, we can later introduce abstract
jobs that are reusable and have a clean dependency map: jobs are
only brought in from one single location, openstack-helm-infra.
Change-Id: I72844a944cfea5380de25dbd7cf7231c8d39f4ec
Use the 'docker-nfs' namespace to back the docker registry. This
means we can delete the registry namespace without causing IO lockups.
Change-Id: I1706dd96653598dcfbb81904fde8c0bf92294b06
Having storage (backend) components in their own namespace means we
can delete the namespaces containing OpenStack without causing the
system hangs that occur when storage is removed while in use.
Change-Id: Ie489709b08929f25cf0e626a8541620a06506b8b
This script creates an object and checks whether the object is
replicated across different hosts.
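The core of such a check can be illustrated with a small sketch. It is not the script added by this change: the function name and both inputs (the list of OSDs holding the object and an OSD-to-host mapping) are hypothetical, and interpreting "replicated across different hosts" as "every replica sits on a distinct host" is an assumption.

```python
def replicated_across_hosts(acting_osds, osd_to_host):
    """Return True if every replica of the object is on a different host.

    acting_osds: OSD ids that hold the object (hypothetical input).
    osd_to_host: mapping of OSD id to host name (hypothetical input).
    """
    hosts = [osd_to_host[osd] for osd in acting_osds]
    # At least two replicas are required, and no host may appear twice.
    return len(hosts) > 1 and len(set(hosts)) == len(hosts)
```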
Change-Id: Ic5056c1a07dc5d5b6a5d6fc24e3d9a75fa46458f
By default use rbd-nbd (librbd) instead of krbd.
Applying this change on existing nodes will
require reboots.
Change-Id: I81829fb8666541e856ab402128a5192984b6fe05