openstack-helm-infra

Author	SHA1	Message	Date
Anderson, Craig (ca846m)	48a0c09fea	Truncate long host names for overrides Long hostnames can cause the 63 char name limit to be exceeded. Truncate the hostname if hostname > 20 char. Change-Id: Ieb7e4dafb41d1fe3ab3d663d2614f75c814afee6	2018-11-26 17:04:58 -08:00
Zuul	0730df5973	Merge "Prometheus: Add session affinity to ingress"	2018-11-26 18:21:14 +00:00
Zuul	4b76f8c280	Merge "Nagios: Update image tag"	2018-11-26 17:40:20 +00:00
Steve Wilkerson	71c1a16758	Prometheus: Add session affinity to ingress This adds session affinity to Prometheus's ingress. This allows for the use of cookies for Prometheus's session affinity Change-Id: I2e7e1d1b5120c1fb3ddecb5883845e46d61273de	2018-11-26 14:30:08 +00:00
Steve Wilkerson	439079693d	Nagios: Update image tag This updates the Nagios image tag to include the updated plugin for querying Elasticsearch for alerting on logged events Change-Id: Idd61d82463b79baab0e94c20b32da1dc6a8b3634	2018-11-26 08:29:22 -06:00
Zuul	8e369d2c9c	Merge "Ingress: Update version of ingress controller image"	2018-11-23 20:39:38 +00:00
Zuul	89b651dc1d	Merge "Ingress: Make healthz port configurable"	2018-11-21 20:01:26 +00:00
Pete Birley	4d2085f0af	Ingress: Update version of ingress controller image This PS updates the version of the ingress controller image used. This brings in the ability to update the ingress configuration without reloading nginx. There may also need to be some changes for prom based monitoring: * https://github.com/kubernetes/ingress-nginx/blob/master/Changelog.md#0100 Change-Id: Ia0bf3dbb9b726f3a5cfb1f95d7ede456af13374a Signed-off-by: Pete Birley <pete@port.direct>	2018-11-21 19:21:40 +00:00
Zuul	16072765bf	Merge "Ingress: Allow status port to be customised"	2018-11-20 18:29:16 +00:00
Pete Birley	ea875b1dcc	Ingress: Make healthz port configurable This PS updates the healthz port to be configurable Change-Id: Ifa5ea4b7b422156a7309886ecc21668fc096065b Signed-off-by: Pete Birley <pete@port.direct>	2018-11-20 12:28:14 -06:00
Pete Birley	f3e1fa4e72	Ingress: Allow status port to be customised This PS updates the ingress chart to allow the status port to be changed. Change-Id: Ia38223c56806f6113622a809e792b4fedd010d87 Signed-off-by: Pete Birley <pete@port.direct>	2018-11-20 09:57:56 -06:00
Matthew Heler	5ce9f2eb3b	Enable Ceph charts to be rack aware for CRUSH Add support for a rack level CRUSH map. Rack level CRUSH support is enabled by using the "rack_replicated_rule" crush rule. Change-Id: I4df224f2821872faa2eddec2120832e9a22f4a7c	2018-11-20 09:07:36 -06:00
Zuul	5d356f9265	Merge "Document how to recover from a Ceph namespace deletion"	2018-11-15 17:27:45 +00:00
Matthew Heler	cfc2d4abd8	Document how to recover from a Ceph namespace deletion Change-Id: Ib1b03cd046fbdad6f18478cfa9c9f0bf70ec9430	2018-11-14 13:31:16 -06:00
Zuul	dd6b2a0a1d	Merge "Additional Ceph RGW tuning and cleanups"	2018-11-14 18:48:36 +00:00
Zuul	5bf9c26bd8	Merge "Move default CEPH journal size from 5GB to 10GB"	2018-11-13 05:28:45 +00:00
Matthew Heler	225b85eb5f	Additional Ceph RGW tuning and cleanups Set RGW rados handles from 1 to 4 Remove support for fastcgi (it's no longer supported) Change-Id: Ie260a3e1e5eab2065ec6a4d0637c144965a4214d	2018-11-12 20:13:33 +00:00
Zuul	2640e7422d	Merge "This fixes host-specific overrides"	2018-11-10 02:37:53 +00:00
Zuul	2c9ff8bee8	Merge "Fix the checkPGs cronjob"	2018-11-09 22:57:50 +00:00
Ian Howell	9b132225c6	This fixes host-specific overrides This properly assigns k8s secrets to volumes, rather than using configMaps Change-Id: Ifcabd3565fb2abee063f5da117d83ac3a5602536	2018-11-09 16:24:03 -06:00
Steve Wilkerson	dfb4654fba	Nagios: Configuration updates This moves to update the host used for the ceph health checks, as we should be checking the ceph-mgr service directly for ceph metrics instead of trying to curl the host directly. This also changes the ceph_health_check to use the base-os hostgroup instead of the placeholder ceph-mgr host group, as we're just executing a simple check against the ceph-mgr service. This also adds default configuration values for the max_concurrent_checks (60) and check_workers (4) values instead of leaving them at the defaults Nagios uses (0 and # cores, respectively) Change-Id: Ib4072fcd545d8c05d5e9e4a93085a8330be6dfe0	2018-11-09 13:28:50 -06:00
Steve Wilkerson	325b3cea4d	Nagios: Update host check mechanism This updates the Nagios image to use a tag that includes a fix for the service discovery mechanism used for updating host checks. After moving the Nagios chart to either run in shared or host PID namespaces, the service discovery mechanism no longer worked due to the plugin attempting to restart PID 1 instead of determining the appropriate PID to restart. For reference, see: https://review.gerrithub.io/#/c/att-comdev/nagios/+/432205/ Change-Id: Ie01c3a93dd109a9dc99cfac5d27991583546605a	2018-11-09 09:12:16 -06:00
Zuul	b55e9b10a7	Merge "Nagios: Add session affinity to ingress"	2018-11-09 04:45:36 +00:00
Zuul	98c9b148f3	Merge "Nagios: Update ceph_health check"	2018-11-09 03:24:23 +00:00
Steve Wilkerson	2c6aa8ad1b	Nagios: Add session affinity to ingress This adds session affinity to Nagios's ingress. This allows for the use of cookies for Nagios's session affinity Change-Id: I6054a92f644dc533dd06d35a2541fb44d46cba88	2018-11-09 02:07:39 +00:00
Zuul	a90ebb784c	Merge "Prometheus: Update discovery configuration for ceph-mgr services"	2018-11-09 01:01:54 +00:00
Zuul	77772547e2	Merge "RGW: Fix multinode deploy for ceph rgw"	2018-11-08 22:54:01 +00:00
Zuul	d530635348	Merge "Do not use OSH_INFRA_PATH in osh-infra"	2018-11-08 22:54:00 +00:00
Meg Heisler	774e0cb654	RGW: Fix multinode deploy for ceph rgw Change deployment script for rgw to not use the docker bridge for public and cluster network overrides. Instead, calculate network values in the same way as the other Ceph multinode deployment steps Change-Id: I2bacd1af1cc331d76a5d61f3b589ca6ef80b1b2e	2018-11-08 11:39:23 -06:00
Matthew Heler	55446e1f41	Move default CEPH journal size from 5GB to 10GB Request from downstream to use 10GB journal sizes. Journals are currently created manually, but there is upcoming work to have the journals created by the Helm charts themselves. This value needs to be put in as a default to ensure journals are sized appropriately. Change-Id: Idaf46fac159ffc49063cee1628c63d5bd42b4bc6	2018-11-08 17:34:12 +00:00
Zuul	7274c5f95f	Merge "Revert "Fix rally deployment config to rally 1.2.0""	2018-11-07 22:26:22 +00:00
Zuul	47d49bcfd4	Merge "prometheus ceph.rules changes"	2018-11-07 20:51:42 +00:00
Pete Birley	b7e77dfea0	Revert "Fix rally deployment config to rally 1.2.0" This reverts commit 5c2859c3e9026e464bf0c35b591aaae810ff2a1c. This commit breaks the ability to declare users to use with rally/helm test - and needs to be refactored to match the commit message's intent. Change-Id: I2bc66ef40694c277058b4324b8a3528f4f25d1d1	2018-11-07 19:31:49 +00:00
Zuul	b28aed8331	Merge "Fix rally deployment config to rally 1.2.0"	2018-11-07 14:12:32 +00:00
Matthew Heler	e1c82f3465	Fix the checkPGs cronjob Currently the cronjob is broken due to syntax and permission issues. Additionally move the cronjob from once a month to every 15 minutes, and automatically disable the job unless explicitly enabled. Change-Id: Id72bdb286c805ccb0ea4e9fcf65fabca94a180dd	2018-11-06 19:39:23 -06:00
Steve Wilkerson	ba22b0e726	Nagios: Update ceph_health check The ceph_health check in Nagios incorrectly sets the warning and error level to 0. The ceph_health_status metric's value of 0 indicates the cluster is healthy, while 1 indicates a warning and 2 indicates an error state. The Nagios check for ceph_health is updated to reflect these values Change-Id: Iffe80f1c34f6edee6370dd7e707e5f55f83f1ec1	2018-11-06 14:51:40 -06:00
Steve Wilkerson	e0f2d66ee3	Prometheus: Update discovery configuration for ceph-mgr services This updates the Prometheus scrape configuration to use the service based discovery mechanism instead of endpoints. This removes issues associated with multiple ceph-mgr replicas deployed Change-Id: I2c557af0c7200d0c4aea646c5f9ecd1a070db33e	2018-11-06 13:56:37 -06:00
Jean-Philippe Evrard	ff1f75fc45	Do not use OSH_INFRA_PATH in osh-infra OSH_INFRA_PATH is never used in the openstack-helm-infra repository, as all the references use relative paths. The keystone script is the exception: it does not use a relative path, and relies on OSH_INFRA_PATH being defined to work. This is a problem, because when it is not defined, the expected path for the ldap chart is /ldap, which is an incorrect path. This fixes the problem by ensuring the path is relative. Change-Id: I04a8d5c074b7c1e6fa66617bbb907f2ad4dcb3af	2018-11-05 13:36:03 +00:00
Zuul	fca344900f	Merge "Enable the mgr balancer module by default."	2018-11-02 22:36:13 +00:00
Steve Wilkerson	69196031cd	Nagios: Ensure processes are reaped This moves Nagios to run as child processes of either the pause container or use the host's init system (for k8s <1.10) to prevent defunct process sprawl Change-Id: I6a93d446577674b0b012f9567d5e6a5794ebc44b	2018-11-02 08:12:24 -05:00
Matthew Heler	a79562a28b	Enable the mgr balancer module by default. The balancer module will distribute PGs more evenly across OSDs. While CRUSH does a good job at this, it is not perfect and hot spots (where an OSD has more PGs than its peers) can occur. Change-Id: Ic45a6bf745bdd09a3f5782e9e8bda89c3d3da2aa	2018-11-01 15:52:51 +00:00
inspurericzhang	f1c2bf976f	[Trivial Fix] modify spelling error of "resource" Although it is only a spelling mistake, it affects readability. Change-Id: I75a1f66002ec46fe206f31fec02fbd47f9cee443	2018-11-01 09:52:04 +08:00
kranthi guttikonda	fac358a575	prometheus ceph.rules changes With the new Ceph Luminous release, the existing ceph.rules are obsolete. Added a new rule for ceph-mgr count; Changed ceph_monitor_quorum_count to ceph_mon_quorum_count; Updated ceph_cluster_usage_high as ceph_cluster_used_bytes, ceph_cluster_capacity_bytes aren't valid; Updated ceph_placement_group_degrade_pct_high as ceph_degraded_pgs, ceph_total_pgs aren't valid; Updated ceph_osd_down_pct_high as ceph_osds_down, ceph_osds_up aren't available (ceph_osd_up is available but ceph_osd_down isn't), so the down count needs to be calculated from count(ceph_osd_up==0) and the total OSD count from count(ceph_osd_metadata); Removed ceph_monitor_clock_skew_high as the metric ceph_monitor_clock_skew_seconds isn't valid anymore; Added new alarms ceph_osd_down, ceph_osd_out. Implements: prometheus ceph.rules changes with new valid metrics Closes-Bug: #1800548 Change-Id: Id68e64472af12e8dadffa61373c18bbb82df96a3 Signed-off-by: Kranthi Guttikonda <kranthi.guttikonda@b-yond.com>	2018-10-31 10:23:11 -04:00
Matthew Heler	3e7ba37290	Ensure latest Ceph packages during deployment Change-Id: Ia5bc0802577e2b72a1de078085f5fe7e60f63604	2018-10-31 02:16:50 -05:00
Tin Lam	5730631ba6	Clean-up script This patch set cleans up the script to be consistent with other OSH installation scripts. Change-Id: I212cd0cf0e818f1fc924b9b690d18f5d107b850b Signed-off-by: Tin Lam <tin@irrational.io>	2018-10-30 16:22:45 +00:00
Zuul	31a9bb6ad4	Merge "[gate] Use Kubernetes 1.10.9"	2018-10-30 08:05:08 +00:00
Steve Wilkerson	45da8c2b69	Ceph: Update log directory host mount path This updates the ceph-mon and ceph-osd charts to use the release name for the hostpath defined for mounting the /var/log/ceph directories to. This gives us a mechanism for creating unique log directories for multiple releases of the same chart without the need for specifying an override for each deployment of that chart Change-Id: Ie6e05b99c32f24440fbade02d59c7bb14d8aa4c8	2018-10-29 13:05:46 -05:00
Chris Wedgwood	b10ebbb63a	[gate] Use Kubernetes 1.10.9 Change-Id: I5bb951f455fa6d7d344a264336a2a9b985fd85f4	2018-10-29 15:10:35 +00:00
Matthew Heler	6ef48d3706	Further performance tuning changes for Ceph - Throttle down snap trimming so as to lessen its performance impact (Setting just osd_snap_trim_priority isn't effective enough to throttle down the impact) osd_snap_trim_sleep: 0.1 (default 0) osd_pg_max_concurrent_snap_trims: 1 (default 2) - Align filestore_merge_threshold with upstream Ceph values (A negative number disables this function, no change in behavior) filestore_merge_threshold: -10 (formerly -50, default 10) - Increase RGW pool thread size for more concurrent connections rgw_thread_pool_size: 512 (default 100) - Disable in-memory logs for the ms subsystem. debug_ms: 0/0 (default 0/5) - Formatting cleanups Change-Id: I4aefcb6e774cb3e1252e52ca6003cec495556467	2018-10-26 15:10:50 +00:00
Zuul	62f49e7c74	Merge "Define OSH_PATH by default"	2018-10-26 11:35:11 +00:00
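
Commit ba22b0e726 ("Nagios: Update ceph_health check") states that a ceph_health_status value of 0 means the cluster is healthy, 1 a warning and 2 an error. The Python sketch below illustrates that mapping onto the conventional Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the plugin shipped in the Nagios image: the function name `ceph_health_to_nagios` is hypothetical, and treating any other value as UNKNOWN is an assumption.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# ceph_health_status values as described in commit ba22b0e726:
# 0 = healthy, 1 = warning, 2 = error.
STATES = {
    0: (OK, "OK"),
    1: (WARNING, "WARNING"),
    2: (CRITICAL, "CRITICAL"),
}


def ceph_health_to_nagios(value):
    """Map a ceph_health_status sample to (exit_code, label).

    Hypothetical helper: unparseable or unexpected values are
    reported as UNKNOWN (an assumption, not taken from the commit).
    """
    try:
        number = float(value)  # samples may arrive as strings, e.g. "1"
    except (TypeError, ValueError):
        return UNKNOWN, "UNKNOWN"
    # 1.0 and 1 hash and compare equal, so a float lookup hits the int keys.
    return STATES.get(number, (UNKNOWN, "UNKNOWN"))


if __name__ == "__main__":
    for sample in (0, "1", 2.0, "n/a"):
        code, label = ceph_health_to_nagios(sample)
        print(f"{sample!r}: {label} (exit {code})")
```

With this mapping both thresholds differ from 0, which is the point of the fix: a healthy cluster reporting 0 no longer trips the warning or error level.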

904 Commits